Notation


Except where actual units are needed, units are such that the speed of light is one, c = 1, and Newton’s 
gravitational constant is one, G = 1. 

The metric signature is −+++.

Greek (brown) letters κ, λ, ..., denote spacetime (4D, usually) coordinate indices. Latin (black) letters k,
l, ..., denote spacetime (4D, usually) tetrad indices. Early-alphabet greek letters α, β, ... denote spatial (3D,
usually) coordinate indices. Early-alphabet latin letters a, b, ... denote spatial (3D, usually) tetrad indices.
To avoid distraction, colouring is applied only to coordinate indices, not to the coordinates themselves. 
Early-alphabet latin letters a, b, ... are also used to denote spinor indices. 

Sequences of indices, as encountered in multivectors (Chapter 13) and differential forms (Chapter 15), are 
denoted by capital letters. Greek (brown) capital letters Λ, Π, ... denote sequences of spacetime (4D, usually)
coordinate indices. Latin (black) capital letters K, L, ... denote sequences of spacetime (4D, usually) tetrad 
indices. Early-alphabet capital letters denote sequences of spatial (3D, usually) indices, coloured brown A, 
B, ... for coordinate indices, and black A, B, ... for tetrad indices. 

Specific (non-dummy) components of a vector are labelled by the corresponding coordinate (brown) or 
tetrad (black) direction, for example $A^\mu = \{A^t, A^x, A^y, A^z\}$ or $A^m = \{A^t, A^x, A^y, A^z\}$. Sometimes it is
convenient to use numerical indices, as in $A^\mu = \{A^0, A^1, A^2, A^3\}$ or $A^m = \{A^0, A^1, A^2, A^3\}$. Allowing the
same label to denote either a coordinate or a tetrad index risks ambiguity, but it should be apparent from 
the context (or colour) what is meant. Some texts distinguish coordinate and tetrad indices for example by 
a caret on the latter (there is no widespread convention), but this produces notational overload. 

Boldface denotes abstract vectors, in either 3D or 4D. In 4D, $\boldsymbol{A} = A^\mu \boldsymbol{e}_\mu = A^m \boldsymbol{\gamma}_m$, where $\boldsymbol{e}_\mu$ denote
coordinate tangent axes, and $\boldsymbol{\gamma}_m$ denote tetrad axes.

Repeated paired dummy indices are summed over, the implicit summation convention. In special and 
general relativity, one index of a pair must be up (contravariant), while the other must be down (covariant). 
If the space being considered is Euclidean, then both indices may be down. 

$\partial/\partial x^\mu$ denotes coordinate partial derivatives, which commute. $\partial_m$ denotes tetrad directed derivatives,
which do not commute. $D_\mu$ and $D_m$ denote respectively coordinate-frame and tetrad-frame covariant derivatives.


Choice of metric signature 


There is a tendency, by no means unanimous, for general relativists to prefer the −+++ metric signature,
while particle physicists prefer +−−−.

For someone like me who does general relativistic visualization, there is no contest: the choice has to be
−+++, so that signs remain consistent between 3D spatial vectors and 4D spacetime vectors. For example,
the 3D industry knows well that quaternions provide the most efficient and powerful way to implement
spatial rotations. As shown in Chapter 13, complex quaternions provide the best way to implement Lorentz
transformations, with the subgroup of real quaternions continuing to provide spatial rotations. Compatibility
requires −+++. Actually, OpenGL and other graphics languages put spatial coordinates in the first three
indices, leaving time to occupy the fourth index; but in these notes I stick to the physics convention of
putting time in the zeroth index.

In practical calculations it is convenient to be able to switch transparently between boldface and index
notation in both 3D and 4D contexts. This is where the +−−− signature poses greater potential for
misinterpretation in 3D. For example, with this signature, what is the sign of the 3D scalar product

$$\boldsymbol{a} \cdot \boldsymbol{b} \; ?$$

Is it $\boldsymbol{a} \cdot \boldsymbol{b} = \sum_{a=1}^{3} a_a b^a$ or $\boldsymbol{a} \cdot \boldsymbol{b} = \sum_{a=1}^{3} a^a b^a$? To be consistent with common 3D usage, it must be the
latter. With the +−−− signature, it must be that $\boldsymbol{a} \cdot \boldsymbol{b} = -a_a b^a$, where the repeated indices signify implicit
summation over spatial indices. So you have to remember to introduce a minus sign in switching between
boldface and index notation.
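
The sign bookkeeping for the scalar product can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `contract_spatial` and `dot3` and the sample vectors are hypothetical, and the spatial part of the metric is assumed to be diagonal (flat space, Cartesian axes).

```python
# Spatial diagonal of the Minkowski metric in the two signature conventions
# (assumption: flat spacetime, Cartesian spatial axes).
SPATIAL_METRIC = {
    "-+++": (1.0, 1.0, 1.0),
    "+---": (-1.0, -1.0, -1.0),
}

def contract_spatial(a_up, b_up, signature):
    """Return a_a b^a, lowering the index of a with the spatial metric."""
    g = SPATIAL_METRIC[signature]
    a_down = [g_i * a_i for g_i, a_i in zip(g, a_up)]
    return sum(x * y for x, y in zip(a_down, b_up))

def dot3(a, b):
    """Ordinary Euclidean 3D scalar product."""
    return sum(x * y for x, y in zip(a, b))

a = [1.0, 2.0, 3.0]
b = [4.0, -5.0, 6.0]
print(dot3(a, b))                      # Euclidean a . b
print(contract_spatial(a, b, "-+++"))  # same sign as a . b
print(contract_spatial(a, b, "+---"))  # opposite sign to a . b
```

With −+++ the index contraction reproduces the Euclidean scalar product directly, while with +−−− it comes out with the opposite sign.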

As another example, what is the sign of the 3D vector product

$$\boldsymbol{a} \times \boldsymbol{b} \; ?$$

Is it $\boldsymbol{a} \times \boldsymbol{b} = \sum_{b,c=1}^{3} \varepsilon_{abc} a^b b^c$ or $\boldsymbol{a} \times \boldsymbol{b} = \sum_{b,c=1}^{3} \varepsilon^a{}_{bc} a^b b^c$ or $\boldsymbol{a} \times \boldsymbol{b} = \sum_{b,c=1}^{3} \varepsilon^{abc} a_b b_c$? Well, if you want to switch
transparently between boldface and index notation, and you decide that you want boldface consistently to
signify a vector with a raised index, then maybe you’d choose the middle option. To be consistent with
standard 3D convention for the sign of the vector product, maybe you’d choose $\varepsilon^a{}_{bc}$ to have positive sign for
abc an even permutation of xyz.

Finally, what is the sign of the 3D spatial gradient operator

$$\boldsymbol{\nabla} \equiv \frac{\partial}{\partial \boldsymbol{x}} \; ?$$

Is it $\boldsymbol{\nabla} = \partial/\partial x^a$ or $\boldsymbol{\nabla} = \partial/\partial x_a$? Convention dictates the former, in which case it must be that some boldface
3D vectors must signify a vector with a raised index, and others a vector with a lowered index. Oh dear.


PART ONE 


FUNDAMENTALS 


Concept Questions 


1. What does c = universal constant mean? What is speed? What is distance? What is time?
2. c + c = c. How can that be possible?
3. The first postulate of special relativity asserts that spacetime forms a 4-dimensional continuum. The
   fourth postulate of special relativity asserts that spacetime has no absolute existence. Isn’t that a
   contradiction?
4. The principle of special relativity says that there is no absolute spacetime, no absolute frame of reference
   with respect to which position and velocity are defined. Yet does not the cosmic microwave background
   define such a frame of reference?
5. How can two people moving relative to each other at near c both think each other’s clock runs slow?
6. How can two people moving relative to each other at near c both think the other is Lorentz-contracted?
7. All paradoxes in special relativity have the same solution. In one word, what is that solution?
8. All conceptual paradoxes in special relativity can be understood by drawing what kind of diagram?
9. Your twin takes a trip to α Cen at near c, then returns to Earth at near c. Meeting your twin, you see
   that the twin has aged less than you. But from your twin’s perspective, it was you that receded at near
   c, then returned at near c, so your twin thinks you aged less. Is it true?
10. Blobs in the jet of the galaxy M87 have been tracked by the Hubble Space Telescope to be moving at
    about 6c. Does this violate special relativity?
11. If you watch an object move at near c, does it actually appear Lorentz-contracted? Explain.
12. You speed towards the centre of our Galaxy, the Milky Way, at near c. Does the centre appear to you
    closer or farther away?
13. You go on a trip to the centre of the Milky Way, 30,000 lightyears distant, at near c. How long does the
    trip take you?
14. You surf a light ray from a distant quasar to Earth. How much time does the trip take, from your
    perspective?
15. If light is a wave, what is waving?
16. As you surf the light ray, how fast does it appear to vibrate?
17. How does the phase of a light ray vary along the light ray? Draw surfaces of constant phase on a
    spacetime diagram.
18. You see a distant galaxy at a redshift of z = 1. If you could see a clock on the galaxy, how fast would
    the clock appear to tick? Could this be tested observationally?
19. You take a trip to α Cen at near c, then instantaneously accelerate to return at near c. If you are
    looking through a telescope at a clock on the Earth while you instantaneously accelerate, what do you
    see happen to the clock?
20. In what sense is time an imaginary spatial dimension?
21. In what sense is a Lorentz boost a rotation by an imaginary angle?
22. You know what it means for an object to be rotating at constant angular velocity. What does it mean
    for an object to be boosting at a constant rate?
23. A wheel is spinning so that its rim is moving at near c. The rim is Lorentz-contracted, but the spokes
    are not. How can that be?
24. You watch a wheel rotate at near the speed of light. The spokes appear bent. How can that be?
25. Does a sunbeam appear straight or bent when you pass by it at near the speed of light?
26. Energy and momentum are unified in special relativity. Explain.
27. In what sense is mass equivalent to energy in special relativity? In what sense is mass different from
    energy?
28. Why is the Minkowski metric unchanged by a Lorentz transformation?
29. What is the best way to program Lorentz transformations on a computer?


What’s important? 


1. The postulates of special relativity.
2. Understanding conceptually the unification of space and time implied by special relativity.
   a. Spacetime diagrams.
   b. Simultaneity.
   c. Understanding the paradoxes of relativity: time dilation, Lorentz contraction, the twin paradox.
3. The mathematics of spacetime transformations.
   a. Lorentz transformations.
   b. Invariant spacetime distance.
   c. Minkowski metric.
   d. 4-vectors.
   e. Energy-momentum 4-vector. $E = mc^2$.
   f. The energy-momentum 4-vector of massless particles, such as photons.
4. What things look like at relativistic speeds.


Special Relativity 


Special relativity is a fundamental building block of general relativity. General relativity postulates that the 
local structure of spacetime is that of special relativity. 

The primary goal of this Chapter is to convey a clear conceptual understanding of special relativity. 
Everyday experience gives the impression that time is absolute, and that space is entirely distinct from time, 
as Galileo and Newton postulated. Special relativity demands, in apparent contradiction to experience, the 
revolutionary notion that space and time are united into a single 4-dimensional entity, called spacetime. 
The revolution forces conclusions that appear paradoxical: how can two people moving relative to each other 
both measure the speed of light to be the same, both think each other’s clock runs slow, and both think the 
other is Lorentz-contracted? 

In fact special relativity does not contradict everyday experience. It is just that we humans move through 
our world at speeds that are so much smaller than the speed of light that we are not aware of relativistic 
effects. The correctness of special relativity is confirmed every day in particle accelerators that smash particles 
together at highly relativistic speeds. 

See https://jila.colorado.edu/~ajsh/sr/ for animated versions of several of the diagrams in this Chapter.


1.1 Motivation 


The history of the development of special relativity is rich and human, and it is beyond the intended scope 
of this book to give any reasonable account of it. If you are interested in the history, I recommend starting 
with the popular account by Thorne (1994). 

As first proposed by James Clerk Maxwell in 1864, light is an electromagnetic wave. Maxwell believed 
(Goldman, 1984) that electromagnetic waves must be carried by some medium, the luminiferous aether, 
just as sound waves are carried by air. However, Maxwell knew that his equations of electromagnetism had 
empirical validity without any need for the hypothesis of an aether. 

For Albert Einstein, the theory of special relativity was motivated by the curious circumstance that 
Maxwell’s equations of electromagnetism seemed to imply that the speed of light was independent of the 
motion of an observer. Others before Einstein had noticed this curious feature of Maxwell’s equations. Joseph 
Larmor, Hendrik Lorentz, and Henri Poincaré all noticed that the form of Maxwell’s equations could be
preserved if lengths and times measured by an observer were somehow altered by motion through the aether. 
The transformations of special relativity were discovered before Einstein by Lorentz (1904), the name “Lorentz 
transformations” being conferred by Poincaré (1905). 

Einstein’s great contribution was to propose (Einstein, 1905) that there was no aether, no absolute spacetime.
From this simple and profound idea stemmed his theory of special relativity.


1.2 The postulates of special relativity 


The theory of special relativity can be derived formally from a small number of postulates: 

1. Space and time form a 4-dimensional continuum; 

2. The existence of globally inertial frames; 

3. The speed of light is constant; 

4. The principle of special relativity. 
The first two postulates are assertions about the structure of spacetime, while the last two postulates form 
the heart of special relativity. Most books mention just the last two postulates, but I think it is important 
to know that special (and general) relativity simply postulate the 4-dimensional character of spacetime, and 
that special relativity postulates moreover that spacetime is flat. 


1. Space and time form a 4-dimensional continuum. The correct mathematical word for continuum 
is manifold. A 4-dimensional manifold is defined mathematically to be a topological space that is locally 
homeomorphic to Euclidean 4-space $\mathbb{R}^4$.

The postulate that spacetime forms a 4-dimensional continuum is a generalization of the classical Galilean 
concept that space and time form separate 3 and 1 dimensional continua. The postulate of a 4-dimensional 
spacetime continuum is retained in general relativity. 

Physicists widely believe that this postulate must ultimately break down, that space and time are quantized
over extremely small intervals of space and time, the Planck length $\sqrt{G\hbar/c^3} \approx 10^{-35}$ m, and the Planck time
$\sqrt{G\hbar/c^5} \approx 10^{-43}$ s, where $G$ is Newton’s gravitational constant, $\hbar \equiv h/(2\pi)$ is Planck’s constant divided by
$2\pi$, and $c$ is the speed of light.
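
As a quick order-of-magnitude check of the Planck length and Planck time, here is a minimal sketch. The numerical values of G and ħ are rounded SI values supplied for illustration (they are not quoted in the text), and the variable names are arbitrary.

```python
import math

# Rounded SI values, assumed here for illustration only.
G = 6.674e-11       # Newton's gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34   # Planck's constant divided by 2 pi, J s
C = 299_792_458.0   # speed of light, m/s (exact by definition)

planck_length = math.sqrt(G * HBAR / C**3)  # metres
planck_time = math.sqrt(G * HBAR / C**5)    # seconds

print(f"Planck length ~ {planck_length:.2e} m")
print(f"Planck time   ~ {planck_time:.2e} s")
```

The Planck time is just the Planck length divided by c, which is why the two expressions differ only by the power of c under the square root.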


2. The existence of globally inertial frames. Statement: “There exist global spacetime frames with 
respect to which unaccelerated objects move in straight lines at constant velocity.” 

A spacetime frame is a system of coordinates for labelling space and time. Four coordinates are needed, 
because spacetime is 4-dimensional. A frame in which unaccelerated objects move in straight lines at constant
velocity is called an inertial frame. One can easily think of non-inertial frames: a rotating frame, an
accelerating frame, or simply a frame with some bizarre Dahlian labelling of coordinates. 

A globally inertial frame is an inertial frame that covers all of space and time. The postulate that 
globally inertial frames exist is carried over from classical mechanics (Newton’s first law of motion). 

Notice the subtle shift from the Newtonian perspective. The postulate is not that particles move in straight
lines, but rather that there exist spacetime frames with respect to which particles move in straight lines. 

Implicit in the assumption of the existence of globally inertial frames is the assumption that the geometry of 
spacetime is flat, the geometry of Euclid, where parallel lines remain parallel to infinity. In general relativity, 
this postulate is replaced by the weaker postulate that local (not global) inertial frames exist. A locally 
inertial frame is one which is inertial in a “small neighbourhood” of a spacetime point. In general relativity, 
spacetime can be curved. 


3. The speed of light is constant. Statement: “The speed of light c is a universal constant, the same in 
any inertial frame.” 

This postulate is the nub of special relativity. The immediate challenge of this Chapter, §1.3, is to confront 
its paradoxical implications, and to resolve them. 

Measuring speed requires being able to measure intervals of both space and time: speed is distance travelled 
divided by time elapsed. Inertial frames constitute a special class of spacetime coordinate systems; it is with 
respect to distance and time intervals in these special frames that the speed of light is asserted to be constant. 

In general relativity, arbitrarily weird coordinate systems are allowed, and light need move neither in 
straight lines nor at constant velocity with respect to bizarre coordinates (why should it, if the labelling 
of space and time is totally arbitrary?). However, general relativity asserts the existence of locally inertial 
frames, and the speed of light is a universal constant in those frames. 

In 1983, the General Conference on Weights and Measures officially defined the speed of light to be 


$$c = 299{,}792{,}458 \ \mathrm{m\,s^{-1}} , \tag{1.1}$$


and the metre, instead of being a primary measure, became a secondary quantity, defined in terms of the 
second and the speed of light. 


4. The principle of special relativity. Statement: “The laws of physics are the same in any inertial frame, 
regardless of position or velocity.” 

Physically, this means that there is no absolute spacetime, no absolute frame of reference with respect to 
which position and velocity are defined. Only relative positions and velocities between objects are meaningful. 

Mathematically, the principle of special relativity requires that the equations of special relativity be 
Lorentz covariant. 

It is to be noted that the principle of special relativity does not imply the constancy of the speed of light, 
although the postulates are consistent with each other. Moreover the constancy of the speed of light does 
not imply the Principle of Special Relativity, although for Einstein the former appears to have been the 
inspiration for the latter. 

An example of the application of the principle of special relativity is the construction of the energy-momentum
4-vector of a particle, which should have the same form in any inertial frame (§1.11).

1.3 The paradox of the constancy of the speed of light


The postulate that the speed of light is the same in any inertial frame leads immediately to a paradox. 
Resolution of this paradox compels a revolution in which space and time are united from separate 3 and 
1-dimensional continua into a single 4-dimensional continuum. 

Figure 1.1 shows Vermilion emitting a flash of light, which expands away from her in all directions. 
Vermilion thinks that the light moves outward at the same speed in all directions. So Vermilion thinks that 
she is at the centre of the expanding sphere of light. 

Figure 1.1 shows also Cerulean, who is moving away from Vermilion at about half the speed of light. But, 
says special relativity, Cerulean also thinks that the light moves outward at the same speed in all directions 
from him. So Cerulean should be at the centre of the expanding light sphere too. But he’s not, is he. Paradox! 

Figure 1.1 Vermilion emits a flash of light, which (from left to right) expands away from her in all directions. Since
the speed of light is constant in all directions, she finds herself at the centre of the expanding sphere of light. Cerulean 
is moving to the right at half of the speed of light relative to Vermilion. Special relativity declares that Cerulean too 
thinks that the speed of light is constant in all directions. So should not Cerulean think that he too is at the centre 
of the expanding sphere of light? Paradox! 


Concept question 1.1. Does light move differently depending on who emits it? Would the light 
have expanded differently if Cerulean had emitted the light? 


Exercise 1.2. Challenge problem: the paradox of the constancy of the speed of light. Can you 
figure out a solution to the paradox? Somehow you have to arrange that both Vermilion and Cerulean regard 
themselves as being in the centre of the expanding sphere of light. 


1.3.1 Spacetime diagram 


A spacetime diagram suggests a way of thinking, first advocated by Minkowski (1909), that leads to the 
solution of the paradox of the constancy of the speed of light. Indeed, spacetime diagrams provide the way 
to resolve all conceptual paradoxes in special relativity, so it is thoroughly worthwhile to understand them. 

A spacetime diagram, Figure 1.2, is a diagram in which the vertical axis represents time, while the
horizontal axis represents space. Really there are three dimensions of space, which can be thought of as
filling additional horizontal dimensions. But for simplicity a spacetime diagram usually shows just one spatial
dimension.

Figure 1.2 A spacetime diagram shows events in space and time. In a spacetime diagram, time goes upward, while
space dimensions are horizontal. Really there should be 3 space dimensions, but usually it suffices to show 1 spatial
dimension, as here. In a spacetime diagram, the units of space and time are chosen so that light goes one unit of
distance in one unit of time, i.e. the units are such that the speed of light is one, c = 1. Thus light moves upward and
outward at 45° from vertical in a spacetime diagram.

In a spacetime diagram, the units of space and time are chosen so that light goes one unit of distance in
one unit of time, i.e. the units are such that the speed of light is one, c = 1. Thus light always moves upward
at 45° from vertical in a spacetime diagram. Each point in 4-dimensional spacetime is called an event. Light
signals converging to or expanding from an event follow a 3-dimensional hypersurface called the lightcone.
Light converging on to an event is on the past lightcone, while light emerging from an event is on the
future lightcone.

Figure 1.3 Spacetime diagram of Vermilion emitting a flash of light. This is a spacetime diagram version of the
situation illustrated in Figure 1.1. The lines along which Vermilion and Cerulean move through spacetime are called
their worldlines. Each point in 4-dimensional spacetime is called an event. Light signals converging to or expanding
from an event follow a 3-dimensional hypersurface called the lightcone. In the diagram, the sphere of light expanding
from the emission event is following the future lightcone. There is also a past lightcone, not shown here.

Figure 1.3 shows a spacetime diagram of Vermilion emitting a flash of light, and Cerulean moving relative
to Vermilion at about ½ the speed of light. This is a spacetime diagram version of the situation illustrated in
Figure 1.1. The lines along which Vermilion and Cerulean move through spacetime are called their worldlines.

Consider again the challenge problem. The problem is to arrange that both Vermilion and Cerulean are 
at the centre of the lightcone, from their own points of view. 

Here’s a clue. Cerulean’s concept of space and time may not be the same as Vermilion’s. 


1.3.2 Centre of the lightcone 


The solution to the paradox is that Cerulean’s spacetime is skewed compared to Vermilion’s, as illustrated 
by Figure 1.4. The thing to notice in the diagram is that Cerulean is in the centre of the lightcone, according 
to the way Cerulean perceives space and time. Vermilion remains at the centre of the lightcone according 
to the way Vermilion perceives space and time. In the diagram Vermilion and her space are drawn at one 
“tick” of her clock past the point of emission, and likewise Cerulean and his space are drawn at one “tick” of 
his identical clock past the point of emission. Of course, from Cerulean’s point of view his spacetime is quite 
normal, and it is Vermilion’s spacetime that is skewed. 

In special relativity, the transformation between the spacetime frames of two inertial observers is called a
Lorentz transformation. In general, a Lorentz transformation consists of a spatial rotation about some
spatial axis, combined with a Lorentz boost by some velocity in some direction.

Figure 1.4 The solution to how both Vermilion and Cerulean can consider themselves to be at the centre of the
lightcone. Cerulean’s spacetime is skewed compared to Vermilion’s. Cerulean is in the centre of the lightcone, according
to the way Cerulean perceives space and time, while Vermilion remains at the centre of the lightcone according to the
way Vermilion perceives space and time. In the diagram Vermilion (red) and her space are drawn at one “tick” of her
clock past the point of emission, and likewise Cerulean (blue) and his space are drawn at one “tick” of his identical
clock past the point of emission.

Only space along the direction of motion gets skewed with time. Distances perpendicular to the direction 
of motion remain unchanged. Why must this be so? Consider two hoops which have the same size when at 
rest relative to each other. Now set the hoops moving towards each other. Which hoop passes inside the 
other? Neither! For suppose Vermilion thinks Cerulean’s hoop passed inside hers; by symmetry, Cerulean 
must think Vermilion’s hoop passed inside his; but both cannot be true; the only possibility is that the hoops 
remain the same size in directions perpendicular to the direction of motion. 

If you have understood all this, then you have understood the crux of special relativity, and you can 
now go away and figure out all the mathematics of Lorentz transformations. The mathematical problem is: 
what is the relation between the spacetime coordinates {t, x, y, z} and {t′, x′, y′, z′} of a spacetime interval,
a 4-vector, in Vermilion’s versus Cerulean’s frames, if Cerulean is moving relative to Vermilion at velocity v 
in, say, the x direction? The solution follows from requiring 

1. that both observers consider themselves to be at the centre of the lightcone, as illustrated by Figure 1.4, and

2. that distances perpendicular to the direction of motion remain unchanged, as illustrated by Figure 1.5.

An alternative version of the second condition is that a Lorentz transformation at velocity v followed by a
Lorentz transformation at velocity −v should yield the unit transformation.

Note that the postulate of the existence of globally inertial frames implies that Lorentz transformations 
are linear, that straight lines (4-vectors) in one inertial spacetime frame transform into straight lines in other 
inertial frames. 
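
The two requirements can be tested numerically against a candidate transformation. The sketch below assumes the standard form of a boost along x in units c = 1, namely t′ = γ(t − vx), x′ = γ(x − vt), y′ = y, z′ = z with γ = 1/√(1 − v²); that form is quoted here as an assumption rather than derived, and the function names `boost_x` and `interval` are hypothetical.

```python
import math

def boost_x(event, v):
    """Boost the event (t, x, y, z) by velocity v along x, in units c = 1.

    Assumes the standard boost form; it is not derived here.
    """
    t, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t), y, z)

def interval(event):
    """Spacetime interval -t^2 + x^2 + y^2 + z^2 (signature -+++)."""
    t, x, y, z = event
    return -t * t + x * x + y * y + z * z

v = 0.5
light_ray = (1.0, 0.6, 0.8, 0.0)    # an event on the lightcone of the origin
boosted = boost_x(light_ray, v)
round_trip = boost_x(boosted, -v)   # boost by v, then by -v

print(interval(light_ray), interval(boosted))  # both vanish up to rounding
print(boosted[2], boosted[3])                  # y and z are unchanged
print(round_trip)                              # recovers the original event
```

An event on the lightcone stays on the lightcone, the perpendicular coordinates are untouched, and a boost by v followed by a boost by −v returns the original event, which is the alternative form of the second condition.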

You will solve this problem in the next section but two, §1.6. As a prelude, the next two sections, §1.4 and 
§1.5 discuss simultaneity and time dilation. 


Figure 1.5 Same as Figure 1.4, but with Cerulean moving into the page instead of to the right. This is just Figure 1.4 
spatially rotated by 90° in the horizontal plane. Distances perpendicular to the direction of motion are unchanged. 

1.4 Simultaneity


Most (all?) of the apparent paradoxes of special relativity arise because observers moving at different velocities 
relative to each other have different notions of simultaneity. 


1.4.1 Operational definition of simultaneity 


How can simultaneity, the notion of events occurring at the same time at different places, be defined
operationally?

One way is illustrated in the sequences of spacetime diagrams in Figure 1.6. Vermilion surrounds herself 
with a set of mirrors, equidistant from Vermilion. She sends out a flash of light, which reflects off the mirrors 
back to Vermilion. How does Vermilion know that the mirrors are all the same distance from her? Because the 
reflected flash from the mirrors arrives back to Vermilion all at the same instant. Vermilion asserts that the 
light flash must have hit all the mirrors simultaneously. Vermilion also asserts that the instant when the light 
hit the mirrors must have been the instant, as registered by her wristwatch, precisely half way between the 
moment she emitted the flash and the moment she received it back again. If it takes, say, 2 seconds between 
flash and receipt, then Vermilion concludes that the mirrors are 1 lightsecond away from her. The spatial 
hyperplane passing through these events is a hypersurface of simultaneity. More generally, from Vermilion’s 
perspective, each horizontal hyperplane in the spacetime diagram is a hypersurface of simultaneity. 

Cerulean defines surfaces of simultaneity using the same operational setup: he encompasses himself with 
mirrors, arranging them so that a flash of light returns from them to him all at the same instant. But whereas 
Cerulean concludes that his mirrors are all equidistant from him and that the light bounces off them all at the 
same instant, Vermilion thinks otherwise. From Vermilion’s point of view, the light bounces off Cerulean’s 
mirrors at different times and moreover at different distances from Cerulean, as illustrated in Figure 1.7. 
Only so can the speed of light be constant, as Vermilion sees it, and yet the light return to Cerulean all at 
the same instant. 

Of course from Cerulean’s point of view all is fine: he thinks his mirrors are equidistant from him, and
that the light bounces off them all at the same instant. The inevitable conclusion is that Cerulean must
measure space and time along axes that are skewed relative to Vermilion’s. Events that happen at the same
time according to Cerulean happen at different times according to Vermilion; and vice versa. Cerulean’s
hypersurfaces of simultaneity are not the same as Vermilion’s.

Figure 1.6 How Vermilion defines hypersurfaces of simultaneity. She surrounds herself with (green) mirrors all at the
same distance. She sends out a light beam, which reflects off the mirrors, and returns to her all at the same moment.
She knows that the mirrors are all at the same distance precisely because the light returns to her all at the same
moment. The events where the light bounced off the mirrors define a hypersurface of simultaneity for Vermilion.

Figure 1.7 Cerulean defines hypersurfaces of simultaneity using the same operational setup as Vermilion: he bounces
light off (green) mirrors all at the same distance from him, arranging them so that the light returns to him all at the
same time. But from Vermilion’s frame, Cerulean’s experiment looks skewed, as shown here.

From Cerulean’s point of view, Cerulean remains always at the centre of the lightcone. Thus for Cerulean, 
as for Vermilion, the speed of light is constant, the same in all directions. 
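
The mirror experiment can be worked through in Vermilion’s frame with elementary kinematics. The following is a minimal sketch restricted to one spatial dimension, with one mirror ahead of Cerulean and one behind; the function name and the choice to specify the mirror separation as measured in Vermilion’s frame are assumptions of the sketch, not part of the text.

```python
def bounce_and_return_times(v, separation):
    """Times, in Vermilion's frame (c = 1), for Cerulean's mirror experiment.

    Cerulean moves along x at velocity v and emits a flash at t = 0, x = 0.
    One mirror rides ahead of him and one behind, each at the given
    separation from him as measured in Vermilion's frame.
    Returns (t_bounce_front, t_bounce_rear, t_return_front, t_return_rear).
    """
    # Front mirror follows x = v t + separation; outgoing light follows x = t.
    t_bounce_front = separation / (1.0 - v)
    # Rear mirror follows x = v t - separation; outgoing light follows x = -t.
    t_bounce_rear = separation / (1.0 + v)
    # Reflected light travels back at speed 1 until it meets Cerulean at x = v t.
    t_return_front = 2.0 * t_bounce_front / (1.0 + v)
    t_return_rear = 2.0 * t_bounce_rear / (1.0 - v)
    return t_bounce_front, t_bounce_rear, t_return_front, t_return_rear

tf, tr, rf, rr = bounce_and_return_times(v=0.5, separation=1.0)
print(tf, tr)   # the two bounces happen at different times for Vermilion
print(rf, rr)   # yet both reflections reach Cerulean at the same instant
```

For Vermilion the bounce off the mirror ahead happens later than the bounce off the mirror behind, yet both reflections arrive back at Cerulean together, so the two bounce events that Cerulean calls simultaneous are not simultaneous for Vermilion.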


1.5 Time dilation 


Vermilion and Cerulean construct identical clocks, Figure 1.8, consisting of a light beam which bounces off a 
mirror. Tick, the light beam hits the mirror, tock, the beam returns to its owner. As long as Vermilion and 
Cerulean remain at rest relative to each other, both agree that each other’s clock tick-tocks at the same rate 
as their own. 

But now suppose Cerulean goes off at velocity v relative to Vermilion, in a direction perpendicular to the 
direction of the mirror. As far as Cerulean is concerned, his clock tick-tocks at the same rate as before, a tick
at the mirror, a tock on return. But from Vermilion’s point of view, although the distance between Cerulean 
and his mirror at any instant remains the same as before, the light has farther to go. And since the speed 
of light is constant, Vermilion thinks it takes longer for Cerulean’s clock to tick-tock than her own. Thus 
Vermilion thinks Cerulean’s clock runs slow relative to her own. 

Figure 1.8 Vermilion and Cerulean construct identical clocks, consisting of a light beam that bounces off a (green)
mirror and returns to them. In the left panel, Cerulean is at rest relative to Vermilion. They both agree that their
clocks are identical. In the middle panel, Cerulean is moving to the right at speed v relative to Vermilion. The vertical
distance to the mirror is unchanged by Cerulean’s motion in a direction orthogonal to the direction to the mirror.
Whereas Cerulean thinks his clock ticks at the usual rate, Vermilion sees the path of the light taken by Cerulean’s
clock is longer, by a factor γ, than the path of light taken by her own clock. Since the speed of light is constant,
Vermilion thinks Cerulean’s clock takes longer to tick, by a factor γ, than her own. The sides of the triangle formed
by the distance 1 to the mirror, the length γ of the lightpath to Cerulean’s clock, and the distance γv travelled by
Cerulean, form a right-angled triangle, illustrated in the right panel.


1.5.1 Lorentz gamma factor 


How much slower does Cerulean’s clock run, from Vermilion’s point of view? In special relativity the factor 
is called the Lorentz gamma factor γ, introduced by the Dutch physicist Hendrik A. Lorentz in 1904, one 
year before Einstein proposed his theory of special relativity. 

In units where the speed of light is one, c = 1, Vermilion’s mirror in Figure 1.8 is one tick away from her, 
and from her point of view the vertical distance between Cerulean and his mirror is the same, one tick. But 
Vermilion thinks that the distance travelled by the light beam between Cerulean and his mirror is γ ticks. 
Cerulean is moving at speed v, so Vermilion thinks he moves a distance of γv ticks during the γ ticks of time 
taken by the light to travel from Cerulean to his mirror. Thus, from Vermilion’s point of view, the vertical 
line from Cerulean to his mirror, Cerulean’s light beam, and Cerulean’s path form a triangle with sides 1, 
γ, and γv, as illustrated in Figure 1.8. Pythagoras’ theorem implies that 


$$1^2 + (\gamma v)^2 = \gamma^2 . \tag{1.2}$$


From this it follows that the Lorentz gamma factor γ is related to Cerulean’s velocity v by 


$$\gamma = \frac{1}{\sqrt{1 - v^2}} , \tag{1.3}$$


which is Lorentz’s famous formula. 
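
As a quick numerical check of equations (1.2) and (1.3), the following Python sketch evaluates γ for a few speeds and confirms that the triangle with sides 1, γv, and γ is right-angled. The function name `lorentz_gamma` and the sample speeds are choices made here for illustration, not notation from the text.

```python
import math

def lorentz_gamma(v):
    """Lorentz gamma factor for speed v, in units where c = 1 (requires |v| < 1)."""
    if abs(v) >= 1.0:
        raise ValueError("speed must satisfy |v| < 1 in units where c = 1")
    return 1.0 / math.sqrt(1.0 - v * v)

for v in (0.0, 0.6, 0.8, 0.99):
    gamma = lorentz_gamma(v)
    # Pythagoras, equation (1.2): 1^2 + (gamma v)^2 = gamma^2
    residual = 1.0 + (gamma * v) ** 2 - gamma ** 2
    print(f"v = {v:4.2f}  gamma = {gamma:7.4f}  residual = {residual:.1e}")
```

The residual should vanish up to floating-point rounding for every speed below that of light.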


1.6 Lorentz transformation 


A Lorentz transformation is a rotation of space and time. Lorentz transformations form a 6-dimensional 
group, with 3 dimensions from spatial rotations, and 3 dimensions from Lorentz boosts. 

If you wish to understand special relativity mathematically, then it is essential for you to go through the 
exercise of deriving the form of Lorentz transformations for yourself. Indeed, this problem is the challenge 
problem posed in §1.3, recast as a mathematical exercise. For simplicity, it is enough to consider the case of 
a Lorentz boost by velocity v along the x-axis. 

You can derive the form of a Lorentz transformation either pictorially (geometrically), or algebraically. 
Ideally you should do both. 


Figure 1.9 Spacetime diagram representing the experiments shown in Figures 1.6 and 1.7. The right panel shows a 
detail of how the spacetime diagram can be drawn using only a straight edge and a compass. If Cerulean’s position is 
drawn first, then Vermilion’s position follows from drawing the arc as shown. 


Exercise 1.3. Pictorial derivation of the Lorentz transformation. Construct, with ruler and compass, 
a spacetime diagram that looks like the one in Figure 1.9. You should recognize that the square represents the 
paths of lightrays that Vermilion uses to define a hypersurface of simultaneity, while the rectangle represents 
the same thing for Cerulean. Notice that Cerulean’s worldline and line of simultaneity are diagonals along his 
light rectangle, so the angles between those lines and the lightcone are equal. Notice also that the areas of the 
square and the rectangle are the same, which expresses the fact that the area is multiplied by the determinant 
of the Lorentz transformation matrix, which must be one (why?). Use your geometric construction to derive 
the mathematical form of the Lorentz transformation. 


Exercise 1.4. 3D model of the Lorentz transformation. Make a 3D spacetime diagram of the Lorentz 
transformation, something like that in Figure 1.4, with not only an x-dimension, as in Exercise 1.3, but also 
a y-dimension. You can use a 3D computer modelling program, or you can make a real 3D model. Make the 
lightcone from flexible paperboard, the spatial hypersurface of simultaneity from stiff paperboard, and the 
worldline from wooden dowel. 


Exercise 1.5. Mathematical derivation of the Lorentz transformation. Relative to person A (Vermilion, 
unprimed frame), person B (Cerulean, primed frame) moves at velocity v along the x-axis. Derive 
the form of the Lorentz transformation between the coordinates (t, x, y, z) of a 4-vector in A’s frame and the 
corresponding coordinates (t′, x′, y′, z′) in B’s frame from the assumptions: 

1. that the transformation is linear; 
2. that the spatial coordinates in the directions orthogonal to the direction of motion are unchanged; 
3. that the speed of light c is the same for both A and B, so that x = t in A’s frame transforms to x′ = t′ 
in B’s frame, and likewise x = −t in A’s frame transforms to x′ = −t′ in B’s frame; 
4. the definition of speed; if B is moving at speed v relative to A, then x = vt in A’s frame transforms to 
x′ = 0 in B’s frame; 
5. spatial isotropy; specifically, show that if A thinks B is moving at velocity v, then B must think that A 
is moving at velocity −v, and symmetry (spatial isotropy) between these two situations then fixes the 
Lorentz γ factor. 


Your logic should be precise, and explained in clear, concise English. 


You should find that the Lorentz transformation for a Lorentz boost by velocity v along the x-axis is 


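$$
\begin{aligned}
t' &= \gamma t - \gamma v x , &\qquad t &= \gamma t' + \gamma v x' , \\
x' &= -\gamma v t + \gamma x , &\qquad x &= \gamma v t' + \gamma x' , \\
y' &= y , &\qquad y &= y' , \\
z' &= z , &\qquad z &= z' .
\end{aligned}
\tag{1.4}
$$

The transformation can be written more elegantly in matrix notation: 

$$
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix}
=
\begin{pmatrix}
\gamma & -\gamma v & 0 & 0 \\
-\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} ,
\tag{1.5}
$$

with inverse 

$$
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix}
\gamma & \gamma v & 0 & 0 \\
\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} .
\tag{1.6}
$$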


A Lorentz transformation at velocity v followed by a Lorentz transformation at velocity v in the opposite 
direction, i.e. at velocity —v, yields the unit transformation, as it should: 


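$$
\begin{pmatrix}
\gamma & \gamma v & 0 & 0 \\
\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\gamma & -\gamma v & 0 & 0 \\
-\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{1.7}
$$

The determinant of the Lorentz transformation is one, as it should be: 

$$
\begin{vmatrix}
\gamma & -\gamma v & 0 & 0 \\
-\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{vmatrix}
= \gamma^2 (1 - v^2) = 1 .
\tag{1.8}
$$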


Indeed, requiring that the determinant be one provides another derivation of the formula (1.3) for the Lorentz 
gamma factor. 
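
The matrix statements (1.5) to (1.8) are easy to check numerically. The sketch below uses plain Python lists rather than a matrix library; the helper names `boost_x`, `matmul`, and `det_tx`, and the sample velocity v = 0.6, are assumptions of this illustration. The determinant helper relies on the block structure of a boost along x and is not a general 4 × 4 determinant.

```python
import math

def boost_x(v):
    """4x4 Lorentz boost by velocity v along the x-axis (c = 1), as in equation (1.5)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def det_tx(m):
    """Determinant of a boost along x: only the 2x2 (t, x) block is non-trivial."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

v = 0.6
L = boost_x(v)
L_inv = boost_x(-v)          # boost at velocity -v, equation (1.6)
product = matmul(L_inv, L)   # should be the unit matrix, equation (1.7)
print(product)
print(det_tx(L))             # should be 1 up to rounding, equation (1.8)
```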


Concept question 1.6. Determinant of a Lorentz transformation. Why must the determinant of a 
Lorentz transformation be one? 


1.7 Paradoxes: Time dilation, Lorentz contraction, and the Twin paradox 


There are several classic paradoxes in special relativity. One of them has already been met above, the paradox 
of the constancy of the speed of light in §1.3. This section collects three famous paradoxes: time dilation, 
Lorentz contraction, and the Twin paradox. 

If you wish to understand special relativity conceptually, then you should work through all these paradoxes 
yourself. As remarked in §1.4, most (all?) paradoxes in special relativity arise because different observers 
have different notions of simultaneity, and most (all?) paradoxes can be solved using spacetime diagrams. 

The Twin paradox is particularly helpful because it illustrates several different facets of special relativity, 
not only time dilation, but also how light travel time modifies what an observer actually sees. 


1.7.1 Time dilation 


If a timelike interval {t,r} corresponds to motion at velocity v, then r = vt. The proper time along the 
interval is 


$$\tau = \sqrt{t^2 - r^2} = t \sqrt{1 - v^2} = \frac{t}{\gamma} . \tag{1.9}$$


This is Lorentz time dilation: the proper time interval τ experienced by a moving person is a factor γ less 
than the time interval t according to an onlooker. 


1.7.2 Fitzgerald-Lorentz contraction 


Consider a rocket of proper length l, so that in the rocket’s own rest frame (primed) the back and front ends 
of the rocket move through time t′ with coordinates 


$$\{t', x'\} = \{t', 0\} \quad\text{and}\quad \{t', l\} . \tag{1.10}$$

From the perspective of an observer who sees the rocket move at velocity v in the x-direction, the worldlines 
of the back and front ends of the rocket are at 


$$\{t, x\} = \{\gamma t', \gamma v t'\} \quad\text{and}\quad \{\gamma t' + \gamma v l, \ \gamma v t' + \gamma l\} . \tag{1.11}$$


However, the observer measures the length of the rocket simultaneously in their own frame, not the rocket 
frame. Solving for γt′ = t at the back and γt′ + γvl = t at the front gives 


$$\{t, x\} = \{t, v t\} \quad\text{and}\quad \left\{ t, \ v t + \frac{l}{\gamma} \right\} , \tag{1.12}$$


which says that the observer measures the front end of the rocket to be a distance l/γ ahead of the back 
end. This is Lorentz contraction: an object of proper length l is measured by a moving person to be shorter 
by a factor γ. 
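
The argument leading to equation (1.12) can be traced step by step in code. The following Python sketch locates both ends of the rocket at one and the same observer time t and compares their separation with l/γ. The function name and the sample values (v = 0.8, l = 1, t = 2) are assumptions made for illustration.

```python
import math

def rocket_ends_at_observer_time(t, v, length):
    """Observer-frame x positions of the back and front of a rocket at observer time t.

    The rocket has proper length `length` and moves at velocity v (c = 1).
    Each end is located by choosing the rocket-frame time t' at which that end
    has observer time t, then applying the inverse boost of equation (1.4).
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    # Back end sits at x' = 0: t = gamma t'  =>  t' = t / gamma.
    t_back = t / gamma
    x_back = gamma * v * t_back
    # Front end sits at x' = length: t = gamma t' + gamma v length.
    t_front = (t - gamma * v * length) / gamma
    x_front = gamma * v * t_front + gamma * length
    return x_back, x_front

v, length = 0.8, 1.0
x_back, x_front = rocket_ends_at_observer_time(t=2.0, v=v, length=length)
gamma = 1.0 / math.sqrt(1.0 - v * v)
print(x_front - x_back, length / gamma)   # both are the contracted length
```

Note that the two ends are sampled at different rocket-frame times t′, which is exactly the difference in simultaneity that produces the contraction.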


Exercise 1.7. Time dilation. On a spacetime diagram such as that in the left panel of Figure 1.10, show 
how two observers moving relative to each other can both consider the other’s clock to run slow compared 
to their own. 

Figure 1.10 (Left) Time dilation, and (right) Lorentz contraction spacetime diagrams. 


Exercise 1.8. Lorentz contraction. On a spacetime diagram such as that in the right panel of Figure 1.10, 
show how two observers moving relative to each other can both consider the other to be contracted along 
the direction of motion. 


Concept question 1.9. Is one side of a cube shorter than the other? Figure 1.11 shows a picture 
of a 3-dimensional cube. Is one edge shorter than the other? Projected on to the page, it appears so, but in 
reality all the edges have equal length. In what ways is this situation similar or dissimilar to time dilation 
and Lorentz contraction in 4-dimensional relativity? 

Figure 1.11 A cube. Are the lengths of its sides all equal? 


Exercise 1.10. Twin paradox. Your twin leaves you on Earth and travels to the spacestation Alpha, 
ℓ = 3 lyr away, at a good fraction of the speed of light, then immediately returns to Earth at the same speed. 
Figure 1.12 shows on a spacetime diagram the corresponding worldlines of both you and your twin. Aside 
from part 1 and the first part of 2, you should derive your answers mathematically, using logic and Lorentz 
transformations. However, the diagram is accurately drawn, and you should be able to check your answers 
by measuring. 


1. On a spacetime diagram such as that in Figure 1.12, label the worldlines of you and your twin. Draw the 
worldline of a light signal which travels from you on Earth, hits Alpha just when your twin arrives, 
and immediately returns to Earth. Draw the twin’s “now” (line of simultaneity) when just arriving at 
Alpha, and the twin’s “now” (line of simultaneity) just departing from Alpha (in the first case the twin 
is moving toward Alpha, while in the second case the twin is moving back toward Earth). 

2. From the diagram, measure the twin’s speed v relative to you, in units where the speed of light is unity, 
c = 1. Deduce the Lorentz gamma factor γ, and the redshift factor 1 + z = [(1 + v)/(1 − v)]^{1/2}, in the 
cases (i) where the twin is receding, and (ii) where the twin is approaching. 

3. Choose the spacetime origin to be the event where the twin leaves Earth. Argue that the position 
4-vector of the twin on arrival at Alpha is 

$$\{t, x, y, z\} = \{\ell/v, \ell, 0, 0\} . \tag{1.13}$$

Lorentz transform this 4-vector to determine the position 4-vector of the twin on arrival at Alpha, in 
the twin’s frame. Express your answer first in terms of ℓ, v, and γ, and then in (light)years. State in 
words what this position 4-vector means. 

4. How much do you and your twin age respectively during the round trip to Alpha and back? What is 
the ratio of these ages? Express your answers first in terms of ℓ, v, and γ, and then in years. 

5. What is the distance between the Earth and Alpha from the twin’s point of view? What is the ratio 
of this distance to the distance between Earth and Alpha from your point of view? Explain how you 
arrived at your result. Express your answer first in terms of ℓ, v, and γ, and then in lightyears. 

6. You watch your twin through a telescope. How much time do you see (through the telescope) elapse 
on your twin’s wristwatch between launch and arrival on Alpha? How much time passes on your own 
wristwatch during this time? What is the ratio of these two times? Express your answers first in terms 
of ℓ, v, and γ, and then in years. 

7. On arrival at Alpha, your twin looks back through a telescope at your wristwatch. How much time does 
your twin see (through the telescope) has elapsed since launch on your watch? How much time has 
elapsed on the twin’s own wristwatch during this time? What is the ratio of these two times? Express 
your answers first in terms of ℓ, v, and γ, and then in years. 

8. You continue to watch your twin through a telescope. How much time elapses on your twin’s wristwatch, 
as seen by you through the telescope, during the twin’s journey back from Alpha to Earth? How much 
time passes on your own watch as you watch (through the telescope) the twin journey back from Alpha 
to Earth? What is the ratio of these two times? Express your answers first in terms of ℓ, v, and γ, and 
then in years. 

9. During the journey back from Alpha to Earth, your twin likewise continues to look through a telescope 
at the time registered on your watch. How much time passes on your wristwatch, as seen by your twin 
through the telescope, during the journey back? How much time passes on the twin’s wristwatch from 
the twin’s point of view during the journey back? What is the ratio of these two times? Express your 
answers first in terms of ℓ, v, and γ, and then in years. 

Figure 1.12 Twin paradox spacetime diagram. 


Concept question 1.11. What breaks the symmetry between you and your twin? From your 
point of view, you saw the twin recede from you at velocity v on the outbound journey, then approach you 
at velocity v on the inbound journey. But the twin saw essentially the same thing: from the twin’s point of 
view, the twin saw you recede at velocity v on the outbound journey, then approach the twin at velocity 
v on the inbound journey. Isn’t the situation symmetrical, so shouldn’t you and the twin age identically? 
What breaks the symmetry, allowing your twin to age less? 


1.8 The spacetime wheel 
1.8.1 Wheel 


Figure 1.13 shows an ordinary 3-dimensional wheel. As the wheel rotates, a point on the wheel describes an 
invariant circle. The coordinates {x,y} of a point on the wheel relative to its centre change, but the distance 
r between the point and the centre remains constant 


$$r^2 = x^2 + y^2 = \text{constant} . \tag{1.14}$$

More generally, the coordinates {x, y, z} of the interval between any two points in 3-dimensional space (a 
vector) change when the coordinate system is rotated in 3 dimensions, but the separation r of the two points 
remains constant 


$$r^2 = x^2 + y^2 + z^2 = \text{constant} . \tag{1.15}$$


Figure 1.13 A wheel. 

Figure 1.14 A spacetime wheel. 


1.8.2 Spacetime wheel 


Figure 1.14 shows a spacetime wheel. The diagram here is a spacetime diagram, with time t vertical and 
space x horizontal. A rotation between time t and space x is a Lorentz boost in the x-direction. As the 
spacetime wheel boosts, a point on the wheel describes an invariant hyperbola. The spacetime coordinates 
{t, x} of a point on the wheel relative to its centre change, but the spacetime separation s between the point 
and the centre remains constant 


$$s^2 = -t^2 + x^2 = \text{constant} . \tag{1.16}$$

More generally, the coordinates {t, x, y, z} of the interval between any two events in 4-dimensional spacetime 
(a 4-vector) change when the coordinate system is boosted or rotated, but the spacetime separation s of the 
two events remains constant, 


$$s^2 = -t^2 + x^2 + y^2 + z^2 = \text{constant} . \tag{1.17}$$


1.8.3 Lorentz boost as a rotation by an imaginary angle 


The − sign instead of a + sign in front of the t² in the spacetime separation formula (1.17) means that time 
t can often be treated mathematically as if it were an imaginary spatial dimension. That is, t = iw where 
i = √−1 and w is a “fourth spatial coordinate.” 

A Lorentz boost by a velocity v can likewise be treated as a rotation by an imaginary angle. Consider a 
normal spatial rotation in which a primed frame is rotated in the wx-plane clockwise by an angle a about 
the origin, relative to the unprimed frame. The relation between the coordinates {w′, x′} and {w, x} of a 
point in the two frames is 

$$
\begin{pmatrix} w' \\ x' \end{pmatrix}
=
\begin{pmatrix} \cos a & -\sin a \\ \sin a & \cos a \end{pmatrix}
\begin{pmatrix} w \\ x \end{pmatrix} .
\tag{1.18}
$$

Now set t = iw and α = ia with t and α both real. In other words, take the spatial coordinate w to be 
imaginary, and the rotation angle a likewise to be imaginary. Then the rotation formula above becomes 

$$
\begin{pmatrix} t' \\ x' \end{pmatrix}
=
\begin{pmatrix} \cosh\alpha & -\sinh\alpha \\ -\sinh\alpha & \cosh\alpha \end{pmatrix}
\begin{pmatrix} t \\ x \end{pmatrix} .
\tag{1.19}
$$
This agrees with the usual Lorentz transformation formula (1.5) if the boost velocity v and boost angle α 
are related by 

$$v = \tanh\alpha , \tag{1.20}$$

so that 

$$\gamma = \cosh\alpha , \quad \gamma v = \sinh\alpha . \tag{1.21}$$


The boost angle α is commonly called the rapidity. This provides a convenient way to add velocities in 
special relativity: the rapidities simply add (for boosts along the same direction), just as spatial rotation 
angles add (for rotations about the same axis). Thus a boost by velocity v₁ = tanh α₁ followed by a boost 
by velocity v₂ = tanh α₂ in the same direction gives a net velocity boost of v = tanh α where 


$$\alpha = \alpha_1 + \alpha_2 . \tag{1.22}$$


The equivalent formula for the velocities themselves is 


$$v = \frac{v_1 + v_2}{1 + v_1 v_2} , \tag{1.23}$$


the special relativistic velocity addition formula. 
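
A short numerical comparison shows that adding rapidities, equation (1.22), and applying the velocity addition formula (1.23) give the same result. This is a minimal Python sketch; the function names and sample velocities are assumptions of the illustration.

```python
import math

def add_velocities(v1, v2):
    """Relativistic sum of two collinear velocities (c = 1), equation (1.23)."""
    return (v1 + v2) / (1.0 + v1 * v2)

def add_via_rapidity(v1, v2):
    """Same sum computed by adding rapidities, equations (1.20) and (1.22)."""
    return math.tanh(math.atanh(v1) + math.atanh(v2))

v1, v2 = 0.5, 0.5
print(add_velocities(v1, v2))     # 0.8
print(add_via_rapidity(v1, v2))   # agrees with the line above up to rounding

# Two boosts at 0.9 still combine to less than the speed of light.
print(add_velocities(0.9, 0.9))
```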


1.8.4 Trip across the Universe at constant acceleration 


Suppose that you took a trip across the Universe in a spaceship, accelerating all the time at one Earth 
gravity g. How far would you travel in how much time? 

The spacetime wheel offers a cute way to solve this problem, since the rotating spacetime wheel can be 
regarded as representing spacetime frames undergoing constant acceleration. Points on the right quadrant of 
the rotating spacetime wheel, Figure 1.15, represent worldlines of persons who accelerate with constant 
acceleration in their own frame. The spokes of the spacetime wheel are lines of simultaneity for the accelerating 
persons. 

If the units of space and time are chosen so that the speed of light and the gravitational acceleration are 
both one, c = g = 1, then the proper time experienced by the accelerating person is the rapidity α, and the 
time and space coordinates of the accelerating person, relative to a person who remains at rest, are those of 
a point on the spacetime wheel, namely 


$$\{t, x\} = \{\sinh\alpha, \cosh\alpha\} . \tag{1.24}$$

Figure 1.15 The right quadrant of the spacetime wheel represents the worldlines and lines of simultaneity of persons 
who accelerate in the x direction with uniform acceleration in their own frames. 


In the case where the acceleration is one Earth gravity, g = 9.80665 m s⁻², the unit of time is 


$$\frac{c}{g} = \frac{299{,}792{,}458 \ \mathrm{m\,s^{-1}}}{9.80665 \ \mathrm{m\,s^{-2}}} = 0.97 \ \mathrm{yr} , \tag{1.25}$$


just short of one year. For simplicity, Table 1.1, which tabulates some milestones along the way, takes the 
unit of time to be exactly one year, which would be the case if you were accelerating at 0.97 g = 9.5 m s⁻². 


Table 1.1: Trip across the Universe. 

Time elapsed    Time elapsed    Distance travelled    To
on spaceship    on Earth        in lightyears
in years        in years
α               sinh α          cosh α − 1
0               0               0                     Earth (starting point)
1               1.175           0.5431
2               3.627           2.762
2.34            5.12            4.22                  Proxima Cen
3.962           26.3            25.3                  Vega
6.60            368             367                   Pleiades
10.9            2.7 × 10⁴       2.7 × 10⁴             Centre of Milky Way
15.4            2.44 × 10⁶      2.44 × 10⁶            Andromeda galaxy
18.4            4.9 × 10⁷       4.9 × 10⁷             Virgo cluster
19.2            1.1 × 10⁸       1.1 × 10⁸             Coma cluster
25.3            5 × 10¹⁰        5 × 10¹⁰              Edge of observable Universe

After a slow start, you cover ground at an ever increasing rate, crossing 50 billion lightyears, the distance 
to the edge of the currently observable Universe, in just over 25 years of your own time. 
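
The unit of time in equation (1.25) and the entries of Table 1.1 follow directly from equation (1.24), and can be reproduced with a few lines of Python. In this sketch the function name `trip`, the selection of rows, and the use of a Julian year of 365.25 days to convert seconds to years are assumptions made for illustration.

```python
import math

C = 299_792_458.0                        # speed of light in m/s
G_EARTH = 9.80665                        # one Earth gravity in m/s^2
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # Julian year (an assumption of this sketch)

# Unit of time c/g expressed in years, equation (1.25).
unit_of_time_yr = C / G_EARTH / SECONDS_PER_YEAR
print(f"c/g = {unit_of_time_yr:.3f} yr")

def trip(alpha):
    """Earth time and distance for proper time alpha at unit acceleration (c = g = 1).

    Follows equation (1.24): t = sinh(alpha), x = cosh(alpha); the distance
    travelled from the starting point x = 1 is cosh(alpha) - 1.
    """
    return math.sinh(alpha), math.cosh(alpha) - 1.0

for alpha in (1.0, 2.0, 3.962, 10.9, 25.3):
    earth_time, distance = trip(alpha)
    print(f"{alpha:6.3f}  {earth_time:10.4g}  {distance:10.4g}")
```

For large α both sinh α and cosh α approach e^α/2, which is why the last two columns of the table become indistinguishable.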

Does this mean you go faster than the speed of light? No. From the point of view of a person at rest 
on Earth, you never go faster than the speed of light. From your own point of view, distances along your 
direction of motion are Lorentz-contracted, so distances that are vast from Earth’s point of view appear 
much shorter to you. Fast as the Universe rushes by, it never goes faster than the speed of light. 

This rosy picture of being able to flit around the Universe has drawbacks. Firstly, it would take a huge 
amount of energy to keep you accelerating at g. Secondly, you would use up a huge amount of Earth time 
travelling around at relativistic speeds. If you took a trip to the edge of the Universe, then by the time 
you got back not only would all your friends and relations be dead, but the Earth would probably be gone, 
swallowed by the Sun in its red giant phase, the Sun would have exhausted its fuel and shrivelled into a 
cold white dwarf star, and the Solar System, having orbited the Galaxy a thousand times, would be lost 
somewhere in its milky ways. 

Technical point. The Universe is expanding, so the distance to the edge of the currently observable Universe 
is increasing. Thus it would actually take longer than indicated in the table to reach the edge of the currently 
observable Universe. Moreover if the Universe is accelerating, as evidence from the Hubble diagram of Type Ia 
Supernovae indicates, then you will never be able to reach the edge of the currently observable Universe, 
however fast you go. 


Exercise 1.12. Length of a particle accelerator that reaches the GUT or Planck scale. Consider 
a linear particle accelerator able to accelerate particles at constant acceleration g in the particles’ own 
frame. 
1. How long must the particle accelerator be to reach a Lorentz gamma factor of γ? 
2. Estimate the acceleration g for a contemporary accelerator such as the Large Hadron Collider. 
3. Estimate the length of a particle accelerator needed to accelerate a proton, rest mass 1 GeV, to a GUT 
energy of 10¹⁶ GeV, or alternatively to a Planck energy of 10¹⁹ GeV. 
4. Show that a GUT density of 1 GUT mass per (GUT length)³ is about 10⁸¹ times the density of water. 
Approximately what is the Planck density relative to the density of water? 
5. To what Lorentz γ factor would you have to accelerate two rocks so that they achieve a GUT or Planck 
density when slammed together? How long would the particle accelerator be to achieve this γ factor? 
Solution. 
1. The rapidity α achieved by a particle that accelerates at constant acceleration g in its own frame for a 
proper time τ is 


$$\alpha = \frac{g \tau}{c} . \tag{1.26}$$


The Lorentz gamma factor γ is related to the rapidity by γ = cosh α, equation (1.21). The distance x 
the particle moves in the background frame is 


$$x = \frac{c^2}{g} (\cosh\alpha - 1) = \frac{c^2}{g} (\gamma - 1) . \tag{1.27}$$


In the highly relativistic regime, γ ≫ 1, the distance travelled is 

$$x \approx \frac{c^2 \gamma}{g} . \tag{1.28}$$


The distance x increases linearly with γ. 

2. The Large Hadron Collider (LHC) accelerates protons and heavier nuclei to energies of order 1 TeV, 
whereat a proton has a gamma factor of γ ∼ 10³. The acceleration occurs over scales of kilometres, or 
10³ m. So the acceleration is about one per metre, 


$$\frac{g}{c^2} \approx 1 \ \mathrm{m}^{-1} . \tag{1.29}$$


3. A GUT energy of 10¹⁶ GeV requires a gamma factor of 10¹⁶, hence a particle accelerator of length 

$$x \approx 10^{16} \ \mathrm{m} \approx 1 \ \mathrm{lyr} . \tag{1.30}$$

A Planck energy of 10¹⁹ GeV requires a particle accelerator of length 

$$x \approx 10^{19} \ \mathrm{m} \approx 1000 \ \mathrm{lyr} . \tag{1.31}$$


4. The Planck energy 10¹⁹ GeV is 10³ times higher than the GUT energy 10¹⁶ GeV. The Planck density is then 
(10³)⁴ = 10¹² times higher than the GUT density of 10⁸¹ gm cm⁻³. The Planck density is 10⁹³ gm cm⁻³. 
5. When two objects are slammed together at Lorentz factor γ, the energy of each object is enhanced by 
a factor γ, and the length of each object is contracted along the direction of motion by another factor 
of γ, so overall the density is increased by a factor of γ². To reach a GUT density of 10⁸¹ gm cm⁻³ 
by slamming together two rocks of initial density say 10 gm cm⁻³ would require a gamma factor of 
√(10⁸⁰) = 10⁴⁰. This would require a particle accelerator of length 10⁴⁰ m, or 10²⁴ lyr, or about 10¹⁴ 
times the size of the observable Universe. 
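
The order-of-magnitude estimates in parts 1 to 3 of the solution can be reproduced with the sketch below. It assumes the LHC-scale value g/c² ≈ 1 m⁻¹ from equation (1.29) and a rounded conversion of 9.4607 × 10¹⁵ m per lightyear; the function and constant names are hypothetical, chosen for this illustration.

```python
METRES_PER_LIGHTYEAR = 9.4607e15  # one lightyear in metres (rounded)

def accelerator_length_m(gamma, g_over_c2=1.0):
    """Length in metres needed to reach Lorentz factor gamma, equation (1.27).

    g_over_c2 is the acceleration divided by c^2, in units of 1/metre; the
    default of 1 per metre is the LHC-scale estimate of equation (1.29).
    """
    return (gamma - 1.0) / g_over_c2

# A proton of rest mass 1 GeV needs gamma = 1e16 (GUT) or 1e19 (Planck).
for label, gamma in (("GUT", 1e16), ("Planck", 1e19)):
    x = accelerator_length_m(gamma)
    print(f"{label}: {x:.1e} m = {x / METRES_PER_LIGHTYEAR:.1e} lyr")
```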


1.9 Scalar spacetime distance 


The fact that Lorentz transformations leave unchanged a certain distance, the spacetime distance, between 
any two events in spacetime is one of the most fundamental features of Lorentz transformations. The scalar 
spacetime distance Δs between two events separated by {Δt, Δx, Δy, Δz} is given by 

$$\Delta s^2 = -\Delta t^2 + \Delta r^2 = -\Delta t^2 + \Delta x^2 + \Delta y^2 + \Delta z^2 . \tag{1.32}$$

A quantity such as Δs² that remains unchanged under any Lorentz transformation is called a scalar. You 
should check yourself that Δs² is unchanged under Lorentz transformations (see Exercise 1.14). Lorentz 
transformations can be defined as linear spacetime transformations that leave Δs² invariant. 

The single scalar spacetime squared interval Δs² replaces the two scalar quantities 


$$
\begin{array}{ll}
\text{time interval} & \Delta t \\
\text{spatial interval} & \Delta r = \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}
\end{array}
\tag{1.33}
$$


of classical Galilean spacetime. 


1.9.1 Timelike, lightlike, spacelike 


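The scalar spacetime distance squared Δs², equation (1.32), between two events can be negative, zero, or 
positive. A spacetime interval {Δt, Δx, Δy, Δz} = {Δt, Δr} is called 


$$
\begin{array}{lll}
\text{timelike} & \text{if } \Delta t > \Delta r & \text{or equivalently if } \Delta s^2 < 0 , \\
\text{null or lightlike} & \text{if } \Delta t = \Delta r & \text{or equivalently if } \Delta s^2 = 0 , \\
\text{spacelike} & \text{if } \Delta t < \Delta r & \text{or equivalently if } \Delta s^2 > 0 ,
\end{array}
\tag{1.34}
$$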


as illustrated in Figure 1.16. 
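
The classification (1.34) translates directly into code. The sketch below is a minimal Python illustration; the function names are assumptions, and the exact comparison with zero is adequate only for inputs like the ones shown. Measured or computed intervals would need a tolerance.

```python
def interval_squared(dt, dx, dy, dz):
    """Scalar spacetime interval squared, equation (1.32), in units c = 1."""
    return -dt * dt + dx * dx + dy * dy + dz * dz

def classify(dt, dx, dy, dz):
    """Label an interval according to the sign of its interval squared, (1.34)."""
    s2 = interval_squared(dt, dx, dy, dz)
    if s2 < 0:
        return "timelike"
    if s2 > 0:
        return "spacelike"
    return "lightlike"

print(classify(2.0, 1.0, 0.0, 0.0))   # timelike
print(classify(1.0, 1.0, 0.0, 0.0))   # lightlike
print(classify(1.0, 2.0, 0.0, 0.0))   # spacelike
```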


Figure 1.16 Spacetime diagram illustrating timelike, lightlike, and spacelike intervals. 


1.9.2 Proper time, proper distance 


The scalar spacetime distance squared Δs² has a physical meaning. 

If an interval {Δt, Δr} is timelike, Δt > Δr, then the square root of minus the spacetime interval squared 
is the proper time Δτ along it 


$$\Delta\tau = \sqrt{-\Delta s^2} = \sqrt{\Delta t^2 - \Delta r^2} . \tag{1.35}$$


This is the time experienced by an observer moving along that interval. 

If an interval {Δt, Δr} is spacelike, Δt < Δr, then the spacetime interval equals the proper distance 
Δl along it 


$$\Delta l = \sqrt{\Delta s^2} = \sqrt{\Delta r^2 - \Delta t^2} . \tag{1.36}$$


This is the distance between two events measured by an observer for whom those events are simultaneous. 


Concept question 1.13. Proper time, proper distance. Justify the assertions (1.35) and (1.36). 


1.9.3 Minkowski metric 
It is convenient to denote an interval using an index notation, 

$$\Delta x^m = \{\Delta t, \Delta \mathbf{r}\} = \{\Delta t, \Delta x, \Delta y, \Delta z\} . \tag{1.37}$$


The indices run over m = t, x, y, z, or sometimes m = 0, 1, 2, 3. The scalar spacetime length squared Δs² of 
an interval Δx^m can be abbreviated 


$$\Delta s^2 = \eta_{mn} \, \Delta x^m \Delta x^n , \tag{1.38}$$

where η_mn is the Minkowski metric 

$$
\eta_{mn} =
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{1.39}
$$


Equation (1.38) uses the implicit summation convention, according to which paired indices, one lowered 
and one raised, are implicitly summed over. 


1.10 4-vectors 


1.10.1 Contravariant 4-vector 


Under a Lorentz transformation, a coordinate interval Δx^m transforms as 

$$\Delta x^m \rightarrow \Delta x'^m = L^m{}_n \, \Delta x^n , \tag{1.40}$$


where L^m_n denotes a Lorentz transformation. The paired indices n on the right hand side of equation (1.40), 
one lowered and one raised, are implicitly summed over. In matrix notation, L^m_n is a 4 × 4 matrix. For 
example, for a Lorentz boost by velocity v along the x-axis, L^m_n is the matrix on the right hand side of 
equation (1.5). 

In special relativity a contravariant 4-vector is defined to be a quantity 


$$a^m = \{a^t, a^x, a^y, a^z\} , \tag{1.41}$$

that transforms under Lorentz transformations like an interval Δx^m of spacetime, 

$$a^m \rightarrow a'^m = L^m{}_n \, a^n . \tag{1.42}$$


The indices run over m = t, x, y, z, or sometimes m = 0, 1, 2, 3. 


1.10.2 Covariant 4-vector 


In special and general relativity, besides the contravariant 4-vector a^m, with raised indices, it is convenient 
to introduce a covariant 4-vector a_m, with lowered indices, obtained by multiplying the contravariant 
4-vector by the metric, 


$$a_m = \eta_{mn} \, a^n . \tag{1.43}$$

With the Minkowski metric (1.39), the covariant components a_m are 

$$a_m = \{-a^t, a^x, a^y, a^z\} , \tag{1.44}$$

which differ from the contravariant components a^m only in the sign of the time component. 


The reason for introducing the two species of vector is that their implicitly summed product 


$$
\begin{aligned}
a^m a_m &= \eta_{mn} \, a^m a^n \\
&= a_t a^t + a_x a^x + a_y a^y + a_z a^z \\
&= -(a^t)^2 + (a^x)^2 + (a^y)^2 + (a^z)^2
\end{aligned}
\tag{1.45}
$$


is a Lorentz scalar, a fact you will prove in Exercise 1.14. 
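
Before proving the invariance, it can be reassuring to see it hold numerically. The following Python sketch lowers an index with the Minkowski metric, forms the product a_m b^m, and compares its value before and after a boost along x. The helper names and the sample vectors are assumptions of this illustration, and a numerical check for one boost is of course not a proof.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)   # diagonal of the Minkowski metric, signature -+++

def lower(a):
    """Covariant components a_m = eta_mn a^n, equation (1.43)."""
    return [eta * component for eta, component in zip(ETA, a)]

def dot(a, b):
    """Scalar product a_m b^m of two 4-vectors given by contravariant components."""
    return sum(am * bm for am, bm in zip(lower(a), b))

def boost_vector_x(a, v):
    """Components of 4-vector a after a boost by velocity v along x, equation (1.4)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = a
    return [g * t - g * v * x, -g * v * t + g * x, y, z]

a = [3.0, 1.0, 2.0, -1.0]
b = [1.0, 0.5, 0.0, 4.0]
before = dot(a, b)
after = dot(boost_vector_x(a, 0.6), boost_vector_x(b, 0.6))
print(before, after)   # equal up to floating-point rounding
```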

The notation may seem overly elaborate, but it proves extremely useful in general relativity, where the 
metric is more complicated than Minkowski. Further discussion of the formalism of 4-vectors is deferred to 
Chapter 2. 


Exercise 1.14. Scalar product. Suppose that a^m and b^m are two 4-vectors. Show that a_m b^m is a scalar, 
that is, it is unchanged by any Lorentz transformation. [Hint: For the Minkowski metric of special relativity, 
a_m b^m = −a^t b^t + a^x b^x + a^y b^y + a^z b^z. Show that a′_m b′^m = a_m b^m. You may assume without proof the familiar 
result that the 3D scalar product a·b = a^x b^x + a^y b^y + a^z b^z of two 3-vectors is unchanged by any spatial 
rotation, so it suffices to consider a Lorentz boost, say in the x direction.] 


Exercise 1.15. The principle of longest proper time. Consider a person whose worldline goes from 
spacetime event P₀ to spacetime event P₁ at velocity v₁ relative to some inertial frame, and then from P₁ 
to spacetime event P₂ at velocity v₂, as illustrated in Figure 1.17. Assume for simplicity that the velocities 
are both in the (positive or negative) x-direction. Show that the proper time along a straight line from P₀ 
to P₂ is always greater than or equal to the sum of the proper times along the two straight lines from P₀ 
to P₁ followed by P₁ to P₂. Hence conclude that the longest proper time between two events is a straight 
line. What does this imply about the twin paradox? [Hint: It is simplest to use rapidities α rather than 
velocities. Let the segment from P₀ to P₁ be {t₁, x₁} = τ₁{cosh α₁, sinh α₁}, and the segment from P₁ to P₂ 
be {t₂, x₂} = τ₂{cosh α₂, sinh α₂}. The segment from P₀ to P₂ is the sum of these, {t, x} = {t₁ + t₂, x₁ + x₂}. 
Show that 


$$\tau^2 - (\tau_1 + \tau_2)^2 = 4 \tau_1 \tau_2 \sinh^2\!\left( \frac{\alpha_2 - \alpha_1}{2} \right) , \tag{1.46}$$


which is a minimum for α₂ = α₁.] 

Figure 1.17 The longest proper time between P₀ and P₂ is a straight line. 


1.11 Energy-momentum 4-vector 


The foremost example of a 4-vector other than the interval Δx^m is the energy-momentum 4-vector. 

One of the great insights of modern physics is that conservation laws are associated with symmetries. 
The Principle of Special Relativity asserts that the laws of physics should take the same form at any point. 
There is no preferred origin in spacetime in special relativity. In special relativity, spacetime has translation 
symmetry with respect to both time and space. Associated with those symmetries are laws of conservation 
of energy and momentum: 


Symmetry             Conservation law

Time translation     Energy
Space translation    Momentum


Since one-dimensional time and three-dimensional space are united in special relativity, this suggests that 
the single component of energy and the three components of momentum should be combined into a 4-vector: 


$$
\left.
\begin{array}{rcl}
\text{energy} & = & \text{time component} \\
\text{momentum} & = & \text{space component}
\end{array}
\right\} \ \text{of a 4-vector.}
\tag{1.47}
$$

The Principle of Special Relativity requires that the equation of energy-momentum conservation 


$$
\begin{pmatrix} \text{energy} \\ \text{momentum} \end{pmatrix} = \text{constant}
\tag{1.48}
$$


should take the same form in any inertial frame. The equation should be Lorentz covariant, that is, the 
equation should transform like a Lorentz 4-vector. 


1.11.1 Construction of the energy-momentum 4-vector 


The energy-momentum 4-vector of a particle of mass m at position {t, r} moving at velocity v = dr/dt can 
be derived by requiring 

1. that it is a 4-vector, and 

2. that it goes over to the Newtonian limit as v → 0. 

In the Newtonian limit, the 3-momentum p equals mass m times velocity v, 

$$\mathbf{p} = m \mathbf{v} = m \frac{d\mathbf{r}}{dt} . \tag{1.49}$$

To obtain a 4-vector, two things must be done to the Newtonian momentum: 
1. replace r by a 4-vector x^m = {t, r}, and 
2. replace dt by a scalar; the only available scalar measure of time is the proper time interval dτ along the 
worldline of the particle. 
The result is the energy-momentum 4-vector p^m: 


$$
\begin{aligned}
p^m &= m \frac{dx^m}{d\tau} \\
&= m \left\{ \frac{dt}{d\tau} , \frac{d\mathbf{r}}{d\tau} \right\} \\
&= m \{\gamma, \gamma \mathbf{v}\} .
\end{aligned}
\tag{1.50}
$$


The components of the energy-momentum 4-vector are the special relativistic versions of energy E and 
momentum p, 


\[ p^k = \{E, \mathbf{p}\} = \{m\gamma, m\gamma\mathbf{v}\} . \tag{1.51} \]


1.11.2 Special relativistic energy 


From equation (1.51), the special relativistic energy E is the product of the rest mass and the Lorentz
γ-factor,

\[ E = m\gamma \quad (\text{units } c = 1) , \tag{1.52} \]

or, restoring standard units,

\[ E = mc^2\gamma . \tag{1.53} \]
For small velocities v, the Taylor expansion of the Lorentz factor γ is

\[ \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} = 1 + \frac{1}{2} \frac{v^2}{c^2} + \ldots . \tag{1.54} \]

Thus for small velocities, the special relativistic energy E Taylor expands as

\[ E = mc^2 \left( 1 + \frac{1}{2} \frac{v^2}{c^2} + \ldots \right) = mc^2 + \frac{1}{2} m v^2 + \ldots . \tag{1.55} \]

The first term, mc², is the rest-mass energy. The second term, ½mv², is the non-relativistic kinetic energy.
Higher-order terms give relativistic corrections to the kinetic energy.

Einstein did not discard the constant term, but rather interpreted it seriously as indicating that mass
contains energy, the rest-mass energy

\[ E = mc^2 , \tag{1.56} \]

perhaps the most famous equation in all of physics.
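
As a numerical illustration of the expansion (1.55), the following Python sketch compares the exact energy mc²γ with the truncated series mc² + ½mv². The function names, the 1 kg test mass, and the sample speeds are assumptions chosen for illustration; they do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def lorentz_gamma(v, c=C):
    """Lorentz factor for a speed v given in the same units as c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def exact_energy(m, v, c=C):
    """Special relativistic energy E = m c^2 gamma, equation (1.53)."""
    return m * c**2 * lorentz_gamma(v, c)

def newtonian_energy(m, v, c=C):
    """First two terms of (1.55): rest-mass energy plus (1/2) m v^2."""
    return m * c**2 + 0.5 * m * v**2

if __name__ == "__main__":
    m = 1.0  # kg, hypothetical test mass
    for beta in (0.001, 0.01, 0.1, 0.5):
        v = beta * C
        exact = exact_energy(m, v)
        approx = newtonian_energy(m, v)
        print(f"v/c = {beta}: relative error = {(exact - approx) / exact:.3e}")
```

The relative error of the truncated series should shrink rapidly as v/c decreases, since the first neglected term is of fourth order in v/c.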


1.11.3 Rest mass is a scalar 


The scalar quantity constructed from the energy-momentum 4-vector $p^k = \{E, \mathbf{p}\}$ is

\[ p_k p^k = - E^2 + p^2 = - m^2 ( \gamma^2 - \gamma^2 v^2 ) = - m^2 , \tag{1.57} \]


minus the square of the rest mass. The minus sign is associated with the choice —+++ of metric signature 
in this book. 
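
The invariance expressed by equation (1.57) can be checked numerically. The sketch below builds the 4-vector m{γ, γv} of equation (1.50) for a few velocities and evaluates its scalar square with the −+++ signature. The helper names and the sample mass and velocities are hypothetical, and units are c = 1.

```python
import math

def gamma_factor(v):
    """Lorentz factor for a 3-velocity v = (vx, vy, vz), units c = 1."""
    v2 = sum(vi * vi for vi in v)
    return 1.0 / math.sqrt(1.0 - v2)

def four_momentum(m, v):
    """Energy-momentum 4-vector m{gamma, gamma v}, equation (1.50)."""
    g = gamma_factor(v)
    return (m * g, m * g * v[0], m * g * v[1], m * g * v[2])

def minkowski_square(p):
    """Scalar square of a 4-vector with metric signature -+++."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2

if __name__ == "__main__":
    m = 2.0  # hypothetical rest mass
    for v in [(0.0, 0.0, 0.0), (0.6, 0.0, 0.0), (0.3, -0.4, 0.5)]:
        p = four_momentum(m, v)
        print(v, "E =", p[0], " scalar square =", minkowski_square(p))
```

Whatever the velocity, the scalar square should come out equal to −m² to within rounding error, while the energy p[0] grows with speed.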

Elementary texts sometimes state that special relativity implies that the mass of a particle increases as its 
velocity increases, but this is a confusing way of thinking. Mass is rest mass m, a scalar, not to be confused 
with energy. That being said, Einstein’s famous equation (1.56) does suggest that rest mass is a form of 
energy, and indeed that proves to be the case. Rest mass is routinely converted into energy in chemical or 
nuclear reactions that liberate heat. 


1.12 Photon energy-momentum 


The energy-momentum 4-vectors of photons are of special interest because when you move through a scene 
at near the speed of light, the scene appears distorted by the Lorentz transformation of the photon 4-vectors 
that you see. 
A photon has zero rest mass,

\[ m = 0 . \tag{1.58} \]

Its scalar energy-momentum squared is thus zero,

\[ p_k p^k = - E^2 + p^2 = - m^2 = 0 . \tag{1.59} \]

Consequently the 3-momentum of a photon equals its energy (in units c = 1),

\[ p \equiv |\mathbf{p}| = E . \tag{1.60} \]

The energy-momentum 4-vector of a photon therefore takes the form

\[ p^k = \{E, \mathbf{p}\} = E \{1, \mathbf{n}\} = h\nu \{1, \mathbf{n}\} , \tag{1.61} \]

where ν is the photon frequency. The photon velocity is n, a unit vector. The photon speed is one, the speed
of light.


1.12.1 Lorentz transformation of the photon energy-momentum 4-vector 


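The energy-momentum 4-vector $p^k$ of a photon follows the usual rules for 4-vectors under Lorentz transformations.
In the case that the emitter (primed frame) is moving at velocity v along the x-axis relative to the
observer (unprimed frame), the transformation is

\[ \begin{pmatrix} p'^t \\ p'^x \\ p'^y \\ p'^z \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p^t \\ p^x \\ p^y \\ p^z \end{pmatrix} = \begin{pmatrix} \gamma ( p^t - v p^x ) \\ \gamma ( p^x - v p^t ) \\ p^y \\ p^z \end{pmatrix} . \tag{1.62} \]

Equivalently

\[ h\nu' \begin{pmatrix} 1 \\ n'^x \\ n'^y \\ n'^z \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} h\nu \begin{pmatrix} 1 \\ n^x \\ n^y \\ n^z \end{pmatrix} = h\nu \begin{pmatrix} \gamma ( 1 - n^x v ) \\ \gamma ( n^x - v ) \\ n^y \\ n^z \end{pmatrix} . \tag{1.63} \]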


These mathematical relations imply the rules of 4-dimensional perspective, §1.13.2. 
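
The transformation (1.62) is straightforward to apply numerically. The following sketch boosts a photon 4-vector along the x-axis and confirms that it remains null. The function names and the sample photon are illustrative assumptions, with units c = 1.

```python
import math

def boost_x(p, v):
    """Lorentz boost of a 4-vector p = (pt, px, py, pz) into a frame moving at
    velocity v along the x-axis (units c = 1), as in equation (1.62)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    pt, px, py, pz = p
    return (g * (pt - v * px), g * (px - v * pt), py, pz)

def photon_four_momentum(energy, n):
    """Photon 4-vector E{1, n}, equation (1.61); n is normalized here."""
    norm = math.sqrt(sum(ni * ni for ni in n))
    return (energy,) + tuple(energy * ni / norm for ni in n)

def minkowski_square(p):
    """Scalar square with signature -+++ (zero for a photon)."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2

if __name__ == "__main__":
    p = photon_four_momentum(1.0, (1.0, 1.0, 0.0))  # hypothetical photon
    p_primed = boost_x(p, 0.8)
    print("boosted 4-vector:", p_primed)
    print("still null:", abs(minkowski_square(p_primed)) < 1e-12)
    print("energy ratio E'/E:", p_primed[0] / p[0])
```

The energy ratio printed at the end is the factor γ(1 − nˣv) appearing in the first row of equation (1.63).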


1.12.2 Redshift 


The wavelength λ of a photon is related to its frequency ν by

\[ \lambda = c/\nu . \tag{1.64} \]

Astronomers define the redshift z of a photon by the shift of the observed wavelength $\lambda_{\rm obs}$ compared to its
emitted wavelength $\lambda_{\rm em}$,

\[ z \equiv \frac{\lambda_{\rm obs} - \lambda_{\rm em}}{\lambda_{\rm em}} . \tag{1.65} \]

In relativity, it is often more convenient to use the redshift factor 1 + z,

\[ 1 + z \equiv \frac{\lambda_{\rm obs}}{\lambda_{\rm em}} = \frac{\nu_{\rm em}}{\nu_{\rm obs}} . \tag{1.66} \]

Sometimes it is useful to use a blueshift factor which is just the reciprocal of the redshift factor,

\[ \frac{1}{1 + z} \equiv \frac{\lambda_{\rm em}}{\lambda_{\rm obs}} = \frac{\nu_{\rm obs}}{\nu_{\rm em}} . \tag{1.67} \]


1.12.3 Special relativistic Doppler shift 


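If the emitter frame (primed) is moving with velocity v in the x-direction relative to the observer frame
(unprimed) then the emitted and observed frequencies are related by, equation (1.63),

\[ h\nu_{\rm em} = h\nu_{\rm obs} \gamma ( 1 - n^x v ) . \tag{1.68} \]

The redshift factor is therefore

\[ 1 + z = \frac{\nu_{\rm em}}{\nu_{\rm obs}} = \gamma ( 1 - n^x v ) = \gamma ( 1 - \mathbf{n} \cdot \mathbf{v} ) . \tag{1.69} \]

Equation (1.69) is the general formula for the special relativistic Doppler shift. In special cases,

\[ 1 + z = \begin{cases} \sqrt{\dfrac{1 - v}{1 + v}} & \text{velocity directly towards observer ($\mathbf{v}$ aligned with $\mathbf{n}$)} , \\[2ex] \gamma & \text{velocity in the transverse direction ($\mathbf{v} \cdot \mathbf{n} = 0$)} , \\[1ex] \sqrt{\dfrac{1 + v}{1 - v}} & \text{velocity directly away from observer ($\mathbf{v}$ anti-aligned with $\mathbf{n}$)} . \end{cases} \tag{1.70} \]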
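
The general Doppler formula (1.69) and its three special cases (1.70) can be evaluated with a few lines of Python. This is a minimal sketch in units c = 1; the function name and the sample speed v = 0.6 are assumptions made for illustration.

```python
import math

def redshift_factor(v_vec, n_vec):
    """Redshift factor 1 + z = gamma (1 - n.v), equation (1.69), units c = 1.

    v_vec: 3-velocity of the emitter relative to the observer.
    n_vec: unit 3-vector along the photon's direction of travel.
    """
    v2 = sum(vi * vi for vi in v_vec)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    n_dot_v = sum(ni * vi for ni, vi in zip(n_vec, v_vec))
    return gamma * (1.0 - n_dot_v)

if __name__ == "__main__":
    v = 0.6
    n = (1.0, 0.0, 0.0)  # photon travels in the +x direction
    cases = {
        "towards observer": (v, 0.0, 0.0),
        "transverse": (0.0, v, 0.0),
        "away from observer": (-v, 0.0, 0.0),
    }
    for label, v_vec in cases.items():
        print(f"{label}: 1 + z = {redshift_factor(v_vec, n):.4f}")
```

For v = 0.6 the three cases should reproduce the closed forms in (1.70): a blueshift (1 + z below one) for approach, the pure time-dilation factor γ for transverse motion, and a redshift for recession.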


1.13 What things look like at relativistic speeds 


1.13.1 Light travel time effects 


When you move through a scene at near the speed of light, the scene appears distorted not only by time 
dilation and Lorentz contraction, but also by differences in the light travel time from different parts of the 
scene. The effect of differential light travel times is comparable to the effects of time dilation and Lorentz 
contraction, and cannot be ignored. 
An excellent way to see the importance of light travel time is to work through the twin paradox, Exercise 1.10.
Nature provides a striking example of the importance of light travel time in the form of superluminal
(faster-than-light) jets in galaxies, the subject of Exercise 1.16.


Exercise 1.16. Superluminal jets. 

Radio observations of galaxies show in many cases twin jets emerging from the nucleus of the galaxy. The
jets are typically narrow and long, often penetrating beyond the optical extent of the galaxy. The jets are
frequently one-sided, and in some cases that are favourable to observation the jets are found to be superluminal.
A celebrated example is the giant elliptical galaxy M87 at the centre of the Local Supercluster, whose
jet is observed over a broad range of wavelengths, including optical wavelengths. Hubble Space Telescope
observations, Figure 1.18, show blobs in the M87 jet moving across the sky at approximately 6c.

1. Draw a spacetime diagram of the situation, in Earth’s frame of reference. Assume that the velocity of
the galaxy M87 relative to Earth is negligible. Let the x-axis be the direction to M87. Choose the y-axis
so the jet lies in the x–y-plane. Let the jet be moving at velocity v at angle θ away from the direction
towards us on Earth, so that its spatial velocity relative to Earth is $\mathbf{v} = \{v_x, v_y\} = \{-v\cos\theta, v\sin\theta\}$.

2. In Earth coordinates {t, x, y}, the jet moves in time t a distance $\mathbf{l} = \{l_x, l_y\} = \mathbf{v} t$. Argue that during an
Earth time t, the jet has moved a distance $l_x$ nearer to the Earth (the distances $l_x$ and $l_y$ are both tiny
compared to the distance to M87), so the apparent time as seen through a telescope is not t, but rather
t diminished by the light travel time $l_x$ (units c = 1). Hence conclude that the apparent transverse
velocity on the sky is

\[ v_{\rm app} = \frac{v \sin\theta}{1 - v \cos\theta} . \tag{1.71} \]

3. Sketch the apparent velocity $v_{\rm app}$ as a function of θ for some given velocity v. In terms of v and the
Lorentz factor γ, what are the values of θ and of $v_{\rm app}$ at the point where $v_{\rm app}$ reaches its maximum?
What can you conclude about the jet in M87?

4. What is the expected redshift 1 + z, or equivalently blueshift 1/(1 + z), of the jet as a function of v and
θ? By expressing v in terms of $v_{\rm app}$ and θ using equation (1.71), show that the blueshift factor is

\[ \frac{1}{1 + z} = \sqrt{ 1 + 2 v_{\rm app} \cot\theta - v_{\rm app}^2 } . \tag{1.72} \]

[Hint: Remember to use the correct redshift formula, equation (1.69).]

5. In terms of $v_{\rm app}$, at what value of θ is the blueshift (i) infinite, or (ii) zero? What are these angles in
the case of M87? If the redshift of the jet were measurable, could you deduce the velocity v and opening
angle θ? Unfortunately the redshift of a superluminal jet is not usually observable, because the emission
is a continuum of synchrotron emission over a broad range of wavelengths, with no sharp atomic or ionic
lines to provide a redshift.

6. Why is the opposing jet not visible?

[Figure 1.18: image of M87 (left panel) and a sequence of Hubble images from 1994 to 1998 (right panel), with tracked feature speeds labelled 6.0, 5.5, 6.1, 6.0.]

Figure 1.18 The left panel shows an image of the galaxy M87 taken with the Advanced Camera for Surveys on the
Hubble Space Telescope. A jet, bluish compared to the starry background of the galaxy, emerges from the galaxy’s
central nucleus. Radio observations, not shown here, reveal that there is a second jet in the opposite direction. Credit:
STScI/AURA. The right panel is a sequence of Hubble images showing blobs in the jet moving superluminally, at
approximately 6c. The slanting lines track the moving features, with speeds given in units of c. The upper strip shows
where in the jet the blobs were located. Credit: John Biretta, STScI.
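
Equation (1.71) is easy to explore numerically before sketching it by hand. The snippet below simply tabulates the apparent transverse velocity for a few angles; the function name and the jet speed v = 0.99 are hypothetical choices made for illustration, not values from the exercise.

```python
import math

def apparent_transverse_velocity(v, theta):
    """Equation (1.71): v_app = v sin(theta) / (1 - v cos(theta)), units c = 1.

    v: true speed of the jet; theta: angle (radians) between the jet velocity
    and the direction from the source towards the observer.
    """
    return v * math.sin(theta) / (1.0 - v * math.cos(theta))

if __name__ == "__main__":
    v = 0.99  # hypothetical jet speed, chosen only for illustration
    for degrees in (5, 10, 20, 45, 90):
        v_app = apparent_transverse_velocity(v, math.radians(degrees))
        print(f"theta = {degrees:3d} deg: v_app = {v_app:.3f} c")
```

Tabulating a finer grid of angles for several values of v is one way to check an analytic answer to part 3.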


1.13.2 The rules of 4-dimensional perspective 


The distortion of a scene when you move through it at near the speed of light can be calculated most directly 
from the Lorentz transformation of the energy-momentum 4-vectors of the photons that you see. The result 
is what I call the “Rules of 4-dimensional perspective.” 

Figure 1.19 illustrates the rules of 4-dimensional perspective, also called “special relativistic beaming,” 
which describe how a scene appears when you move through it at near light speed. 

On the left, you are at rest relative to the scene. Imagine painting the scene on a celestial sphere around 
you. The arrows represent the directions of light rays (photons) from the scene on the celestial sphere to you 
at the center. 

On the right, you are moving to the right through the scene, at 0.8 times the speed of light. The celestial
sphere is stretched along the direction of your motion by the Lorentz gamma-factor $\gamma = 1/\sqrt{1 - 0.8^2} = 5/3$
into a celestial ellipsoid. You, the observer, are not at the centre of the ellipsoid, but rather at one of its foci
(the left one, if you are moving to the right). The focus of the celestial ellipsoid, where you the observer are, is
displaced from centre by γv = 4/3. The scene appears relativistically aberrated, which is to say concentrated
ahead of you, and expanded behind you.

The lengths of the arrows are proportional to the energies, or frequencies, of the photons that you see.
When you are moving through the scene at near light speed, the arrows ahead of you, in your direction
of motion, are longer than at rest, so you see the photons blue-shifted, increased in energy, increased in
frequency. Conversely, the arrows behind you are shorter than at rest, so you see the photons red-shifted,
decreased in energy, decreased in frequency. Since photons are good clocks, the change in photon frequency
also tells you how fast or slow clocks attached to the scene appear to you to run.

Figure 1.19 The rules of 4-dimensional perspective. In special relativity, the scene seen by an observer moving through
the scene (right) is relativistically beamed compared to the scene seen by an observer at rest relative to the scene
(left). On the left, the observer at the center of the circle is at rest relative to the surrounding scene. On the right,
the observer is moving to the right through the same scene at v = 0.8 times the speed of light. The arrowed lines
represent energy-momenta of photons. The length of an arrowed line is proportional to the perceived energy of the
photon. The scene ahead of the moving observer appears concentrated, blueshifted, and farther away, while the scene
behind appears expanded, redshifted, and closer.

This table summarizes the four effects of relativistic beaming on the appearance of a scene ahead of you 
and behind you as you move through it at near the speed of light: 


Effect        Ahead          Behind

Aberration    Concentrated   Expanded
Colour        Blueshifted    Redshifted
Brightness    Brighter       Dimmer
Time          Speeded up     Slowed down
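
The celestial-ellipsoid picture described above for v = 0.8 can be checked numerically. The sketch below boosts incoming photon 4-vectors of unit energy, draws each arrow from the observer along the apparent direction with length equal to the boosted energy, and tests whether the arrow tips lie on an ellipse with semi-axes γ and 1 whose centre is a distance γv ahead of the observer. The function names and the sampling of angles are assumptions made for illustration, and the geometry is restricted to the x–y-plane.

```python
import math

def boost_x(p, v):
    """Boost a 4-vector p = (pt, px, py, pz) into a frame moving at velocity v
    along the x-axis (units c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    pt, px, py, pz = p
    return (g * (pt - v * px), g * (px - v * pt), py, pz)

def arrow_tip(theta, v):
    """Tip of the arrow drawn from the moving observer towards a point of the
    scene seen at angle theta (radians) from the x-direction in the rest frame.

    The incoming photon has unit energy at rest, with 4-vector {1, -n}; the
    arrow points along the apparent direction n' with length E'/E.
    """
    p = (1.0, -math.cos(theta), -math.sin(theta), 0.0)
    _, px, py, _ = boost_x(p, v)
    return (-px, -py)

if __name__ == "__main__":
    v = 0.8
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    print("gamma =", gamma, " gamma*v =", gamma * v)
    for degrees in range(0, 181, 30):
        x, y = arrow_tip(math.radians(degrees), v)
        # Ellipse with semi-axes gamma and 1, centred a distance gamma*v ahead.
        residual = ((x - gamma * v) / gamma) ** 2 + y ** 2 - 1.0
        print(f"theta = {degrees:3d} deg: tip = ({x:+.3f}, {y:+.3f}), "
              f"residual = {residual:+.1e}")
```

The residuals should vanish to within rounding error, and the arrow directly ahead should be longer than the arrow directly behind, consistent with the table.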


Mathematical details of the rules of 4-dimensional perspective are explored in the next several Exercises. 


Exercise 1.17. The rules of 4-dimensional perspective.

1. In terms of the photon energy-momentum 4-vector $p^k$ in an unprimed frame, what is the photon energy-momentum
4-vector $p'^k$ in a primed frame of reference moving at speed v in the x-direction relative to
the unprimed frame? Argue that the photon 4-vectors in the unprimed and primed frames are related
geometrically by the “celestial ellipsoid” transformation illustrated in Figure 1.19. Bear in mind that the
photon vector is pointed towards the observer.

2. Aberration. The photon 4-vector seen by an observer is the null vector $p^k = E\{1, -\mathbf{n}\}$, where E is the
photon energy, and n is a unit 3-vector in the direction away from the observer, the minus sign taking
into account the fact that the photon vector is pointed towards the observer. An object appears in the
unprimed frame at angle θ to the x-direction and in the primed frame at angle θ′ to the x-direction.
Show that μ′ ≡ cos θ′ and μ ≡ cos θ are related by

\[ \mu' = \frac{\mu + v}{1 + v\mu} . \tag{1.73} \]

3. Redshift. By what factor a = E′/E is the observed photon frequency from the object changed? Express
your answer as a function of γ, v, and μ.

4. Brightness. Photons at frequency E in the unprimed frame appear at frequency E′ in the primed
frame. Argue that the brightness F(E), the number of photons per unit time per unit solid angle per
log interval of frequency (about E in the unprimed frame, and E′ in the primed frame),

\[ F(E) \equiv \frac{dN(E)}{dt \, do \, d\ln E} , \tag{1.74} \]

goes as

\[ \frac{F'(E')}{F(E)} = \frac{E' \, d\mu}{E \, d\mu'} = a^3 . \tag{1.75} \]

[Hint: Photon number conservation implies that dN′(E′) = dN(E).]

5. Time. By what factor does the rate at which a clock ticks appear to change?


Exercise 1.18. Circles on the sky. Show that a circle on the sky Lorentz transforms to a circle on the sky.
Let the primed frame be moving at velocity v in the x-direction, let θ be the angle between the x-direction
and the direction m to the center of the circle, and let α be the angle between the circle axis m and the
photon direction n. Show that the angle θ′ in the primed frame is given by

\[ \tan\theta' = \frac{\sin\theta}{\gamma ( v \cos\alpha + \cos\theta )} , \tag{1.76} \]

and that the angular radius α′ in the primed frame is given by

\[ \tan\alpha' = \frac{\sin\alpha}{\gamma ( \cos\alpha + v \cos\theta )} . \tag{1.77} \]

[This result was first obtained by Penrose (1959) and Terrell (1959), prior to which it had been widely
thought that circles would appear Lorentz-contracted and therefore squashed. The following simple proof
was told to me by Engelbert Schucking (NYU). The set of null 4-vectors $p^k = E\{1, -\mathbf{n}\}$ on the circle
satisfies the Lorentz-invariant equation $x_k p^k = 0$, where $x^k = |\mathbf{x}| \{-\cos\alpha, \mathbf{m}\}$ is a spacelike 4-vector whose
spatial components |x|m point to the center of the circle. Note that |x| is a magnitude of a 3-vector, not a
Lorentz-invariant scalar.]
Exercise 1.19. Lorentz transformation preserves angles on the sky. From equation (1.73), show
that the angular metric $do^2 = d\theta^2 + \sin^2\!\theta \, d\phi^2$ on the sky Lorentz transforms as

\[ do'^2 = \frac{1 - v^2}{(1 + v\mu)^2} \, do^2 . \tag{1.78} \]

This kind of transformation, which multiplies the metric by an overall factor, called a conformal factor, is
called a conformal transformation. The conformal transformation (1.78) of the angular metric shrinks
and expands patches on the sky while preserving their shapes, that is, while preserving angles between lines.


Exercise 1.20. The aberration of starlight. The aberration of starlight was discovered by James Bradley 
(1728) through precision measurements of the position of γ Draconis observed from London with a specially
commissioned “zenith sector.” Stellar aberration results from the annual motion of the Earth about the 
Sun. Calculate the size of the effect, in arcseconds. Are special relativistic effects important? How does the 
observational signature of stellar aberration differ from that of stellar parallax? 


Concept question 1.21. Apparent (affine) distance. The rules of 4-dimensional perspective illustrated 
in Figure 1.19 suggest that when you move through a scene at near lightspeed, the scene ahead looks farther 
away (and not Lorentz-contracted at all). Is the scene really farther away, or is it just an illusion? Answer. 
What is reality? In a deep sense, reality is what can be observed (by something, not necessarily a person). 
So yes, the scene ahead really is farther away. Let the observer take a tape measure that is at rest relative 
to the observer, and lay it out to the emitter. The laying has to be done in advance, because the emitter 
is moving. Observers who move at different velocities lay out tapes that move at different velocities. The 
observer moving faster toward the emitter indeed sees the emitter farther away, according to their tape 
measure. The distance measured in this fashion is called the affine distance, §2.18, a measure of distance 
along the past lightcone of the observer. 


1.14 Occupation number, phase-space volume, intensity, and flux 


Exercise 1.17 asked you to discover how the appearance of an emitter changes when the observer boosts 
into a different frame. The change (1.75) in brightness can be derived at a more fundamental level from the 
concepts of occupation number and phase-space volume. 

The intensity of light can be described by the number dN of photons in a 3-volume element d³r of space
(as measured by an observer in their own rest frame) with momenta in a 3-volume element d³p of momentum
(again as measured by an observer). The 6-dimensional product d³r d³p of spatial and momentum 3-volumes,
called the phase-space volume, is Lorentz-invariant, unchanged by a boost or rotation of the observer’s frame
(see §10.26.1 for a proof). Indeed, as shown in §4.22, the phase-space volume element d³r d³p is invariant
under a wide range of transformations (called canonical transformations, §4.17).

In quantum mechanics, the phase volume divided by (2πħ)³ (which is the same as h³; but in quantum
mechanics ħ is a more natural unit; for example, angular momentum is quantized in units of ħ, and spin in
units of ½ħ) counts the number of free states of particles, here photons. Particles typically have spin, and an
associated discrete number of distinct spin states. Photons have spin 1, and two spin states. The occupation
number f(t, r, p) is defined to be the number of photons per state at time t and spatial position r with
momenta p. The number dN of photons is the product of the occupation number f, the number g of spin
states, and the number d³r d³p/(2πħ)³ of free quantum states,

\[ dN(t, \mathbf{r}, \mathbf{p}) = f(t, \mathbf{r}, \mathbf{p}) \, g \, \frac{d^3r \, d^3p}{(2\pi\hbar)^3} . \tag{1.79} \]

The number dN of photons, the occupation number f, the number g of spin states, and the phase volume
d³r d³p/(2πħ)³ are all Lorentz invariant.

Astronomers conventionally define the intensity $I_\nu$ of light observed from an object to be the energy
received per unit time t per unit area A (of the telescope mirror or lens) per unit solid angle o per unit
frequency ν. Often intensity is quoted per unit wavelength λ or per unit energy E instead of per unit
frequency ν, and the intensity is subscripted accordingly, $I_\lambda$ or $I_E$. The intensity measures are related by
$I_\nu \, d\nu = I_\lambda \, d\lambda = I_E \, dE$ with λ = c/ν and E = 2πħν. The intensity $I_E$ per unit energy is related to the
occupation number f by

\[ I_E \equiv \frac{E \, dN}{dt \, dA \, do \, dE} = \frac{c \, f \, g \, p^3}{(2\pi\hbar)^3} , \tag{1.80} \]

the spatial and momentum 3-volumes being d³r = c dt dA and d³p = p² dp do. The p³ factor in equation (1.80)
reproduces the brightness factor a³ = (E′/E)³ in equation (1.75).

Stars typically appear to astronomers as point sources. Astronomers define the flux $F_\nu$ from a source to be
the intensity $I_\nu$ integrated over the solid angle of the source. Again, flux is often quoted per unit wavelength
λ or per unit energy E, and subscripted accordingly, $F_\lambda$ or $F_E$.


Concept question 1.22. Brightness of a star. How does the flux from a star change when an observer 
boosts into another frame? The flux that an observer, or a telescope, actually sees depends on the spectrum 
of the light incident on the observer (the flux as a function of photon energy) and on the sensitivity of the 
detector as a function of photon energy. But imagine a perfect detector that sees all photons incident on it, 
of any photon energy. 

Solution. The flux $F_E$ in an interval dE of energy is

\[ F_E \equiv \frac{E \, dN}{dt \, dA \, dE} = \frac{c \, g \, p^3}{(2\pi\hbar)^3} \int f \, do . \tag{1.81} \]

Since the solid angle varies as $do \propto p^{-2}$, while the occupation number f is Lorentz invariant, and the photon
energy and momentum are related by E = pc, the flux $F_E$ varies as

\[ F_E \propto E , \tag{1.82} \]

that is, the flux is proportional to the blueshift factor. Physically, the observed number of photons per unit
time increases in proportion to the photon frequency. The flux integrated over d ln E counts the total number
of photons observed per unit time, which again increases in proportion to the blueshift factor,

\[ \int F_E \, d\ln E \propto E . \tag{1.83} \]

The flux integrated over dE counts the total energy observed per unit time, which increases as the square
of the blueshift factor,

\[ \int F_E \, dE \propto E^2 . \tag{1.84} \]


1.15 How to program Lorentz transformations on a computer 


3D gaming programmers are familiar with the fact that the best way to program spatial rotations on a 
computer is with quaternions. Compared to standard rotation matrices, quaternions offer increased speed 
and require less storage, and their algebraic properties simplify interpolation and splining. 

Section 1.8 showed that a Lorentz boost is mathematically equivalent to a rotation by an imaginary
angle. This suggests that Lorentz transformations might be treated as complexified spatial rotations, which
proves to be true. Indeed, the best way to program Lorentz transformations on a computer is with complex
quaternions, §14.5.
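
The statement that a boost is a rotation by an imaginary angle can be illustrated in one space dimension without any quaternion machinery. The following sketch is not the complex quaternion method of §14.5; it merely rotates the pair (it, x) through the imaginary angle iα, where α is the rapidity, and compares the result with the standard boost formula. All function names are hypothetical, and units are c = 1.

```python
import cmath
import math

def rotate(w, x, angle):
    """Ordinary 2D rotation of the pair (w, x) by a possibly complex angle."""
    c, s = cmath.cos(angle), cmath.sin(angle)
    return (w * c - x * s, w * s + x * c)

def boost_via_imaginary_rotation(t, x, rapidity):
    """Boost (t, x) by rotating (i t, x) through the imaginary angle i*rapidity."""
    w_rot, x_rot = rotate(1j * t, x, 1j * rapidity)
    # Undo the factor of i on the time component; both results are real.
    return ((w_rot / 1j).real, x_rot.real)

def boost_direct(t, x, v):
    """Standard Lorentz boost along x with velocity v (units c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g * (t - v * x), g * (x - v * t))

if __name__ == "__main__":
    v = 0.8
    rapidity = math.atanh(v)  # v = tanh(rapidity)
    print(boost_via_imaginary_rotation(2.0, 1.0, rapidity))
    print(boost_direct(2.0, 1.0, v))
```

The two printed pairs should agree to within rounding error, because cos(iα) = cosh α and sin(iα) = i sinh α turn the rotation matrix into the boost matrix.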


Figure 1.20 Tachyon spacetime diagram. 


Exercise 1.23. Tachyons. A tachyon is a hypothetical particle that moves faster than the speed of light. The 
purpose of this problem is to discover that the existence of tachyons would imply a violation of causality. 
1. On a spacetime diagram such as that in Figure 1.20, show how a tachyon emitted by Vermilion at speed
v > 1 can appear to go backwards in time, with v < −1, in another frame, that of Cerulean.

2. What is the smallest velocity that Cerulean must be moving relative to Vermilion in order that the
tachyon appears to go backwards in Cerulean’s time?

3. Suppose that Cerulean returns the tachyonic signal at the same speed v > 1 relative to his own frame.
Show on the spacetime diagram how Cerulean’s tachyonic signal can reach Vermilion before she sent
out the original tachyon.

4. What is the smallest velocity that Cerulean must be moving relative to Vermilion in order that his
tachyon reach Vermilion before she sent out her tachyon?

5. Why is the situation problematic?

6. If it is possible for Vermilion to send out a particle with v > 1, do you think it should also be possible
for her to send out a particle backward in time, with v < −1, from her point of view? Explain how she
might do this, or not, as the case may be.
Concept Questions


1. What assumption of general relativity makes it possible to introduce a coordinate system?

2. Is the speed of light a universal constant in general relativity? If so, in what sense?

3. What does “locally inertial” mean? How local is local?

4. Why is spacetime locally inertial?

5. What assumption of general relativity makes it possible to introduce clocks and rulers?

6. Consider two observers at the same point and with the same instantaneous velocity, but one is accelerating
and the other is in free-fall. What is the relation between the proper time or proper distance along
an infinitesimal interval measured by the two observers? What assumption of general relativity implies
this?

7. Does Einstein’s principle of equivalence imply that two unequal masses will fall at the same rate in a
gravitational field? Explain.

8. In what respects is Einstein’s principle of equivalence (gravity is equivalent to acceleration) stronger
than the weak principle of equivalence (gravitating mass equals inertial mass)?

9. Standing on the surface of the Earth, you hold an object of negative mass in your hand, and drop it.
According to the principle of equivalence, does the negative mass fall up or down?

10. Same as the previous question, but what does Newtonian gravity predict?

11. You have a box of negative mass particles, and you remove energy from it. Do the particles move faster
or slower? Does the entropy of the box increase or decrease? Does the pressure exerted by the particles
on the walls of the box increase or decrease?

12. You shine two light beams along identical directions in a gravitational field. The two light beams are
identical in every way except that they have two different frequencies. Does the equivalence principle
imply that the interference pattern produced by each of the beams individually is the same?

13. What is a “straight line,” according to the principle of equivalence?

14. If all objects move on straight lines, how is it that when, standing on the surface of the Earth, you throw
two objects in the same direction but with different velocities, they follow two different trajectories?

15. In relativity, what is the generalization of the “shortest distance between two points”?

16. What kinds of general coordinate transformations are allowed in general relativity?

17. In general relativity, what is a scalar? A 4-vector? A tensor? Which of the following is a scalar/vector/
tensor/none-of-the-above? (a) a set of coordinates $x^\mu$; (b) a coordinate interval $dx^\mu$; (c) proper time
τ?

18. What does general covariance mean?

19. What does parallel transport mean?

20. Why is it important to define covariant derivatives that behave like tensors?

21. Is covariant differentiation a derivation? That is, is covariant differentiation a linear operation, and does
it obey the Leibniz rule for the derivative of a product?

22. What is the covariant derivative of the metric tensor? Explain.

23. What does a connection coefficient $\Gamma^\kappa_{\mu\nu}$ mean physically? Is it a tensor? Why, or why not?

24. An astronaut is in free-fall in orbit around the Earth. Can the astronaut detect that there is a gravitational
field?

25. Can a gravitational field exist in flat space?

26. How can you tell whether a given metric is equivalent to the Minkowski metric of flat space?

27. How many degrees of freedom does the metric have? How many of these degrees of freedom can be
removed by arbitrary transformations of the spacetime coordinates, and therefore how many physical
degrees of freedom are there in spacetime?

28. If you insist that the spacetime is spherical, how many physical degrees of freedom are there in the
spacetime?

29. If you insist that the spacetime is spatially homogeneous and isotropic (the cosmological principle), how
many physical degrees of freedom are there in the spacetime?

30. In general relativity, you are free to prescribe any spacetime (any metric) you like, including metrics
with wormholes and metrics that connect the future to the past so as to violate causality. True or false?

31. If it is true that in general relativity you can prescribe any metric you like, then why aren’t you bumping
into wormholes and causality violations all the time?

32. How much mass does it take to curve space significantly (significantly meaning by of order unity)?

33. What is the relation between the energy-momentum 4-vector of a particle and the energy-momentum
tensor?

34. It is straightforward to go from a prescribed metric to the energy-momentum tensor. True or false?

35. It is straightforward to go from a prescribed energy-momentum tensor to the metric. True or false?

36. Does the principle of equivalence imply Einstein’s equations?

37. What do Einstein’s equations mean physically?

38. What does the Riemann curvature tensor $R_{\kappa\lambda\mu\nu}$ mean physically? Is it a tensor?

39. The Riemann tensor splits into compressive (Ricci) and tidal (Weyl) parts. What do these parts mean,
physically?

40. Einstein’s equations imply conservation of energy-momentum, but what does that mean?

41. Do Einstein’s equations describe gravitational waves?

42. Do photons (massless particles) gravitate?

43. How do different forms of mass-energy gravitate?

44. How does negative mass gravitate?


What’s important? 


1. The postulates of general relativity. How do the various postulates imply the mathematical structure of
general relativity?

2. The road from spacetime curvature to energy-momentum:
metric $g_{\mu\nu}$
→ connection coefficients $\Gamma^\kappa_{\mu\nu}$
→ Riemann curvature tensor $R_{\kappa\lambda\mu\nu}$
→ Ricci tensor $R_{\kappa\mu}$ and scalar $R$
→ Einstein tensor $G_{\kappa\mu} = R_{\kappa\mu} - \frac{1}{2} g_{\kappa\mu} R$
→ energy-momentum tensor $T_{\kappa\mu}$

3. 4-velocity and 4-momentum. Geodesic equation.

4. Bianchi identities guarantee conservation of energy-momentum.
Fundamentals of General Relativity


As of writing (2013), general relativity continues to beat all-comers in the Darwinian struggle to be top theory 
of gravity and spacetime (Will, 2005). Despite its success, most physicists accept that general relativity cannot 
ultimately be correct, because of the difficulty in reconciling it with that other pillar of physics, quantum 
mechanics. The other three known forces of Nature, the electromagnetic, weak, and colour (strong) forces, 
are described by renormalizable quantum field theories, the so-called Standard Model of Physics, that agree 
extraordinarily well with experiment, and whose predictions have continued to be confirmed by ever more 
precise measurements. Attempts to quantize general relativity in a similar fashion fail. The attempt to unite 
general relativity and quantum mechanics continues to exercise some of the brightest minds in physics. 

One place where general relativity predicts its own demise is at singularities inside black holes. What 
physics replaces general relativity at singularities? This is a deep question, providing one of the motivations 
for this book’s emphasis on black hole interiors. 

The aim of this Chapter is to give a condensed introduction to the fundamentals of general relativity, using 
the traditional coordinate-based approach to general relativity. The approach is neither the most insightful 
nor the most powerful, but it is the fastest route to connecting the metric to the energy-momentum content 
of spacetime. The Chapter does not attempt to convey a deep conceptual understanding, which I think is 
difficult to gain from the mathematics by itself. Later Chapters, starting with Chapter 7 on the Schwarzschild 
geometry, present visualizations intended to aid conceptual understanding. 

One of the drawbacks of the coordinate approach is that it works with frames that are aligned at each point
with the tangent vectors $e_\mu$ to the coordinates at that point. General relativity postulates the existence of
locally inertial frames, so the coordinates at any point can always be arranged such that the tangent vectors
at that one point are orthonormal, and the spacetime is locally flat (Minkowski) about that point. But
in a curved spacetime it is impossible to arrange the coordinate tangent vectors $e_\mu$ to be orthonormal
everywhere. Thus the coordinate approach inevitably presents quantities in a frame that is skewed compared 
to the natural, orthonormal frame. It is like looking at a scene with your eyes crossed. The problem is not so 
bad if the spacetime is empty of energy-momentum, as in the Schwarzschild and Kerr geometries for ideal 
spherical and rotating black holes, but it becomes a significant handicap in realistic spacetimes that contain 
energy-momentum. 

The coordinate approach is adequate to deal with ideal black holes, Chapters 6 to 9, and with the
Friedmann-Lemaître-Robertson-Walker spacetime of a homogeneous, isotropic cosmology, Chapter 10. After that, the
book restarts essentially from scratch. Chapter 11 introduces the tetrad formalism, the springboard for 
further explorations of gravity, black holes, and cosmology. 

The convention in this book is that greek (brown) dummy indices label curved spacetime coordinates, 
while latin (black) dummy indices label locally inertial (more generally, tetrad) coordinates. 


2.1 Motivation 


Special relativity was unsatisfactory almost from the outset. Einstein had conceived special relativity by 
abolishing the aether. Yet for something that had no absolute substance, the spacetime of special relativity 
had strikingly absolute properties: in special relativity, two particles on parallel trajectories would remain 
parallel for ever, just as in Euclidean geometry. 

Moreover whereas special relativity neatly accommodated the electromagnetic force, which propagated 
at the speed of light, it did not accommodate the other force known at the beginning of the 20th century, 
gravity. Plainly Newton’s theory of gravity could not be correct, since it posited instantaneous transmission 
of the gravitational force, whereas special relativity seemed to preclude anything from moving faster than 
light, Exercise 1.23. You might think that gravity, an inverse square law like electromagnetism, might satisfy 
a similar set of equations, but this is not so. Whereas an electromagnetic wave carries no electric charge, and 
therefore does not interact with itself, any wave of gravity must carry energy, and therefore must interact 
with itself. This proves to be a considerable complication. 

A partial solution, the principle of equivalence of gravity and acceleration, occurred to Einstein while 
working on an invited review on special relativity (Einstein, 1907). Einstein realised that “if a person falls 
freely, he will not feel his own weight,” an idea that Einstein would later refer to as “the happiest thought of 
my life.” The principle of equivalence meant that gravity could be reinterpreted as a curvature of spacetime. 
In this picture, the trajectories of two freely-falling particles that pass either side of a massive body are caused
to converge not because of a gravitational force, but rather because the massive body curves spacetime, and
the particles follow straight lines in the curved spacetime, Figure 2.1.

Figure 2.1 Particles initially on parallel trajectories passing either side of the Earth are caused to converge by the
Earth’s gravity. According to Einstein’s principle of equivalence, the situation is equivalent to one where the particles
are moving in straight lines in local free-fall frames. This allows the gravitational force to be reinterpreted as being
produced by a curvature of spacetime induced by the presence of the Earth.

Einstein’s principle of equivalence is only half the story. The principle of equivalence determines how 
particles must move in a spacetime of given curvature, but it does not determine how spacetime is itself 
curved by mass. That was a much more difficult problem, which Einstein took several more years to crack. 
The eventual solution was Einstein’s equations, the final version of which he set out in a presentation to the 
Prussian Academy at the end of November 1915 (Einstein, 1915). 

Contemporaneously with Einstein’s discovery, David Hilbert derived Einstein’s equations independently 
and elegantly from an action principle (Hilbert, 1915). In the present Chapter, Einstein’s equations are simply 
postulated, since their real justification is that they reproduce experiment and observation. A derivation of 
Einstein’s equations from the Hilbert action is deferred to Chapter 16. 


2.2 The postulates of General Relativity 


General relativity follows from three postulates: 
1. Spacetime is a 4-dimensional differentiable manifold; 
2. Einstein’s principle of equivalence; 
3. Einstein’s equations. 


2.2.1 Spacetime is a 4-dimensional differentiable manifold 


A 4-dimensional manifold is defined mathematically to be a topological space that is locally homeomorphic 
to Euclidean 4-space $\mathbb{R}^4$. A homeomorphism is a continuous map that has a continuous inverse.

The postulate that spacetime is a 4-dimensional manifold means that it is possible to set up a coordinate 
system, possibly in patches, called charts, 


$$ x^\mu = \{ x^0, x^1, x^2, x^3 \} \tag{2.1} $$


such that each point of a chart of the spacetime has a unique coordinate. 

It is not always possible to cover a manifold with a single chart, that is, with a coordinate system such 
that every point of spacetime has a unique coordinate. A simple example of a 2-dimensional manifold that 
cannot be covered with a single chart is the 2-sphere $S^2$, the 2-dimensional surface of a 3-dimensional sphere,
as illustrated in Figure 2.2. Inevitably, lines of constant coordinate must cross somewhere on the 2-sphere. 
At least two charts are required to cover a 2-sphere. 

When more than one chart is necessary, neighbouring charts are required to overlap, in order that the 
structure of the manifold be consistent across the overlap. General relativity postulates that the mapping 
between the coordinates of overlapping charts be at least doubly differentiable. A manifold subject to this
property is called differentiable. 

Figure 2.2 The 2-sphere is a 2-manifold, a topological space that is locally homeomorphic to Euclidean 2-space $\mathbb{R}^2$.
Any attempt to cover the surface of a 2-sphere with a single chart, that is, with coordinates $x$ and $y$ such that each
point on the sphere is specified by a unique coordinate $\{x, y\}$, fails at at least one point. In the left panel, a coordinate
grid draped over the sphere fails at one point, the south pole, where coordinate lines cross. At least two charts are
required to cover the surface of a 2-sphere, as illustrated in the middle panel, where one chart covers the north pole,
the other the south pole. Where the two charts overlap, the two sets of coordinates are related differentiably. The right
panel shows standard polar coordinates $\theta$, $\phi$ on the 2-sphere. The polar coordinatization fails at the north and south
poles, where lines of longitude cross, the azimuthal angle $\phi$ is not unique, and a person passing smoothly through the
pole would see the azimuthal angle jump by $\pi$. Such misbehaving points, called coordinate singularities, are however
innocuous: they can be removed by cutting out a patch around the coordinate singularity, and pasting on a separate
chart.

In practice one often uses coordinate systems that misbehave at some points, but in an innocuous fashion.
The 2-sphere again provides a classic example, where the standard choice of polar coordinates $x^\mu = \{\theta, \phi\}$
misbehaves at the north and south poles, Figure 2.2. A person passing smoothly through a pole sees the
azimuthal coordinate jump discontinuously by $\pi$. This is called a coordinate singularity. It is innocuous
because it can be removed by excising a patch around the pole, and pasting on a separate chart.
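The jump in the azimuthal coordinate can be seen in a short numerical sketch. The example below is an illustration only: the function names and the choice of great circle (through the north pole, in the plane of azimuth 0.3 radians) are assumptions made here. The path is smooth in the Cartesian embedding coordinates, yet the azimuthal angle changes by $\pi$ across the pole.

```python
import math

def polar_angles(x, y, z):
    """Return (theta, phi) of a point on the unit 2-sphere given Cartesian x, y, z."""
    theta = math.acos(max(-1.0, min(1.0, z)))  # polar angle measured from the north pole
    phi = math.atan2(y, x)                     # azimuthal angle in (-pi, pi]
    return theta, phi

def great_circle_point(alpha, phi0=0.3):
    """Point on a great circle through the north pole, in the plane of azimuth phi0.

    alpha is the signed arc length from the pole: negative before, positive after.
    """
    return (math.sin(alpha) * math.cos(phi0),
            math.sin(alpha) * math.sin(phi0),
            math.cos(alpha))

# Sample the path just before and just after the pole.
_, phi_before = polar_angles(*great_circle_point(-0.01))
_, phi_after = polar_angles(*great_circle_point(+0.01))

# The Cartesian position changes smoothly, but phi jumps by pi.
jump = abs(phi_after - phi_before)
print(phi_before, phi_after, jump)
```

The jump is a property of the coordinates, not of the sphere: a second chart centred on the pole would describe the same path without any discontinuity.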


2.2.2 Principle of equivalence 


The weak principle of equivalence states that: “Gravitating mass equals inertial mass.” General relativity 
satisfies the weak principle of equivalence, but then so also does Newtonian gravity. 

Einstein’s principle of equivalence is actually two separate statements: “The laws of physics in a
gravitating frame are equivalent to those in an accelerating frame,” and “The laws of physics in a
non-accelerating, or free-fall, frame are locally those of special relativity.”

Einstein’s principle of equivalence implies that it is possible to remove the effects of gravity locally by going 
into a non-accelerating, or free-fall, frame. The structure of spacetime in a non-accelerating, or free-fall, frame 
is locally inertial, with the local structure of Minkowski space. By locally inertial is meant that at each point 
of spacetime it is possible to choose coordinates such that (a) the metric at that point is Minkowski, and (b) 
the first derivatives of the metric are all zero¹. In other words, Einstein’s principle of equivalence asserts the
existence of locally inertial frames. 

¹ Actually, general relativity goes a step further. The metric is the scalar product of coordinate tangent axes, equation (2.26).
General relativity postulates, §2.10.1, that in a locally inertial frame the first derivatives not only of the metric, but also of
the tangent axes themselves, vanish. See also Concept question 2.5.

Since special relativity is a metric theory, and the principle of equivalence asserts that general relativity
looks locally like special relativity, general relativity inherits from special relativity the property of being a 
metric theory. A notable consequence is that the proper times and distances measured by an accelerating 
observer are the same as those measured by a freely-falling observer at the same point and with the same 
instantaneous velocity. 


2.2.3 Einstein’s equations 


Einstein’s equations comprise a 4 × 4 symmetric matrix of equations

$$ G_{\mu\nu} = 8 \pi G \, T_{\mu\nu} . \tag{2.2} $$

Here $G$ is the Newtonian gravitational constant, $G_{\mu\nu}$ is the Einstein tensor, and $T_{\mu\nu}$ is the
energy-momentum tensor.

Physically, Einstein’s equations signify

$$ \text{(compressive part of) curvature} = \text{energy-momentum content} . \tag{2.3} $$

Einstein’s equations generalize Poisson’s equation

$$ \nabla^2 \Phi = 4 \pi G \rho \tag{2.4} $$

where $\Phi$ is the Newtonian gravitational potential, and $\rho$ the mass-energy density. Poisson’s equation is the
time-time component of Einstein’s equations in the limit of a weak gravitational field and slowly moving
matter, §2.27.


2.3 Implications of Einstein’s principle of equivalence 


2.3.1 The gravitational redshift of light 


Einstein’s principle of equivalence implies that light will redshift in a gravitational field. In a weak
gravitational field, the gravitational redshift of light can be deduced quantitatively from the equivalence principle
without any further assumption (such as Einstein’s equations), Exercises 2.1 and 2.2. A fully general
relativistic treatment for the redshift between observers at rest in a stationary gravitational field is given in
Exercise 2.9.


Figure 2.3 Einstein’s principle of equivalence implies the gravitational redshift of light, and the gravitational bending
of light. In the left panel, persons A and B are at rest relative to each other in a uniform gravitational field. They
are shown moving to the right to bring out the evolution of the system in time. A sends a beam of light upward to
B. The principle of equivalence asserts that the uniform gravitational field is equivalent to a uniformly accelerating
frame. The right panel shows the equivalent uniformly accelerating situation as perceived by a person in free-fall. In
the free-fall frame, the light moves on a straight line, and has constant frequency. Back in the gravitating/accelerating
frame in the left panel, the light appears to bend, and to redshift as it climbs from A to B.


Exercise 2.1. The equivalence principle implies the gravitational redshift of light, Part 1. A
rigorous general relativistic version of this exercise is Exercise 2.10. A person standing at rest on the surface
of the Earth is to a good approximation in a uniform gravitational field, with gravitational acceleration $g$.
The principle of equivalence asserts that the situation is equivalent to that of a frame uniformly accelerating
at $g$. Assume that the non-accelerating, free-fall frame is Minkowski to a good approximation. Define the
potential $\Phi$ by the usual Newtonian formula $\boldsymbol{g} = -\nabla \Phi$. Show that for small differences in their gravitational
potentials, B perceives the light emitted by A to be redshifted by (with units restored)

$$ z = \frac{\Phi_{\rm obs} - \Phi_{\rm em}}{c^2} . \tag{2.5} $$
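To get a feel for the size of the effect in equation (2.5), the following sketch evaluates the redshift for light climbing a tower in a uniform field. The numbers are assumptions chosen for illustration: a surface acceleration of 9.81 m/s² and a hypothetical tower height of 22.5 m. In a uniform field with $\boldsymbol{g} = -\nabla\Phi$ pointing downward, the potential difference between an observer a height $h$ above the emitter is $\Phi_{\rm obs} - \Phi_{\rm em} = g h$.

```python
C = 299_792_458.0   # speed of light in m/s (exact SI value)
G_ACCEL = 9.81      # assumed gravitational acceleration at the surface, m/s^2

def weak_field_redshift(height_m, g=G_ACCEL):
    """Redshift z = (Phi_obs - Phi_em) / c^2 for light climbing height_m in a uniform field g."""
    delta_phi = g * height_m   # Phi_obs - Phi_em, positive when the observer is higher
    return delta_phi / C**2

z = weak_field_redshift(22.5)   # hypothetical 22.5 m tower
print(f"z = {z:.3e}")
```

The result is of order $10^{-15}$, which indicates why the gravitational redshift near the Earth's surface is hard to measure.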


Exercise 2.2. The equivalence principle implies the gravitational redshift of light, Part 2. A
rigorous general relativistic version of this exercise is Exercise 2.11. Consider a person who, at rest in
Minkowski space, whirls a clock around them on the end of string, so fast that the clock is moving at near
the speed of light. The person sees the clock redshifted by the Lorentz $\gamma$-factor (the string is of fixed length,
so the light travel time from clock to observer is always the same, and does not affect the redshift). Tugged
on by the string, the clock experiences a centripetal acceleration towards the whirling person. According to
the principle of equivalence, the centripetal acceleration is equivalent to a centrifugal gravitational force. In a
Newtonian approximation, if the clock is whirling around at angular velocity $\omega$, then the effective centrifugal
potential at radius $r$ from the observer is

$$ \Phi = - \tfrac{1}{2} \omega^2 r^2 . \tag{2.6} $$

Show that, for non-relativistic velocities $\omega r \ll c$, the observer perceives the light emitted from the clock to
be redshifted by (with units restored)

$$ z = - \frac{\Phi}{c^2} . \tag{2.7} $$


2.3.2 The gravitational bending of light


The principle of equivalence also implies that light will appear to bend in a gravitational field, as illustrated 
by Figure 2.3. However, a quantitative prediction for the bending of light requires full general relativity. The 
bending of light in a weak gravitational field is the subject of Exercise 2.17. 


2.4 Metric 


Postulate (1), §2.2.1, of general relativity means that it is possible to choose coordinates

$$ x^\mu = \{ x^0, x^1, x^2, x^3 \} \tag{2.8} $$

covering a patch of spacetime.

Postulate (2), §2.2.2, of general relativity implies that at each point of spacetime it is possible to choose
locally inertial coordinates

$$ \xi^m = \{ \xi^0, \xi^1, \xi^2, \xi^3 \} \tag{2.9} $$

such that the metric is Minkowski,

$$ ds^2 = \eta_{mn} \, d\xi^m d\xi^n , \tag{2.10} $$

in an infinitesimal neighbourhood of the point. Infinitesimal neighbourhood means that the metric is the
Minkowski metric $\eta_{mn}$ at the point, and that the first derivatives of the metric vanish at the point. The
spacetime distance squared $ds^2$ is a scalar, a quantity that is unchanged by the choice of coordinates.
Whereas in special relativity the Minkowski formula (1.32) for the spacetime distance $\Delta s^2$ held for finite
intervals $\Delta x^m$, in general relativity the metric formula (2.10) holds only for infinitesimal intervals $d\xi^m$.

General relativity requires, postulate (1), that two sets of coordinates are differentiably related, so locally
inertial intervals $d\xi^m$ and coordinate intervals $dx^\mu$ are related by the Leibniz rule,

$$ d\xi^m = \frac{\partial \xi^m}{\partial x^\mu} \, dx^\mu . \tag{2.11} $$

It follows that the scalar spacetime distance squared is

$$ ds^2 = \eta_{mn} \frac{\partial \xi^m}{\partial x^\mu} \frac{\partial \xi^n}{\partial x^\nu} \, dx^\mu dx^\nu , \tag{2.12} $$

which can be written in terms of coordinate intervals $dx^\mu$ as

$$ ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu , \tag{2.13} $$

where $g_{\mu\nu}$ is the metric, a 4 × 4 symmetric matrix

$$ g_{\mu\nu} = \eta_{mn} \frac{\partial \xi^m}{\partial x^\mu} \frac{\partial \xi^n}{\partial x^\nu} . \tag{2.14} $$

The metric is the essential mathematical object that converts an infinitesimal interval $dx^\mu$ to a proper
measurement of an interval of time or space.

Figure 2.4 (Left) The tetrad vectors $\boldsymbol{\gamma}_m$ form an orthonormal basis of vectors tangent to a set of locally inertial
coordinates $\xi^m$ at a point. (Right) The coordinate tangent vectors $\boldsymbol{e}_\mu$ are the basis of vectors tangent to the coordinates
at each point. The background square grid represents a locally inertial frame, the existence of which is asserted by
general relativity.
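Equation (2.14) for the metric can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions: spacetime is taken to be flat, so that Minkowski coordinates serve as inertial coordinates $\xi^m$ everywhere, and the arbitrary coordinates $x^\mu$ are chosen to be spherical, $\{t, r, \theta, \phi\}$. The function names and the finite-difference Jacobian are choices made here, not taken from the text.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, signature -+++

def inertial_coords(x):
    """Map spherical coordinates x = (t, r, theta, phi) to Minkowski coordinates xi."""
    t, r, th, ph = x
    return [t,
            r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th)]

def jacobian(f, x, h=1e-6):
    """J[m][mu] = d xi^m / d x^mu, estimated by central finite differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        fp, fm = f(xp), f(xm)
        for m in range(n):
            J[m][mu] = (fp[m] - fm[m]) / (2 * h)
    return J

def metric(f, x):
    """g_{mu nu} = eta_{mn} (d xi^m / d x^mu) (d xi^n / d x^nu), equation (2.14)."""
    J = jacobian(f, x)
    n = len(x)
    return [[sum(ETA[m] * J[m][mu] * J[m][nu] for m in range(n))
             for nu in range(n)] for mu in range(n)]

x = (0.0, 2.0, math.pi / 3, 0.5)   # sample point: t = 0, r = 2, theta = pi/3, phi = 0.5
g = metric(inertial_coords, x)
for row in g:
    print([round(v, 6) for v in row])
```

Up to finite-difference error the result is diagonal with entries $-1$, $1$, $r^2$, $r^2 \sin^2\theta$, the familiar Minkowski metric in spherical coordinates.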


2.5 Timelike, spacelike, proper time, proper distance 


General relativity inherits from special relativity the physical meaning of the scalar spacetime distance
squared $ds^2$ along an interval $dx^\mu$. The scalar spacetime distance squared can be negative, zero, or positive,
and accordingly timelike, lightlike, or spacelike:

$$
\begin{array}{lll}
\text{timelike:} & ds^2 < 0 , & d\tau = \sqrt{- ds^2} = \text{interval of proper time} , \\
\text{lightlike:} & ds^2 = 0 , & \\
\text{spacelike:} & ds^2 > 0 , & dl = \sqrt{ds^2} = \text{interval of proper distance} .
\end{array}
\tag{2.15}
$$
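The classification (2.15) translates directly into code. The following is a minimal sketch, assuming the metric is supplied as a nested list and the interval as a list of components; the function names and the small tolerance used to decide when $ds^2$ counts as zero are illustrative choices, not part of the text.

```python
import math

def interval_squared(g, dx):
    """ds^2 = g_{mu nu} dx^mu dx^nu for a metric g (nested lists) and an interval dx."""
    n = len(dx)
    return sum(g[m][k] * dx[m] * dx[k] for m in range(n) for k in range(n))

def classify(g, dx, tol=1e-12):
    """Return (kind, proper measure) following equation (2.15), signature -+++."""
    ds2 = interval_squared(g, dx)
    if ds2 < -tol:
        return "timelike", math.sqrt(-ds2)   # interval of proper time d tau
    if ds2 > tol:
        return "spacelike", math.sqrt(ds2)   # interval of proper distance dl
    return "lightlike", 0.0

minkowski = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(classify(minkowski, [1.0, 0.6, 0.0, 0.0]))   # timelike, d tau approximately 0.8
print(classify(minkowski, [1.0, 1.0, 0.0, 0.0]))   # lightlike
print(classify(minkowski, [0.0, 3.0, 4.0, 0.0]))   # spacelike, dl = 5
```

The same functions work for any metric $g_{\mu\nu}$ evaluated at a point, since only the components at that point enter.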


2.6 Orthonormal tetrad basis $\boldsymbol{\gamma}_m$


You are familiar with the idea that in ordinary 3-dimensional Euclidean geometry it is often convenient to
treat vectors in an abstract coordinate-independent formalism. Thus for example a 3-vector is commonly
written as an abstract quantity $\boldsymbol{r}$. The coordinates of the vector $\boldsymbol{r}$ may be $\{x, y, z\}$ in some particular
coordinate system, but one recognizes that the vector $\boldsymbol{r}$ has a meaning, a magnitude and a direction, that is
independent of the coordinate system adopted. In an arbitrary Cartesian coordinate system, the Euclidean
3-vector $\boldsymbol{r}$ can be expressed

$$ \boldsymbol{r} = \sum_a \hat{\boldsymbol{x}}_a x^a = \hat{\boldsymbol{x}} x + \hat{\boldsymbol{y}} y + \hat{\boldsymbol{z}} z \tag{2.16} $$

where $\hat{\boldsymbol{x}}_a = \{ \hat{\boldsymbol{x}}, \hat{\boldsymbol{y}}, \hat{\boldsymbol{z}} \}$ are unit vectors along each of the coordinate axes. The unit vectors satisfy a Euclidean
metric

$$ \hat{\boldsymbol{x}}_a \cdot \hat{\boldsymbol{x}}_b = \delta_{ab} . \tag{2.17} $$


The same kind of abstract notation is useful in general relativity. Because the spacetime of general relativity
is only locally inertial, not globally inertial, vectors must be thought of as living not in the spacetime manifold
itself, but rather in the tangent space of the manifold. The existence and structure of such a tangent space
follows from the postulate of the existence of locally inertial frames. Let $\xi^m$ be a set of locally inertial
coordinates at a point of spacetime. Define the vectors $\boldsymbol{\gamma}_m$, called a tetrad, to be tangent to the locally
inertial coordinates at the point in question,

$$ \boldsymbol{\gamma}_m = \{ \boldsymbol{\gamma}_0, \boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \boldsymbol{\gamma}_3 \} , \tag{2.18} $$

as illustrated in the left panel of Figure 2.4. Each tetrad basis vector $\boldsymbol{\gamma}_m$ is a 4-dimensional object, with
both magnitude and direction. The basis vectors $\boldsymbol{\gamma}_m$ are introduced so that vectors in spacetime can be
expressed in an abstract coordinate-independent fashion. The prototypical vector is an infinitesimal interval
$d\xi^m$ of spacetime, which can be expressed in coordinate-independent fashion as the abstract vector interval
$d\boldsymbol{x}$ defined by

$$ d\boldsymbol{x} = \boldsymbol{\gamma}_m \, d\xi^m = \boldsymbol{\gamma}_0 \, d\xi^0 + \boldsymbol{\gamma}_1 \, d\xi^1 + \boldsymbol{\gamma}_2 \, d\xi^2 + \boldsymbol{\gamma}_3 \, d\xi^3 . \tag{2.19} $$

The interval $d\xi^m$ transforms under a Lorentz transformation of the locally inertial coordinates as a
contravariant Lorentz vector. To make the abstract vector interval $d\boldsymbol{x}$ invariant under Lorentz transformation,
the basis vectors $\boldsymbol{\gamma}_m$ must transform as a covariant Lorentz vector.

The scalar length squared of the abstract vector interval $d\boldsymbol{x}$ is

$$ ds^2 = d\boldsymbol{x} \cdot d\boldsymbol{x} = \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n \, d\xi^m d\xi^n . \tag{2.20} $$

Since this must reproduce the locally inertial metric (2.10), the scalar products of the tetrad vectors $\boldsymbol{\gamma}_m$
must form the Minkowski metric

$$ \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n = \eta_{mn} . \tag{2.21} $$

A basis of tetrad vectors whose scalar products form the Minkowski metric is called orthonormal.
Tetrads are explored in depth in Chapter 11.


2.7 Basis of coordinate tangent vectors $\boldsymbol{e}_\mu$


In general relativity, coordinates can be chosen arbitrarily, subject to differentiability conditions. In an
arbitrary system of coordinates $x^\mu$, the coordinate tangent vectors $\boldsymbol{e}_\mu$ at each point,

$$ \boldsymbol{e}_\mu = \{ \boldsymbol{e}_0, \boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3 \} , \tag{2.22} $$

are defined to satisfy

$$ d\boldsymbol{x} = \boldsymbol{e}_\mu \, dx^\mu = \boldsymbol{\gamma}_m \, d\xi^m . \tag{2.23} $$

The letter e derives from the German word einheit, meaning unity. The relation (2.11) between coordinate
intervals $dx^\mu$ and locally inertial coordinate intervals $d\xi^m$ implies that the coordinate tangent vectors $\boldsymbol{e}_\mu$
must be related to the orthonormal tetrad vectors $\boldsymbol{\gamma}_m$ by

$$ \boldsymbol{e}_\mu = \boldsymbol{\gamma}_m \frac{\partial \xi^m}{\partial x^\mu} . \tag{2.24} $$

Like the tetrad axes $\boldsymbol{\gamma}_m$, each coordinate tangent axis $\boldsymbol{e}_\mu$ is a 4-dimensional vector object, with both
magnitude and direction, as illustrated in the right panel of Figure 2.4.

The scalar length squared of the abstract vector interval $d\boldsymbol{x}$ is

$$ ds^2 = d\boldsymbol{x} \cdot d\boldsymbol{x} = \boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu \, dx^\mu dx^\nu , \tag{2.25} $$

from which it follows that the scalar products of the coordinate tangent axes $\boldsymbol{e}_\mu$ must equal the coordinate
metric $g_{\mu\nu}$,

$$ g_{\mu\nu} = \boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu . \tag{2.26} $$
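A short worked example may make equations (2.24) and (2.26) concrete. It assumes, purely for illustration, the flat Euclidean plane as a 2-dimensional stand-in for spacetime, with the Cartesian unit vectors $\hat{\boldsymbol{x}}$, $\hat{\boldsymbol{y}}$ playing the role of the orthonormal basis and plane polar coordinates $\{r, \theta\}$ playing the role of the arbitrary coordinates $x^\mu$:

```latex
\begin{align*}
  \xi^m &= \{ x , y \} = \{ r \cos\theta , \; r \sin\theta \} , \\
  \boldsymbol{e}_r
    &= \hat{\boldsymbol{x}} \frac{\partial x}{\partial r}
     + \hat{\boldsymbol{y}} \frac{\partial y}{\partial r}
     = \hat{\boldsymbol{x}} \cos\theta + \hat{\boldsymbol{y}} \sin\theta , \\
  \boldsymbol{e}_\theta
    &= \hat{\boldsymbol{x}} \frac{\partial x}{\partial \theta}
     + \hat{\boldsymbol{y}} \frac{\partial y}{\partial \theta}
     = - \hat{\boldsymbol{x}} \, r \sin\theta + \hat{\boldsymbol{y}} \, r \cos\theta , \\
  g_{rr} &= \boldsymbol{e}_r \cdot \boldsymbol{e}_r = 1 , \qquad
  g_{r\theta} = \boldsymbol{e}_r \cdot \boldsymbol{e}_\theta = 0 , \qquad
  g_{\theta\theta} = \boldsymbol{e}_\theta \cdot \boldsymbol{e}_\theta = r^2 .
\end{align*}
```

The tangent vector $\boldsymbol{e}_\theta$ has length $r$ rather than 1, so in this example the coordinate basis is orthogonal but not orthonormal, except on the circle $r = 1$.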


Like the orthonormal tetrad vectors $\boldsymbol{\gamma}_m$, the coordinate tangent vectors $\boldsymbol{e}_\mu$ form a basis for the
4-dimensional tangent space at each point. The tangent space has three basic mathematical properties. First,
the tangent space is a vector space, that is, it has the properties of linearity that define a vector space.
Second, the tangent space has an inner (or scalar) product, defined by the metric (2.26). That scalar product
is a consequence of the postulated locally inertial, or Lorentz, structure of spacetime, which asserts that the
metric is Minkowski $\eta_{mn}$ with respect to locally inertial coordinates $\xi^m$. Third, vectors $\boldsymbol{e}_\mu$ in the tangent
space can be differentiated with respect to coordinates $x^\nu$, as will be elucidated in §2.9.3.

Some texts represent the tangent vectors $\boldsymbol{e}_\mu$ with the notation $\partial_\mu$, on the grounds that $\boldsymbol{e}_\mu$ transforms
like the coordinate derivatives $\partial_\mu = \partial / \partial x^\mu$. This notation is not used in this book, to avoid the potential
confusion between $\partial_\mu$ as a derivative and $\partial_\mu$ as a vector.


2.8 4-vectors and tensors 


2.8.1 Contravariant coordinate 4-vector 


Under a general coordinate transformation

$$ x^\mu \to x'^\mu (x^\nu) , \tag{2.27} $$

a coordinate interval $dx^\mu$ transforms as

$$ dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu} \, dx^\nu . \tag{2.28} $$

In general relativity, a coordinate 4-vector is defined to be a quantity $A^\mu = \{ A^0, A^1, A^2, A^3 \}$ that
transforms under a coordinate transformation (2.27) like a coordinate interval

$$ A'^\mu = \frac{\partial x'^\mu}{\partial x^\nu} \, A^\nu . \tag{2.29} $$

Just because something has an index on it does not make it a 4-vector. The essential property of a
contravariant coordinate 4-vector is that it transforms like a coordinate interval, equation (2.29).


2.8.2 Abstract 4-vector


A 4-vector may be written in coordinate-independent fashion as

$$ \boldsymbol{A} = \boldsymbol{e}_\mu A^\mu . \tag{2.30} $$

The quantity $\boldsymbol{A}$ is an abstract 4-vector. Although $\boldsymbol{A}$ is a 4-vector, it is by construction unchanged by a
coordinate transformation, and is therefore a coordinate scalar. See §2.8.7 for commentary on the distinction
between abstract and coordinate vectors.


2.8.3 Lowering and raising indices 


Define $g^{\mu\nu}$ to be the inverse metric, satisfying

$$ g_{\lambda\mu} \, g^{\mu\nu} = \delta_\lambda^\nu =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{2.31} $$

The metric $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$ provide the means of lowering and raising coordinate indices. The
components of a coordinate 4-vector $A^\mu$ with raised index are called its contravariant components, while
those $A_\mu$ with lowered indices are called its covariant components,

$$ A_\mu = g_{\mu\nu} A^\nu , \tag{2.32} $$

$$ A^\mu = g^{\mu\nu} A_\nu . \tag{2.33} $$
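Lowering and raising indices, equations (2.31) to (2.33), amount to matrix multiplication by the metric and its inverse. The sketch below illustrates this with plain Python lists. The non-diagonal metric and the vector components are hypothetical values chosen only for illustration, and the small Gauss-Jordan inverse is a convenience written here, not a method from the text.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the corresponding row of the identity matrix.
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a]

def lower(g, A_up):
    """A_mu = g_{mu nu} A^nu, equation (2.32)."""
    n = len(A_up)
    return [sum(g[m][k] * A_up[k] for k in range(n)) for m in range(n)]

def raise_index(g_inv, A_down):
    """A^mu = g^{mu nu} A_nu, equation (2.33)."""
    n = len(A_down)
    return [sum(g_inv[m][k] * A_down[k] for k in range(n)) for m in range(n)]

# Hypothetical symmetric metric with signature -+++, for illustration only.
g = [[-1.0, 0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0, 0.0],
     [0.0, 0.0, 4.0, 0.0],
     [0.0, 0.0, 0.0, 3.0]]
g_inv = invert(g)

A_up = [1.0, 2.0, 3.0, 4.0]        # hypothetical contravariant components A^mu
A_down = lower(g, A_up)            # covariant components A_mu
A_back = raise_index(g_inv, A_down)
print(A_down)
print(A_back)
```

Raising the lowered components recovers the original contravariant components, which is equation (2.31) in action.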


2.8.4 Dual basis $\boldsymbol{e}^\mu$


The contravariant dual basis elements $\boldsymbol{e}^\mu$ are defined by raising the indices of the covariant tangent basis
elements $\boldsymbol{e}_\nu$,

$$ \boldsymbol{e}^\mu = g^{\mu\nu} \boldsymbol{e}_\nu . \tag{2.34} $$

You can check that the dual vectors $\boldsymbol{e}^\mu$ transform as a contravariant coordinate 4-vector. The dot products
of the dual basis elements $\boldsymbol{e}^\mu$ with each other are

$$ \boldsymbol{e}^\mu \cdot \boldsymbol{e}^\nu = g^{\mu\nu} . \tag{2.35} $$

The dot products of the dual and tangent basis elements are

$$ \boldsymbol{e}^\mu \cdot \boldsymbol{e}_\nu = \delta^\mu_\nu . \tag{2.36} $$


2.8.5 Covariant coordinate 4-vector


Under a general coordinate transformation (2.27), the covariant components $A_\mu$ of a coordinate 4-vector
transform as

$$ A'_\mu = \frac{\partial x^\nu}{\partial x'^\mu} \, A_\nu . \tag{2.37} $$

You can check that the transformation law (2.37) for the covariant components $A_\mu$ is consistent with the
transformation law (2.29) for the contravariant components $A^\mu$.

You can check that the tangent vectors $\boldsymbol{e}_\mu$ transform as a covariant coordinate 4-vector.


2.8.6 Scalar product 


If $A^\mu$ and $B^\mu$ are coordinate 4-vectors, then their scalar product is

$$ A_\mu B^\mu = A^\mu B_\mu = g_{\mu\nu} A^\mu B^\nu . \tag{2.38} $$

This is a coordinate scalar, a quantity that remains invariant under general coordinate transformations.
The ability to form a scalar by contracting over paired indices, always one raised and one lowered, is what
makes the introduction of two species of vector, contravariant (raised index) and covariant (lowered index),
so advantageous.

In abstract vector formalism, the scalar product of two 4-vectors $\boldsymbol{A} = \boldsymbol{e}_\mu A^\mu$ and $\boldsymbol{B} = \boldsymbol{e}_\mu B^\mu$ is

$$ \boldsymbol{A} \cdot \boldsymbol{B} = \boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu \, A^\mu B^\nu = g_{\mu\nu} A^\mu B^\nu . \tag{2.39} $$
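The invariance of the scalar product (2.38) under a coordinate transformation can be verified numerically. The sketch below assumes, for brevity, the Euclidean plane as a 2-dimensional stand-in for spacetime, with unprimed polar coordinates $\{r, \theta\}$ and primed Cartesian coordinates $\{x, y\}$. The vector components and function names are hypothetical, chosen for illustration.

```python
import math

def jacobian_polar_to_cartesian(r, th):
    """Matrix d x'^mu / d x^nu with x' = (x, y) and x = (r, theta)."""
    return [[math.cos(th), -r * math.sin(th)],
            [math.sin(th),  r * math.cos(th)]]

def transform_contravariant(J, A):
    """A'^mu = (d x'^mu / d x^nu) A^nu, equation (2.29)."""
    n = len(A)
    return [sum(J[m][k] * A[k] for k in range(n)) for m in range(n)]

def scalar_product(g, A, B):
    """g_{mu nu} A^mu B^nu, equation (2.38)."""
    n = len(A)
    return sum(g[m][k] * A[m] * B[k] for m in range(n) for k in range(n))

r, th = 2.0, 0.7
g_polar = [[1.0, 0.0], [0.0, r * r]]   # metric of the plane in (r, theta)
g_cart = [[1.0, 0.0], [0.0, 1.0]]      # metric of the plane in (x, y)

A = [1.0, 0.5]     # hypothetical components (A^r, A^theta)
B = [-2.0, 0.25]   # hypothetical components (B^r, B^theta)

J = jacobian_polar_to_cartesian(r, th)
s_polar = scalar_product(g_polar, A, B)
s_cart = scalar_product(g_cart,
                        transform_contravariant(J, A),
                        transform_contravariant(J, B))
print(s_polar, s_cart)
```

The components of the two vectors differ between the two coordinate systems, but the contracted scalar agrees to rounding error.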


2.8.7 Comment on vector naming and notation 


Different texts follow different conventions for naming and notating vectors and tensors. 

This book follows the convention of calling both $A^\mu$ (with a dummy index $\mu$) and $\boldsymbol{A} = A^\mu \boldsymbol{e}_\mu$ vectors.
Although $A^\mu$ and $\boldsymbol{A}$ are both vectors, they are mathematically different objects.

If the index on a vector indicates a specific coordinate, then the indexed vector is the component of the
vector; for example $A^0$ (or $A^t$) is the $x^0$ (or time $t$) component of the coordinate 4-vector $A^\mu$.

In this book, the different species of vector are distinguished by an adjective: 

1. A coordinate vector $A^\mu$, identified by greek (brown) indices $\mu$, is one that changes in a prescribed
way under coordinate transformations. A coordinate transformation is one that changes the coordinates 
of the spacetime without actually changing the spacetime or whatever lies in it. 

2. An abstract vector A, identified by boldface, is the thing itself, and is unchanged by the choice of 
coordinates. Since the abstract vector is unchanged by a coordinate transformation, it is a coordinate 
scalar. 

All the types of vector have the properties of linearity (additivity, multiplication by scalars) that identify 
them mathematically as belonging to vector spaces. The important distinction between the types of vector 
is how they behave under transformations. 

In referring to both $A^\mu$ and $\boldsymbol{A}$ as vectors, this book follows the standard physics practice of mentally
regarding $A^\mu$ and $\boldsymbol{A}$ as equivalent objects. You are familiar with the advantages of treating a vector in
3-dimensional Euclidean space either as an abstract vector $\boldsymbol{A}$, or as a coordinate vector $A_a$. Depending on
the problem, sometimes the abstract notation $\boldsymbol{A}$ is more convenient, and sometimes the coordinate notation
$A_a$ is more convenient. Sometimes it’s convenient to switch between the two in the middle of a calculation.
Likewise in general relativity it is convenient to have the flexibility to work in either coordinate or abstract
notation, whatever suits the problem of the moment.


2.8.8 Coordinate tensor 


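In general, a coordinate tensor $A^{\kappa\lambda...}_{\mu\nu...}$ is an object that transforms under general coordinate
transformations (2.27) as

$$ A'^{\kappa\lambda...}_{\mu\nu...} =
\frac{\partial x'^\kappa}{\partial x^\pi} \frac{\partial x'^\lambda}{\partial x^\rho} \, ... \,
\frac{\partial x^\sigma}{\partial x'^\mu} \frac{\partial x^\tau}{\partial x'^\nu} \, ... \,
A^{\pi\rho...}_{\sigma\tau...} . \tag{2.40} $$

You can check that the metric tensor $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$ are indeed coordinate tensors, transforming
like (2.40).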

The rank of a tensor is the number of indices of its expansion $A^{\kappa\lambda...}_{\mu\nu...}$ in components. A scalar is a tensor
of rank 0. A 4-vector is a tensor of rank 1. The metric and its inverse are tensors of rank 2. The rank of a
tensor with $n$ contravariant (upstairs) and $m$ covariant (downstairs) indices is sometimes written $\binom{n}{m}$.


2.9 Covariant derivatives 


2.9.1 Derivative of a coordinate scalar 


Suppose that $\Phi$ is a coordinate scalar. Then the coordinate derivative of $\Phi$ is a coordinate 4-vector

$$ \frac{\partial \Phi}{\partial x^\mu} \qquad \text{a coordinate tensor} \tag{2.41} $$

transforming like equation (2.37).

As a shorthand, the ordinary partial derivative is often denoted in the literature with a comma

$$ \frac{\partial \Phi}{\partial x^\mu} = \Phi_{,\mu} . \tag{2.42} $$

For the most part this book does not use the comma notation.


2.9.2 Derivative of a coordinate 4-vector 


The ordinary partial derivative of a contravariant coordinate 4-vector $A^\mu$ is not a tensor

$$ \frac{\partial A^\mu}{\partial x^\nu} \qquad \text{not a coordinate tensor} \tag{2.43} $$

because it does not transform like a coordinate tensor.
However, the 4-vector $\boldsymbol{A} = \boldsymbol{e}_\mu A^\mu$, being by construction invariant under coordinate transformations, is a
coordinate scalar, and its partial derivative is a coordinate 4-vector

$$
\begin{aligned}
\frac{\partial \boldsymbol{A}}{\partial x^\nu}
  &= \frac{\partial ( \boldsymbol{e}_\mu A^\mu )}{\partial x^\nu} \\
  &= \boldsymbol{e}_\mu \frac{\partial A^\mu}{\partial x^\nu} + \frac{\partial \boldsymbol{e}_\mu}{\partial x^\nu} A^\mu
  \qquad \text{a coordinate tensor} .
\end{aligned}
\tag{2.44}
$$

The last line of equation (2.44) assumes that it is legitimate to differentiate the tangent vectors $\boldsymbol{e}_\mu$, but
what does that mean? The partial derivatives of basis vectors $\boldsymbol{e}_\mu$ are defined in the usual way by

$$ \frac{\partial \boldsymbol{e}_\mu}{\partial x^\nu} = \lim_{\delta x^\nu \to 0}
\frac{ \boldsymbol{e}_\mu ( x^0, ..., x^\nu + \delta x^\nu, ..., x^3 ) - \boldsymbol{e}_\mu ( x^0, ..., x^\nu, ..., x^3 ) }{ \delta x^\nu } . \tag{2.45} $$

This definition relies on being able to compare the vectors $\boldsymbol{e}_\mu(x)$ at some point $x$ with the vectors $\boldsymbol{e}_\mu(x + \delta x)$
at another point $x + \delta x$ a small distance away. The comparison between two vectors a small distance apart
is made possible by the existence of locally inertial frames. In a locally inertial frame, two vectors a small
distance apart can be compared by parallel-transporting one vector to the location of the other along
the small interval between them, that is, by transporting the vector without accelerating or precessing with
respect to the locally inertial frame. Thus the right hand side of equation (2.45) should be interpreted as
$\boldsymbol{e}_\mu(x + \delta x)$ minus the value of $\boldsymbol{e}_\mu(x)$ parallel-transported from position $x$ to position $x + \delta x$ along the small
interval $\delta x$ between them, as illustrated in Figure 2.5.

Figure 2.5 The change $\delta \boldsymbol{e}_0$ in the tangent vector $\boldsymbol{e}_0$ over a small interval $\delta x^1$ of spacetime is defined to be the difference
between the tangent vector $\boldsymbol{e}_0(x^1 + \delta x^1)$ at the shifted position $x^1 + \delta x^1$ and the tangent vector $\boldsymbol{e}_0(x^1)$ at the original
position $x^1$, parallel-transported to the shifted position. The parallel-transported vector is shown as a dashed arrowed
line. The parallel transport is defined with respect to a locally inertial frame, shown as a background square grid.

The notion of the tangent space at a point on a manifold was introduced in §2.6. Parallel transport allows
the tangent spaces at neighbouring points to be adjoined in a well-defined fashion to form the tangent
manifold, whose dimension is twice that of the underlying spacetime. Coordinates for the tangent manifold
are provided by a combination $\{ x^\mu, \xi^m \}$ of coordinates $x^\mu$ on the parent manifold and tangent space
coordinates $\xi^m$ extrapolated from a locally inertial frame about each point. The tangent space coordinates $\xi^m$
vary smoothly over the manifold provided that the locally inertial frames are chosen to vary smoothly.


2.9.3 Coordinate connection coefficients


The partial derivatives of the basis vectors $\boldsymbol{e}_\mu$ that appear on the right hand side of equation (2.44) define
the coordinate connection coefficients $\Gamma^\kappa_{\mu\nu}$,

$$ \frac{\partial \boldsymbol{e}_\mu}{\partial x^\nu} = \Gamma^\kappa_{\mu\nu} \, \boldsymbol{e}_\kappa \qquad \text{not a coordinate tensor} . \tag{2.46} $$

The definition (2.46) shows that the connection coefficients express how each tangent vector $\boldsymbol{e}_\mu$ changes,
relative to parallel-transport, when shifted along an interval $dx^\nu$.
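The definition (2.46) can be applied directly in a case where parallel transport is trivial. The sketch below assumes the flat Euclidean plane with polar coordinates $\{r, \theta\}$, where vectors can be compared through their Cartesian components. It differentiates the tangent vectors numerically and projects onto the dual basis $\boldsymbol{e}^\kappa$ to read off $\Gamma^\kappa_{\mu\nu} = \boldsymbol{e}^\kappa \cdot \partial \boldsymbol{e}_\mu / \partial x^\nu$. The function names and the finite-difference step are illustrative choices.

```python
import math

def tangent_vectors(r, th):
    """Cartesian components of e_r and e_theta for plane polar coordinates."""
    e_r = (math.cos(th), math.sin(th))
    e_th = (-r * math.sin(th), r * math.cos(th))
    return [e_r, e_th]

def dual_vectors(r, th):
    """Dual basis e^kappa = g^{kappa lambda} e_lambda, with g = diag(1, r^2)."""
    e_r, e_th = tangent_vectors(r, th)
    return [e_r, (e_th[0] / r**2, e_th[1] / r**2)]

def dot(u, v):
    """Euclidean dot product of two Cartesian 2-vectors."""
    return u[0] * v[0] + u[1] * v[1]

def connection(r, th, h=1e-6):
    """gamma[k][m][n] = e^k . (d e_m / d x^n), following definition (2.46)."""
    x = (r, th)
    duals = dual_vectors(r, th)
    gamma = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for n in range(2):
        xp, xm = list(x), list(x)
        xp[n] += h
        xm[n] -= h
        ep, em = tangent_vectors(*xp), tangent_vectors(*xm)
        for m in range(2):
            # Central finite difference of the Cartesian components of e_m.
            de = ((ep[m][0] - em[m][0]) / (2 * h),
                  (ep[m][1] - em[m][1]) / (2 * h))
            for k in range(2):
                gamma[k][m][n] = dot(duals[k], de)
    return gamma

G = connection(2.0, 0.7)
print(round(G[0][1][1], 6))   # Gamma^r_{theta theta}, analytically -r
print(round(G[1][1][0], 6))   # Gamma^theta_{theta r}, analytically 1/r
print(round(G[1][0][1], 6))   # Gamma^theta_{r theta}, analytically 1/r
```

All other coefficients vanish in this example. In a curved spacetime the Cartesian shortcut is unavailable, and the comparison of neighbouring vectors must be made by parallel transport in a locally inertial frame, as described above.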


2.9.4 Covariant derivative of a contravariant 4-vector 


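Expression (2.44) along with the definition (2.46) of the connection coefficients implies that

$$
\begin{aligned}
\frac{\partial \boldsymbol{A}}{\partial x^\nu}
  &= \boldsymbol{e}_\kappa \frac{\partial A^\kappa}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} \, \boldsymbol{e}_\kappa A^\mu \\
  &= \boldsymbol{e}_\kappa \left( \frac{\partial A^\kappa}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} A^\mu \right)
  \qquad \text{a coordinate tensor} .
\end{aligned}
\tag{2.47}
$$

The expression in parentheses is a coordinate tensor, and defines the covariant derivative $D_\nu A^\kappa$ of the
contravariant coordinate 4-vector $A^\kappa$

$$ D_\nu A^\kappa = \frac{\partial A^\kappa}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} A^\mu \qquad \text{a coordinate tensor} . \tag{2.48} $$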
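The difference between the ordinary and covariant derivatives shows up clearly for a vector field that is constant in the abstract sense. The sketch below assumes the Euclidean plane in polar coordinates $\{r, \theta\}$, quotes the standard connection coefficients for those coordinates ($\Gamma^r_{\theta\theta} = -r$, $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$), and takes the constant Cartesian unit vector $\hat{\boldsymbol{x}}$ as the field. The function names are illustrative.

```python
import math

def gamma(r):
    """Connection coefficients G[k][m][n] = Gamma^k_{m n} of plane polar coordinates."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r         # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r    # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r    # Gamma^theta_{theta r}
    return G

def field(r, th):
    """Polar components (A^r, A^theta) of the constant Cartesian vector field x-hat."""
    return [math.cos(th), -math.sin(th) / r]

def derivatives(A, x, h=1e-6):
    """Return (partial, D) with partial[n][k] = d A^k / d x^n and
    D[n][k] = d A^k / d x^n + Gamma^k_{m n} A^m, equation (2.48)."""
    G = gamma(x[0])
    A0 = A(*x)
    partial = [[0.0, 0.0], [0.0, 0.0]]
    D = [[0.0, 0.0], [0.0, 0.0]]
    for n in range(2):
        xp, xm = list(x), list(x)
        xp[n] += h
        xm[n] -= h
        Ap, Am = A(*xp), A(*xm)
        for k in range(2):
            partial[n][k] = (Ap[k] - Am[k]) / (2 * h)
            D[n][k] = partial[n][k] + sum(G[k][m][n] * A0[m] for m in range(2))
    return partial, D

partial, D = derivatives(field, (2.0, 0.7))
print("partial derivatives:", partial)    # non-zero in general
print("covariant derivatives:", D)        # all close to zero
```

The components $A^r$, $A^\theta$ vary from point to point only because the basis vectors $\boldsymbol{e}_r$, $\boldsymbol{e}_\theta$ do. The connection terms in (2.48) remove that variation, leaving a covariant derivative that vanishes, as it must for a constant vector.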


As a shorthand, the covariant derivative is often denoted in the literature with a semi-colon

$$ D_\nu A^\kappa = A^\kappa{}_{;\nu} . \tag{2.49} $$


For the most part this book does not use the semi-colon notation. 


2.9.5 Covariant derivative of a covariant coordinate 4-vector 


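Similarly,

$$ \frac{\partial \boldsymbol{A}}{\partial x^\nu} = \boldsymbol{e}^\kappa D_\nu A_\kappa \qquad \text{a coordinate tensor} \tag{2.50} $$


where $D_\nu A_\kappa$ is the covariant derivative of the covariant coordinate 4-vector $A_\kappa$,

$$ D_\nu A_\kappa \equiv \frac{\partial A_\kappa}{\partial x^\nu} - \Gamma^\mu_{\kappa\nu} A_\mu \qquad \text{a coordinate tensor} . \tag{2.51} $$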
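

2.9.6 Covariant derivative of a coordinate tensor


In general, the covariant derivative of a coordinate tensor is

$$ D_\pi A^{\kappa\lambda\ldots}_{\mu\nu\ldots} = \frac{\partial A^{\kappa\lambda\ldots}_{\mu\nu\ldots}}{\partial x^\pi} + \Gamma^\kappa_{\rho\pi} A^{\rho\lambda\ldots}_{\mu\nu\ldots} + \Gamma^\lambda_{\rho\pi} A^{\kappa\rho\ldots}_{\mu\nu\ldots} + \ldots - \Gamma^\rho_{\mu\pi} A^{\kappa\lambda\ldots}_{\rho\nu\ldots} - \Gamma^\rho_{\nu\pi} A^{\kappa\lambda\ldots}_{\mu\rho\ldots} - \ldots \tag{2.52} $$


with a positive $\Gamma$ term for each contravariant index, and a negative $\Gamma$ term for each covariant index.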


Concept question 2.3. Does covariant differentiation commute with the metric? Answer. Yes, 
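essentially by construction. The covariant derivative of a tangent basis vector $\boldsymbol{e}_\mu$,

$$ D_\nu \boldsymbol{e}_\mu \equiv \frac{\partial \boldsymbol{e}_\mu}{\partial x^\nu} - \Gamma^\kappa_{\mu\nu} \boldsymbol{e}_\kappa = 0 , \tag{2.53} $$


vanishes by definition of the coordinate connections, equation (2.46). Consequently the covariant derivative of
the metric $g_{\mu\nu} \equiv \boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu$ also vanishes. As a corollary, covariant differentiation commutes with the operations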
of raising and lowering indices, and of contraction. 


2.10 Torsion 


2.10.1 No-torsion condition 


The existence of locally inertial frames requires that it must be possible to arrange not only that the tangent 
axes $\boldsymbol{e}_\mu$ are orthonormal at a point, but also that they remain orthonormal to first order in a Taylor expansion
about the point. That is, it must be possible to choose the coordinates such that the tangent axes $\boldsymbol{e}_\mu$ are
orthonormal, and unchanged to linear order:

$$ \boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu = \eta_{\mu\nu} , \tag{2.54a} $$
$$ \frac{\partial \boldsymbol{e}_\mu}{\partial x^\nu} = 0 . \tag{2.54b} $$


In view of the definition (2.46) of the connection coefficients, the second condition (2.54b) is equivalent to
the vanishing of all the connection coefficients:

$$ \Gamma^\kappa_{\mu\nu} = 0 . \tag{2.55} $$


Under a general coordinate transformation $x^\mu \rightarrow x'^\mu$, the tangent axes transform as $\boldsymbol{e}_\mu = \partial x'^\nu / \partial x^\mu \; \boldsymbol{e}'_\nu$.
The $4 \times 4$ matrix $\partial x'^\nu / \partial x^\mu$ of partial derivatives provides 16 degrees of freedom in choosing the tangent axes
at a point. The 16 degrees of freedom are enough, indeed more than enough, to accomplish the orthonormality
condition (2.54a), which is a symmetric $4 \times 4$ matrix equation with 10 degrees of freedom. The additional
$16 - 10 = 6$ degrees of freedom are Lorentz transformations, which rotate the tangent axes $\boldsymbol{e}_\mu$, but leave the
metric $\eta_{\mu\nu}$ unchanged.

Just as it is possible to reorient the tangent axes $\boldsymbol{e}_\mu$ at a point by adjusting the matrix $\partial x'^\nu / \partial x^\mu$ of first
partial derivatives of the coordinate transformation $x^\mu \rightarrow x'^\mu$, so also it is possible to reorient the derivatives
$\partial \boldsymbol{e}_\mu / \partial x^\nu$ of the tangent axes by adjusting the matrix $\partial^2 x'^\kappa / \partial x^\mu \partial x^\nu$ of second partial derivatives of the
coordinate transformation. The second partial derivatives comprise a set of 4 symmetric $4 \times 4$ matrices, for
a total of $4 \times 10 = 40$ degrees of freedom. However, there are $4 \times 4 \times 4 = 64$ connection coefficients $\Gamma^\kappa_{\mu\nu}$,
all of which the condition (2.55) requires to vanish. The matrix of second derivatives is thus $64 - 40 = 24$
degrees of freedom short of being able to make all the connections vanish. The resolution of the problem
is that, as shown below, equation (2.58), there are 24 combinations of the connections that form a tensor,
the torsion tensor. If a tensor is zero in one frame, then it is automatically zero in any other frame. Thus
the requirement that all the connections vanish requires that the torsion tensor vanish. This requires, from
the expression (2.58) for the torsion tensor, the no-torsion condition that the connection coefficients are
symmetric in their last two indices

$$ \Gamma^\kappa_{\mu\nu} = \Gamma^\kappa_{\nu\mu} . \tag{2.56} $$


It should be emphasized that the condition of vanishing torsion is an assumption of general relativity, not 
a mathematical necessity. It has been shown in this section that torsion vanishes if and only if spacetime is 
locally flat, meaning that at any point coordinates can be found such that conditions (2.54) are true. The 
assumption of local flatness is central to the idea of the principle of equivalence. But it is an assumption, 
not a consequence, of the theory. 


Concept question 2.4. Parallel transport when torsion is present. If torsion does not vanish, then 
there is no locally inertial frame. What does parallel-transport mean in such a case? Answer. A general 
coordinate transformation can always be found such that the connection coefficients $\Gamma^\kappa_{\mu\nu}$ vanish along any
one direction $\nu$. Parallel-transport along that direction can be defined relative to such a frame. For any given
direction $\nu$, there are 16 second partial derivatives $\partial^2 x'^\kappa / \partial x^\mu \partial x^\nu$, just enough to make vanish the $4 \times 4 = 16$
coefficients $\Gamma^\kappa_{\mu\nu}$.


2.10.2 Torsion tensor 


General relativity assumes no torsion, but it is possible to consider generalizations to theories with torsion. 
The torsion tensor $S^\mu_{\kappa\lambda}$ is defined by the commutator of the covariant derivative acting on a scalar $\Phi$,

$$ [D_\kappa , D_\lambda] \, \Phi = S^\mu_{\kappa\lambda} \frac{\partial \Phi}{\partial x^\mu} \qquad \text{a coordinate tensor} . \tag{2.57} $$


Note that the covariant derivative of a scalar is just the ordinary derivative, $D_\lambda \Phi = \partial \Phi / \partial x^\lambda$. The expression
(2.51) for the covariant derivatives shows that the torsion tensor is

$$ S^\mu_{\kappa\lambda} = \Gamma^\mu_{\kappa\lambda} - \Gamma^\mu_{\lambda\kappa} \qquad \text{a coordinate tensor} \tag{2.58} $$


which is evidently antisymmetric in the indices $\kappa\lambda$.

In Einstein-Cartan theory, the torsion tensor is related to the spin content of spacetime. Since this vanishes
in empty space, Einstein-Cartan theory is indistinguishable from general relativity in experiments carried
out in vacuum. See §16.11 for more on Einstein-Cartan theory.


2.11 Connection coefficients in terms of the metric 


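The connection coefficients have been defined, equation (2.46), as derivatives of the tangent basis vectors $\boldsymbol{e}_\mu$.
However, the connection coefficients can be expressed purely in terms of the (first derivatives of the) metric,
without reference to the individual basis vectors. The partial derivatives of the metric are

$$ \begin{aligned}
\frac{\partial g_{\lambda\mu}}{\partial x^\nu}
&= \frac{\partial ( \boldsymbol{e}_\lambda \cdot \boldsymbol{e}_\mu )}{\partial x^\nu} \\
&= \boldsymbol{e}_\lambda \cdot \frac{\partial \boldsymbol{e}_\mu}{\partial x^\nu} + \boldsymbol{e}_\mu \cdot \frac{\partial \boldsymbol{e}_\lambda}{\partial x^\nu} \\
&= \boldsymbol{e}_\lambda \cdot \boldsymbol{e}_\kappa \, \Gamma^\kappa_{\mu\nu} + \boldsymbol{e}_\mu \cdot \boldsymbol{e}_\kappa \, \Gamma^\kappa_{\lambda\nu} \\
&= g_{\lambda\kappa} \Gamma^\kappa_{\mu\nu} + g_{\mu\kappa} \Gamma^\kappa_{\lambda\nu} \\
&= \Gamma_{\lambda\mu\nu} + \Gamma_{\mu\lambda\nu} ,
\end{aligned} \tag{2.59} $$


which is a sum of two connection coefficients. Here $\Gamma_{\lambda\mu\nu}$ with all indices lowered is defined to be $\Gamma^\kappa_{\mu\nu}$ with
the first index lowered by the metric,

$$ \Gamma_{\lambda\mu\nu} \equiv g_{\lambda\kappa} \Gamma^\kappa_{\mu\nu} . \tag{2.60} $$
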
Combining the metric derivatives in the following fashion yields an expression for a single connection, 


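$$ \begin{aligned}
\frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda}
&= \Gamma_{\lambda\mu\nu} + \Gamma_{\mu\lambda\nu} + \Gamma_{\lambda\nu\mu} + \Gamma_{\nu\lambda\mu} - \Gamma_{\mu\nu\lambda} - \Gamma_{\nu\mu\lambda} \\
&= 2 \, \Gamma_{\lambda\mu\nu} - S_{\lambda\mu\nu} - S_{\mu\nu\lambda} - S_{\nu\mu\lambda} ,
\end{aligned} \tag{2.61} $$

with $S_{\lambda\mu\nu} \equiv g_{\lambda\kappa} S^\kappa_{\mu\nu}$, which shows that, in the presence of torsion,

$$ \Gamma_{\lambda\mu\nu} = \frac{1}{2} \left( \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} + S_{\lambda\mu\nu} + S_{\mu\nu\lambda} + S_{\nu\mu\lambda} \right) \qquad \text{not a coordinate tensor} . \tag{2.62} $$


If torsion vanishes, as general relativity assumes, then

$$ \Gamma_{\lambda\mu\nu} = \frac{1}{2} \left( \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} \right) \qquad \text{not a coordinate tensor} . \tag{2.63} $$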


This is the formula that allows connection coefficients to be calculated from the metric. 
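
The torsion-free formula (2.63) is easy to evaluate numerically. The following is a minimal sketch, not a production implementation: it assumes the metric of the unit 2-sphere, $ds^2 = d\theta^2 + \sin^2\!\theta \, d\phi^2$, as an illustrative test case, and estimates the metric derivatives by central differences. The function names `sphere_metric`, `metric_gradient` and `christoffel_lower` are hypothetical names chosen for this example.

```python
import math

def sphere_metric(x):
    """Metric g_{mu nu} of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def metric_gradient(metric, x, h=1e-5):
    """Return dg[nu][lam][mu] ~ d g_{lam mu} / d x^nu by central differences."""
    n = len(x)
    dg = []
    for nu in range(n):
        xp, xm = list(x), list(x)
        xp[nu] += h
        xm[nu] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    return dg

def christoffel_lower(metric, x):
    """Torsion-free Gamma_{lam mu nu} with all indices lowered, equation (2.63)."""
    n = len(x)
    dg = metric_gradient(metric, x)
    return [[[0.5 * (dg[nu][lam][mu] + dg[mu][lam][nu] - dg[lam][mu][nu])
              for nu in range(n)] for mu in range(n)] for lam in range(n)]

theta = 1.0
point = [theta, 0.3]
gamma = christoffel_lower(sphere_metric, point)
expected = math.sin(theta) * math.cos(theta)
print("Gamma_{theta phi phi}:", gamma[0][1][1], "analytic:", -expected)
print("Gamma_{phi theta phi}:", gamma[1][0][1], "analytic:", expected)
```

The indices of the returned array are ordered as in the text, `gamma[lam][mu][nu]` $= \Gamma_{\lambda\mu\nu}$, so the symmetry in the last two indices, equation (2.56), and the sum rule (2.59) can be checked directly on the output.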


2.12 Torsion-free covariant derivative 


Einstein’s principle of equivalence postulates that a locally inertial frame exists at each point of spacetime, 
and this implies that torsion vanishes. However, torsion is of special interest as a generalization of general 
relativity because, as discussed in §2.19.2, the torsion tensor and the Riemann curvature tensor can be regarded
as fields associated with local gauge groups of respectively displacements and Lorentz transformations.
Together displacements and Lorentz transformations form the Poincaré group of symmetries of spacetime.
Spinor (spin-$\tfrac{1}{2}$) fields inevitably generate torsion, Exercise 16.5, but torsion is local and non-propagating,
and cancels between oppositely aligned spins, so in practice is negligible in almost all circumstances, §16.11.

The torsion-free part of the covariant derivative is a covariant derivative even when torsion is present (that 
is, it yields a tensor when acting on a tensor). The torsion-free covariant derivative is important, even when 
torsion is present, for several reasons. Firstly, as will be discovered from an action principle in Chapter 4, 
the covariant derivative that goes in the geodesic equation (2.88) is the torsion-free covariant derivative, 
equation (2.90). Secondly, the torsion-free covariant curl defines the exterior derivative in the theory of 
differential forms, §15.6. The exterior derivative has the property that it is inverse to integration over curved 
hypersurfaces. Integration is central to various aspects of general relativity, such as the development of 
Lagrangian and Hamiltonian mechanics. Thirdly, the Lie derivative, §7.34, is a covariant derivative defined 
in terms of torsion-free covariant derivatives. Finally, Yang-Mills gauge symmetries, such as the U(1) gauge 
symmetry of electromagnetism, require the gauge field to be defined in terms of the torsion-free covariant 
derivative, in order to preserve the gauge symmetry. 

When torsion is present and it is desirable to make the torsion part explicit, it is convenient to distinguish 
torsion-free quantities with a $\circ$ overscript. The torsion-free part $\mathring{\Gamma}_{\lambda\mu\nu}$ of the connection, also called the Levi-Civita
connection, is given by the right hand side of equation (2.63). When expressed in a coordinate
frame (as opposed to a tetrad frame, §11.15), the components of the torsion-free connections $\mathring{\Gamma}_{\lambda\mu\nu}$ are also
called Christoffel symbols. Sometimes, the components $\mathring{\Gamma}_{\lambda\mu\nu}$ with all indices lowered are called Christoffel
symbols of the first kind, while components $\mathring{\Gamma}^\lambda_{\mu\nu}$ with first index raised are called Christoffel symbols of the
second kind. There is no need to remember the jargon, but it is useful to know what it means if you meet it. 

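The torsion-full connection $\Gamma_{\lambda\mu\nu}$ is a sum of the torsion-free connection $\mathring{\Gamma}_{\lambda\mu\nu}$ and a tensor called the
contortion tensor (not contorsion!) $K_{\lambda\mu\nu}$,

$$ \Gamma_{\lambda\mu\nu} = \mathring{\Gamma}_{\lambda\mu\nu} + K_{\lambda\mu\nu} \qquad \text{not a coordinate tensor} . \tag{2.64} $$

From equation (2.62), the contortion tensor $K_{\lambda\mu\nu}$ is related to the torsion tensor $S_{\lambda\mu\nu}$ by

$$ K_{\lambda\mu\nu} = \tfrac{1}{2} \left( S_{\lambda\mu\nu} + S_{\mu\nu\lambda} + S_{\nu\mu\lambda} \right) = - S_{\nu\lambda\mu} + \tfrac{3}{2} S_{[\lambda\mu\nu]} \qquad \text{a coordinate tensor} . \tag{2.65} $$


The contortion $K_{\lambda\mu\nu}$ is antisymmetric in its first two indices,

$$ K_{\lambda\mu\nu} = - K_{\mu\lambda\nu} , \tag{2.66} $$


and thus like the torsion tensor $S_{\lambda\mu\nu}$ has $6 \times 4 = 24$ degrees of freedom. The torsion tensor $S_{\lambda\mu\nu}$ can be
expressed in terms of the contortion tensor $K_{\lambda\mu\nu}$,

$$ S_{\lambda\mu\nu} = K_{\lambda\mu\nu} - K_{\lambda\nu\mu} = - K_{\mu\nu\lambda} + 3 K_{[\lambda\mu\nu]} \qquad \text{a coordinate tensor} . \tag{2.67} $$


The torsion-full covariant derivative $D_\nu$ differs from the torsion-free covariant derivative $\mathring{D}_\nu$ by the contortion,

$$ D_\nu A^\kappa = \mathring{D}_\nu A^\kappa + K^\kappa_{\mu\nu} A^\mu \qquad \text{a coordinate tensor} . \tag{2.68} $$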
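
The algebraic relations between torsion and contortion can be checked by brute force. The sketch below is an illustration added here, with hypothetical helper names: it fills a torsion array $S_{\lambda\mu\nu}$, antisymmetric in its last two indices, with random numbers, builds the contortion as $K_{\lambda\mu\nu} = \tfrac{1}{2} ( S_{\lambda\mu\nu} + S_{\mu\nu\lambda} + S_{\nu\mu\lambda} )$, and then measures how well the antisymmetry $K_{\lambda\mu\nu} = - K_{\mu\lambda\nu}$ and the inverse relation $S_{\lambda\mu\nu} = K_{\lambda\mu\nu} - K_{\lambda\nu\mu}$ hold. All indices are treated as lowered, so no metric is needed.

```python
import random
from itertools import product

N = 4  # spacetime dimension
random.seed(0)

# Random torsion S[l][m][n] = S_{l m n}, antisymmetric in the last two indices.
S = [[[0.0] * N for _ in range(N)] for _ in range(N)]
for l in range(N):
    for m in range(N):
        for n in range(m + 1, N):
            value = random.uniform(-1.0, 1.0)
            S[l][m][n] = value
            S[l][n][m] = -value

def contortion(S):
    """K_{l m n} = (S_{l m n} + S_{m n l} + S_{n m l}) / 2."""
    return [[[0.5 * (S[l][m][n] + S[m][n][l] + S[n][m][l])
              for n in range(N)] for m in range(N)] for l in range(N)]

K = contortion(S)
triples = list(product(range(N), repeat=3))

# Largest violation of K_{l m n} = -K_{m l n}.
antisymmetry_error = max(abs(K[l][m][n] + K[m][l][n]) for l, m, n in triples)
# Largest violation of S_{l m n} = K_{l m n} - K_{l n m}.
inversion_error = max(abs(S[l][m][n] - (K[l][m][n] - K[l][n][m]))
                      for l, m, n in triples)
# One free index times an antisymmetric pair: 4 x 6 = 24 components.
independent_components = N * (N * (N - 1) // 2)

print(antisymmetry_error, inversion_error, independent_components)
```

Both errors should be at the level of floating-point rounding, and the component count reproduces the 24 degrees of freedom quoted in the text.
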

In this book torsion will not be assumed automatically to vanish, and thus by default the symbol $D_\nu$ will
denote the torsion-full covariant derivative. When torsion is assumed to vanish, or when $D_\nu$ denotes the
torsion-free covariant derivative, it will be explicitly stated so.


Concept question 2.5. Can the metric be Minkowski in the presence of torsion? In §2.10.1 it was
argued that the postulate of the existence of locally inertial frames implies that torsion vanishes. The basis
of the argument was the proposition that derivatives of the tangent axes vanish, equation (2.54b). Impose
instead the weaker condition that the derivatives of the metric (i.e. scalar products of tangent axes) vanish,

$$ \frac{\partial g_{\lambda\mu}}{\partial x^\nu} = 0 . \tag{2.69} $$

Can torsion be non-vanishing under this weaker condition? Answer. Yes. In fact torsion may exist even
in flat (Minkowski) space, where the metric is everywhere Minkowski, $g_{\lambda\mu} = \eta_{\lambda\mu}$. The condition (2.69) of
vanishing metric derivatives is equivalent to the vanishing of the torsion-free connections,

$$ \frac{1}{2} \left( \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} \right) = \mathring{\Gamma}_{\lambda\mu\nu} = \Gamma_{\lambda\mu\nu} - K_{\lambda\mu\nu} = 0 . \tag{2.70} $$


Thus the condition (2.69) of vanishing metric derivatives imposes no condition on torsion.


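Exercise 2.6. Covariant curl and coordinate curl. Show that the covariant curl of a covariant vector
$A_\kappa$ is

$$ D_\kappa A_\lambda - D_\lambda A_\kappa = \frac{\partial A_\lambda}{\partial x^\kappa} - \frac{\partial A_\kappa}{\partial x^\lambda} + S^\mu_{\kappa\lambda} A_\mu . \tag{2.71} $$

Conclude that the coordinate curl of a vector equals its torsion-free covariant curl,

$$ \mathring{D}_\kappa A_\lambda - \mathring{D}_\lambda A_\kappa = \frac{\partial A_\lambda}{\partial x^\kappa} - \frac{\partial A_\kappa}{\partial x^\lambda} . \tag{2.72} $$


Of course, if torsion vanishes as general relativity assumes, then the covariant curl is the torsion-free covariant
curl. Note that since $D_\kappa A_\lambda - D_\lambda A_\kappa$ on the left hand side and $S^\mu_{\kappa\lambda} A_\mu$ on the right hand side of
equation (2.71) are both tensors, it follows that the coordinate curl $\partial A_\lambda / \partial x^\kappa - \partial A_\kappa / \partial x^\lambda$ is a tensor even in
the presence of torsion.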


Exercise 2.7. Covariant divergence and coordinate divergence. Show that the covariant divergence
of a contravariant vector $A^\mu$ is

$$ D_\mu A^\mu = \frac{1}{\sqrt{-g}} \frac{\partial ( \sqrt{-g} \, A^\mu )}{\partial x^\mu} + S^\nu_{\mu\nu} A^\mu , \tag{2.73} $$

where $g \equiv |g_{\mu\nu}|$ is the determinant of the metric matrix. Conclude that the torsion-free covariant divergence
is

$$ \mathring{D}_\mu A^\mu = \frac{1}{\sqrt{-g}} \frac{\partial ( \sqrt{-g} \, A^\mu )}{\partial x^\mu} . \tag{2.74} $$


Of course, if torsion vanishes as general relativity assumes, then the covariant divergence is the torsion-free
covariant divergence. Note that since the covariant divergence on the left hand side of equation (2.73)
and the torsion term on the right hand side of equation (2.73) are both tensors, the torsion-free covariant
divergence (2.74) is a tensor even in the presence of torsion.

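Solution. The covariant divergence is

$$ D_\mu A^\mu = \frac{\partial A^\mu}{\partial x^\mu} + \Gamma^\nu_{\mu\nu} A^\mu . \tag{2.75} $$

From equation (2.62),

$$ \begin{aligned}
\Gamma^\nu_{\mu\nu} &= \frac{1}{2} g^{\nu\lambda} \frac{\partial g_{\lambda\nu}}{\partial x^\mu} + S^\nu_{\mu\nu} \\
&= \frac{\partial \ln \sqrt{-g}}{\partial x^\mu} + S^\nu_{\mu\nu} .
\end{aligned} \tag{2.76} $$


The second line of equations (2.76) follows because for any matrix $M$, the variation of the logarithm of its
determinant is

$$ \begin{aligned}
\delta \ln |M| &\equiv \ln |M + \delta M| - \ln |M| \\
&= \ln | M^{-1} ( M + \delta M ) | \\
&= \ln | 1 + M^{-1} \delta M | \\
&= \ln ( 1 + \operatorname{Tr} M^{-1} \delta M ) \\
&= \operatorname{Tr} M^{-1} \delta M .
\end{aligned} \tag{2.77} $$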
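
The identity $\delta \ln |M| = \operatorname{Tr} M^{-1} \delta M$ holds to first order in $\delta M$, which can be confirmed numerically. The sketch below is an added illustration; the $2 \times 2$ matrix `M` and the perturbation `dM` are arbitrary values chosen for the example, not taken from the text.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

M = [[2.0, 0.5], [0.3, 1.5]]
dM = [[1e-6, -2e-6], [3e-6, 0.5e-6]]  # small perturbation
M_plus = [[M[i][j] + dM[i][j] for j in range(2)] for i in range(2)]

lhs = math.log(abs(det2(M_plus))) - math.log(abs(det2(M)))  # delta ln|M|
rhs = trace_of_product(inverse2(M), dM)                      # Tr M^{-1} dM

print(lhs, rhs, abs(lhs - rhs))
```

The two sides agree up to terms of second order in the perturbation, as expected from the first-order expansion in (2.77).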
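
The torsion-free covariant divergence is

$$ \mathring{D}_\mu A^\mu = \frac{\partial A^\mu}{\partial x^\mu} + \mathring{\Gamma}^\nu_{\mu\nu} A^\mu , \tag{2.78} $$

where the torsion-free coordinate connection is

$$ \mathring{\Gamma}^\nu_{\mu\nu} = \frac{1}{2} g^{\nu\lambda} \frac{\partial g_{\lambda\nu}}{\partial x^\mu} = \frac{\partial \ln \sqrt{-g}}{\partial x^\mu} . \tag{2.79} $$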


Concept question 2.8. If torsion does not vanish, does torsion-free covariant differentiation
commute with the metric? Answer. Yes. Unlike the torsion-full covariant derivative, Concept Question 2.3,
the torsion-free covariant derivative of the tangent basis vectors $\boldsymbol{e}_\mu$ does not vanish, but rather
depends on the contortion $K^\kappa_{\mu\nu}$,

$$ \mathring{D}_\nu \boldsymbol{e}_\mu = D_\nu \boldsymbol{e}_\mu + K^\kappa_{\mu\nu} \boldsymbol{e}_\kappa = K^\kappa_{\mu\nu} \boldsymbol{e}_\kappa . \tag{2.80} $$


However, the torsion-free covariant derivative of the metric, that is, of scalar products of the tangent basis
vectors, does vanish,

$$ \mathring{D}_\nu g_{\lambda\mu} = \mathring{D}_\nu ( \boldsymbol{e}_\lambda \cdot \boldsymbol{e}_\mu ) = K^\kappa_{\lambda\nu} \, \boldsymbol{e}_\kappa \cdot \boldsymbol{e}_\mu + K^\kappa_{\mu\nu} \, \boldsymbol{e}_\lambda \cdot \boldsymbol{e}_\kappa = K_{\mu\lambda\nu} + K_{\lambda\mu\nu} = 0 , \tag{2.81} $$

thanks to the antisymmetry of the contortion tensor in its first two indices. As a corollary, torsion-free
covariant differentiation commutes with the operations of raising and lowering indices, and of contraction.


2.13 Mathematical aside: What if there is no metric?


General relativity is a metric theory. Many of the structures introduced above can be defined mathematically 
without a metric. For example, it is possible to define the tangent space of vectors with basis $\boldsymbol{e}_\mu$, and to
define a dual vector space with basis $\boldsymbol{e}^\mu$ such that $\boldsymbol{e}^\mu \cdot \boldsymbol{e}_\nu = \delta^\mu_\nu$, equation (2.36). Elements of the dual vector
space are called covectors. Similarly it is possible to define connections and covariant derivatives without a 
metric. However, this book follows general relativity in assuming that spacetime has a metric. 


2.14 Coordinate 4-velocity 


Consider a particle following a worldline

$$ x^\mu(\tau) , \tag{2.82} $$

where $\tau$ is the particle’s proper time. The proper time along any interval of the worldline is $d\tau \equiv \sqrt{-ds^2}$.
Define the coordinate 4-velocity $u^\mu$ by

$$ u^\mu \equiv \frac{dx^\mu}{d\tau} \qquad \text{a coordinate 4-vector} . \tag{2.83} $$


The magnitude squared of the 4-velocity is constant

$$ u_\mu u^\mu = g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = \frac{ds^2}{d\tau^2} = - 1 . \tag{2.84} $$

The negative sign arises from the choice of metric signature: with the signature $-+++$ adopted here, there
is a $-$ sign between $ds^2$ and $d\tau^2$. Equation (2.84) can be regarded as an integral of motion associated with
conservation of particle rest mass.
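
The normalization (2.84) can be illustrated in the simplest case. The sketch below is an added example that assumes flat Minkowski space in Cartesian coordinates with $c = 1$, where a particle moving with 3-velocity $\boldsymbol{v}$ has $u^\mu = \gamma \{ 1, \boldsymbol{v} \}$ with $\gamma = 1/\sqrt{1 - v^2}$. The function names are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, signature -+++

def four_velocity(v):
    """u^mu = gamma {1, v} for a 3-velocity v (units c = 1, |v| < 1 assumed)."""
    v_squared = sum(component * component for component in v)
    gamma = 1.0 / math.sqrt(1.0 - v_squared)
    return [gamma] + [gamma * component for component in v]

def norm_squared(u):
    """u_mu u^mu = eta_{mu nu} u^mu u^nu for a diagonal metric."""
    return sum(ETA[i] * u[i] * u[i] for i in range(4))

u = four_velocity([0.3, -0.4, 0.5])
print(norm_squared(u))
```

Whatever 3-velocity is supplied (below the speed of light), the result equals $-1$ up to rounding, in line with the signature convention of the text.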


2.15 Geodesic equation 


Let $\boldsymbol{u} \equiv \boldsymbol{e}_\mu u^\mu$ be the 4-velocity in coordinate-independent notation. The principle of equivalence (which
imposes vanishing torsion) implies that the geodesic equation, the equation of motion of a freely-falling
particle, is

$$ \frac{d \boldsymbol{u}}{d\tau} = 0 . \tag{2.85} $$


Why? Because $d\boldsymbol{u}/d\tau = 0$ in the particle’s own free-fall frame, and the equation is coordinate-independent.
In the particle’s own free-fall frame, the particle’s 4-velocity is $u^\mu = \{1,0,0,0\}$, and the particle’s locally
inertial axes $\boldsymbol{e}_\mu = \{ \boldsymbol{e}_0, \boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3 \}$ are constant.


What does the equation of motion look like in coordinate notation? The acceleration is 


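$$ \begin{aligned}
\frac{d \boldsymbol{u}}{d\tau} &= \frac{dx^\nu}{d\tau} \frac{\partial \boldsymbol{u}}{\partial x^\nu} \\
&= u^\nu \boldsymbol{e}_\kappa D_\nu u^\kappa \\
&= u^\nu \boldsymbol{e}_\kappa \left( \frac{\partial u^\kappa}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} u^\mu \right) \\
&= \boldsymbol{e}_\kappa \left( \frac{d u^\kappa}{d\tau} + \Gamma^\kappa_{\mu\nu} u^\mu u^\nu \right) .
\end{aligned} \tag{2.86} $$

The geodesic equation is then

$$ \frac{d u^\kappa}{d\tau} + \Gamma^\kappa_{\mu\nu} u^\mu u^\nu = 0 . \tag{2.87} $$


Another way of writing the geodesic equation is

$$ \frac{D u^\kappa}{D\tau} = 0 , \tag{2.88} $$

where $D/D\tau$ is the covariant proper time derivative

$$ \frac{D}{D\tau} \equiv u^\mu D_\mu . \tag{2.89} $$


The above derivation of the geodesic equation invoked the principle of equivalence, which postulates that
locally inertial frames exist, and thus that torsion vanishes. What happens if torsion does not vanish? In
Chapter 4, equation (4.15), it will be shown from an action principle that in the presence of torsion, the
covariant derivative in the geodesic equation should simply be replaced by the torsion-free covariant derivative
$\mathring{D}/D\tau \equiv u^\mu \mathring{D}_\mu$,

$$ \frac{\mathring{D} u^\kappa}{D\tau} = 0 . \tag{2.90} $$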


Thus the geodesic motion of particles is unaffected by the presence of torsion. 
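
The component form of the geodesic equation, $du^\kappa / d\tau + \Gamma^\kappa_{\mu\nu} u^\mu u^\nu = 0$, is a set of ordinary differential equations that can be integrated numerically. The sketch below is a purely spatial analogue, added for illustration under stated assumptions: it uses plane polar coordinates $\{r, \phi\}$ in flat Euclidean space, whose only non-zero coefficients are $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r$, with arc length playing the role of proper time. The integrator is a standard fourth-order Runge-Kutta scheme, and the function names are hypothetical. Since the space is flat, the geodesic should come out as a straight line when converted back to Cartesian coordinates.

```python
import math

def geodesic_rhs(state):
    """Right hand side of the geodesic equation in plane polar coordinates.

    state = (r, phi, u_r, u_phi), where u = d(r, phi)/ds.
    """
    r, phi, u_r, u_phi = state
    a_r = r * u_phi * u_phi          # -Gamma^r_{phi phi} u^phi u^phi
    a_phi = -2.0 * u_r * u_phi / r   # -2 Gamma^phi_{r phi} u^r u^phi
    return (u_r, u_phi, a_r, a_phi)

def rk4_step(state, h):
    """One fourth-order Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, s_total, steps):
    """Integrate the geodesic over parameter interval s_total."""
    h = s_total / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at Cartesian (x, y) = (1, 0), moving with unit speed in the +y direction.
start = (1.0, 0.0, 0.0, 1.0)
r_end, phi_end, ur_end, uphi_end = integrate(start, 1.0, 1000)
x_end, y_end = r_end * math.cos(phi_end), r_end * math.sin(phi_end)
speed_squared = ur_end ** 2 + (r_end * uphi_end) ** 2
print(x_end, y_end, speed_squared)
```

The end point lies on the straight line $x = 1$ at $y = 1$, and the squared speed $g_{\mu\nu} u^\mu u^\nu$ stays constant along the path, the Euclidean counterpart of equation (2.84).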


2.16 Coordinate 4-momentum 


The coordinate 4-momentum of a particle of rest mass $m$ is defined to be

$$ p^\mu \equiv m u^\mu = m \frac{dx^\mu}{d\tau} \qquad \text{a coordinate 4-vector} . \tag{2.91} $$


The momentum squared is, from equation (2.84),

$$ p_\mu p^\mu = m^2 u_\mu u^\mu = - m^2 , \tag{2.92} $$


minus the square of the rest mass. Again, the minus sign arises from the choice $-+++$ of metric signature.


2.17 Affine parameter


For photons, the rest mass is zero, $m = 0$, but the 4-momentum $p^\mu$ remains finite. Define the affine
parameter $\lambda$ by

$$ \lambda \equiv \frac{\tau}{m} \qquad \text{a coordinate scalar} \tag{2.93} $$

which remains finite in the limit $m \rightarrow 0$. The affine parameter $\lambda$ is unique up to an overall linear transformation
(that is, $\alpha \lambda + \beta$ is also an affine parameter, for constant $\alpha$ and $\beta$), because of the freedom in the
choice of mass $m$ and the zero point of proper time $\tau$. In terms of the affine parameter, the 4-momentum is

$$ p^\mu = \frac{dx^\mu}{d\lambda} . \tag{2.94} $$

The geodesic equation is then in coordinate-independent notation

$$ \frac{d \boldsymbol{p}}{d\lambda} = 0 , \tag{2.95} $$

or in component form

$$ \frac{d p^\kappa}{d\lambda} + \Gamma^\kappa_{\mu\nu} p^\mu p^\nu = 0 , \tag{2.96} $$

which works for massless as well as massive particles.

Another way of writing this is

$$ \frac{D p^\kappa}{D\lambda} = 0 , \tag{2.97} $$

where $D/D\lambda$ is the covariant affine derivative

$$ \frac{D}{D\lambda} \equiv p^\mu D_\mu . \tag{2.98} $$


In the presence of torsion, the connection in the geodesic equation (2.96) should be interpreted as the
torsion-free connection $\mathring{\Gamma}^\kappa_{\mu\nu}$, and the covariant derivatives in equations (2.97) and (2.98) are torsion-free
covariant derivatives.


2.18 Affine distance 


The freedom in the overall scaling of the affine parameter can be removed by setting it equal to the proper 
distance near the observer in the observer’s locally inertial rest frame. With the scaling fixed in this fashion, 
the affine parameter is called the affine distance, so called because it provides a measure of distance along 
null geodesics. When you look at a scene with your eyes, you are looking along null geodesics, and the natural 
measure of distance to objects that you see is the affine distance (Hamilton and Polhemus, 2010). 

In special relativity, the affine distance coincides with the perceived (e.g. binocular) distance to objects. 


Exercise 2.9. Gravitational redshift in a stationary metric. Let $x^\mu \equiv \{t, x^\alpha\}$ constitute time $t$
and spatial coordinates $x^\alpha$ of a spacetime. The metric $g_{\mu\nu}$ is said to be stationary if it is independent of
the coordinate $t$. A comoving observer in the spacetime is one that is at rest in the spatial coordinates,
$dx^\alpha / d\tau = 0$.

1. Argue that the coordinate 4-velocity $u^\nu \equiv dx^\nu / d\tau$ of a comoving observer in a stationary spacetime is

$$ u^\nu = \{ \gamma, 0, 0, 0 \} , \quad \gamma \equiv \frac{1}{\sqrt{- g_{tt}}} . \tag{2.99} $$

2. Argue that the proper energy $E$ of a particle, massless or massive, with energy-momentum 4-vector $p^\nu$
seen by a comoving observer with 4-velocity $u^\nu$, equation (2.99), is

$$ E = - u^\nu p_\nu . \tag{2.100} $$

3. Consider a particle, massless or massive, that follows a geodesic between two comoving observers. Since
the metric is independent of the time coordinate $t$, the covariant momentum $p_t$ is a constant of motion,
equation (4.50). Argue that the ratio $E_{\rm obs} / E_{\rm em}$ of the observed to emitted energies between two comoving
observers is

$$ \frac{E_{\rm obs}}{E_{\rm em}} = \frac{\gamma_{\rm obs}}{\gamma_{\rm em}} . \tag{2.101} $$

4. Can comoving observers exist where $g_{tt}$ is positive?


Exercise 2.10. Gravitational redshift in Rindler space. Rindler space is Minkowski space expressed in
the coordinates of uniformly accelerating observers, called Rindler observers. Rindler observers are precisely
the observers in the right quadrant of the spacetime wheel, Figure 1.14.

1. Start with Minkowski space in a Cartesian coordinate system $\{t, x, y, z\}$. Define Rindler coordinates $\alpha$, $l$
by

$$ t = l \sinh \alpha , \quad x = l \cosh \alpha . \tag{2.102} $$

Show that the line-element in Rindler coordinates is

$$ ds^2 = - l^2 d\alpha^2 + dl^2 + dy^2 + dz^2 . \tag{2.103} $$

2. A Rindler observer is a comoving observer in Rindler space, one who follows a worldline of constant $l$,
$y$, and $z$. Since Rindler spacetime is stationary, conclude that the ratio $E_{\rm obs} / E_{\rm em}$ of the observed to
emitted energies between two Rindler observers is, equation (2.101),

$$ \frac{E_{\rm obs}}{E_{\rm em}} = \frac{l_{\rm em}}{l_{\rm obs}} . \tag{2.104} $$

3. Can Rindler space be considered equivalent to a spacetime containing a uniform gravitational field? Do
Rindler observers all accelerate at the same rate?


Exercise 2.11. Gravitational redshift in a uniformly rotating space. Start with Minkowski space in
cylindrical coordinates $\{t, r, \phi, z\}$,

$$ ds^2 = - dt^2 + dr^2 + r^2 d\phi^2 + dz^2 . \tag{2.105} $$

Define a uniformly rotating azimuthal angle $\chi$ by

$$ \chi \equiv \phi - \omega t , \tag{2.106} $$

which is constant for observers who are at rest in a system rotating uniformly at angular velocity $\omega$. The
line-element in uniformly rotating coordinates is

$$ ds^2 = - dt^2 + dr^2 + r^2 ( d\chi + \omega \, dt )^2 + dz^2 . \tag{2.107} $$

1. A comoving observer in the uniformly rotating system follows a worldline at constant $r$, $\chi$, and $z$. Since
the uniformly rotating spacetime is stationary, conclude that the ratio $E_{\rm obs} / E_{\rm em}$ of the observed to
emitted energies between two comoving observers is, equation (2.101),

$$ \frac{E_{\rm obs}}{E_{\rm em}} = \frac{\gamma_{\rm obs}}{\gamma_{\rm em}} , \tag{2.108} $$

where

$$ \gamma = \frac{1}{\sqrt{1 - v^2}} , \quad v \equiv \omega r . \tag{2.109} $$

2. What happens where $v > 1$?
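
The redshift formula $E_{\rm obs} / E_{\rm em} = \gamma_{\rm obs} / \gamma_{\rm em}$ with $\gamma = 1/\sqrt{-g_{tt}}$ depends only on the time-time component of a stationary metric, so it is simple to evaluate once that component is known. The following sketch is an added illustration with hypothetical function names. It reads $g_{tt}$ off the line-elements quoted in Exercises 2.10 and 2.11 (for Rindler space the role of $t$ is played by $\alpha$), and it assumes $g_{tt} < 0$ at both observers.

```python
import math

def gamma_comoving(g_tt):
    """gamma = 1 / sqrt(-g_tt) for a comoving observer; assumes g_tt < 0."""
    return 1.0 / math.sqrt(-g_tt)

def energy_ratio(g_tt_obs, g_tt_em):
    """E_obs / E_em = gamma_obs / gamma_em between two comoving observers."""
    return gamma_comoving(g_tt_obs) / gamma_comoving(g_tt_em)

def rindler_g_tt(l):
    """Coefficient of d(alpha)^2 in the Rindler line-element: -l^2."""
    return -l * l

def rotating_g_tt(r, omega):
    """Coefficient of dt^2 in the rotating line-element: -(1 - omega^2 r^2)."""
    return -(1.0 - (omega * r) ** 2)

# Rindler: emitter at l = 1, observer at l = 2.
rindler_ratio = energy_ratio(rindler_g_tt(2.0), rindler_g_tt(1.0))
# Rotating frame with omega = 1: emitter on the axis, observer at r = 0.6.
rotating_ratio = energy_ratio(rotating_g_tt(0.6, 1.0), rotating_g_tt(0.0, 1.0))
print(rindler_ratio, rotating_ratio)
```

In the Rindler case the ratio reduces to $l_{\rm em} / l_{\rm obs}$, and in the rotating case to the Lorentz factor of the observer relative to an emitter on the axis.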


Concept question 2.12. Can Minkowski space rotate? Exercise 2.11 considered Minkowski space in
rotating coordinates. Can Minkowski space rotate globally? Answer. No. General relativity allows arbitrary
choices of coordinates, including choices that allow physical objects to move through the coordinates faster
than light. However, the choice of coordinates does not affect physical observables in any way. The metric
encodes locally inertial frames, determining what intervals are timelike, lightlike, or spacelike ($ds^2$ less than,
equal to, or greater than zero). That locally inertial structure is independent of the choice of coordinates.
Objects cannot move through locally inertial frames faster than light. Thus Minkowski spacetime does not rotate
globally, regardless of the choice of coordinates.


2.19 Riemann tensor 


2.19.1 Riemann curvature tensor 


The Riemann curvature tensor $R_{\kappa\lambda\mu\nu}$ is defined by the commutator of the covariant derivative acting
on a 4-vector. In the presence of torsion,

$$ [D_\kappa , D_\lambda] A_\mu = S^\nu_{\kappa\lambda} D_\nu A_\mu + R_{\kappa\lambda\mu\nu} A^\nu \qquad \text{a coordinate tensor} . \tag{2.110} $$

If torsion vanishes, as general relativity assumes, then the definition (2.110) reduces to

$$ [D_\kappa , D_\lambda] A_\mu = R_{\kappa\lambda\mu\nu} A^\nu \qquad \text{a coordinate tensor} . \tag{2.111} $$


The expression (2.51) for the covariant derivative yields the following formula for the Riemann tensor in
terms of connection coefficients

$$ R_{\kappa\lambda\mu\nu} = \frac{\partial \Gamma_{\mu\nu\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma_{\mu\nu\kappa}}{\partial x^\lambda} + \Gamma^\pi_{\mu\lambda} \Gamma_{\pi\nu\kappa} - \Gamma^\pi_{\mu\kappa} \Gamma_{\pi\nu\lambda} \qquad \text{a coordinate tensor} . \tag{2.112} $$


This is the formula that allows the Riemann tensor to be calculated from the connection coefficients. The
same formula (2.112) remains valid if torsion does not vanish, but the connection coefficients $\Gamma_{\lambda\mu\nu}$ themselves
are given by (2.62) in place of (2.63).
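
As a sanity check of the formula $R_{\kappa\lambda\mu\nu} = \partial \Gamma_{\mu\nu\lambda} / \partial x^\kappa - \partial \Gamma_{\mu\nu\kappa} / \partial x^\lambda + \Gamma^\pi_{\mu\lambda} \Gamma_{\pi\nu\kappa} - \Gamma^\pi_{\mu\kappa} \Gamma_{\pi\nu\lambda}$, the sketch below evaluates it numerically for the unit 2-sphere, $ds^2 = d\theta^2 + \sin^2\!\theta \, d\phi^2$. This is an added illustration under stated assumptions: torsion is taken to vanish, all derivatives are estimated by central differences, the inverse metric is computed assuming a diagonal metric, and the function names are hypothetical. With the index conventions of the text, the single independent component should come out as $R_{\theta\phi\theta\phi} = \sin^2\!\theta$.

```python
import math

N = 2  # coordinates x = (theta, phi) on the unit 2-sphere

def metric(x):
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

def shifted(x, i, h):
    y = list(x)
    y[i] += h
    return y

def gamma_lower(x, h=1e-4):
    """Torsion-free Gamma_{l m n}; metric derivatives by central differences."""
    dg = []  # dg[n][a][b] ~ d g_{ab} / d x^n
    for n in range(N):
        gp, gm = metric(shifted(x, n, h)), metric(shifted(x, n, -h))
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(N)]
                   for a in range(N)])
    return [[[0.5 * (dg[n][l][m] + dg[m][l][n] - dg[l][m][n])
              for n in range(N)] for m in range(N)] for l in range(N)]

def riemann(x, h=1e-4):
    """R[k][l][m][n] = R_{k l m n} from the connection, torsion assumed zero."""
    g = metric(x)
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]  # diagonal metric only
    G = gamma_lower(x)
    # G_up[p][m][n] = Gamma^p_{m n}: first index raised with the inverse metric.
    G_up = [[[sum(g_inv[p][l] * G[l][m][n] for l in range(N))
              for n in range(N)] for m in range(N)] for p in range(N)]
    # dG[k][m][n][l] ~ d Gamma_{m n l} / d x^k
    dG = []
    for k in range(N):
        Gp, Gm = gamma_lower(shifted(x, k, h)), gamma_lower(shifted(x, k, -h))
        dG.append([[[(Gp[m][n][l] - Gm[m][n][l]) / (2 * h)
                     for l in range(N)] for n in range(N)] for m in range(N)])
    return [[[[dG[k][m][n][l] - dG[l][m][n][k]
               + sum(G_up[p][m][l] * G[p][n][k] - G_up[p][m][k] * G[p][n][l]
                     for p in range(N))
               for n in range(N)] for m in range(N)] for l in range(N)]
            for k in range(N)]

theta = 1.0
R = riemann([theta, 0.3])
print(R[0][1][0][1], math.sin(theta) ** 2)
```

The components obtained by swapping the first pair or the last pair of indices come out with the opposite sign, so the output can also be used to check the antisymmetries of the Riemann tensor.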

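In flat (Minkowski) space, covariant derivatives reduce to partial derivatives, $D_\kappa \rightarrow \partial / \partial x^\kappa$, and

$$ [D_\kappa , D_\lambda] \rightarrow \left[ \frac{\partial}{\partial x^\kappa} , \frac{\partial}{\partial x^\lambda} \right] = 0 \qquad \text{in flat space} \tag{2.113} $$


so that $R_{\kappa\lambda\mu\nu} = 0$ in flat space.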


Exercise 2.13. Derivation of the Riemann tensor. Confirm expression (2.112) for the Riemann tensor. 
This is an exercise that any serious student of general relativity should do. However, you might like to defer 
this rite of passage to Chapter 11, where Exercises 11.3—11.6 take you through the derivation and properties 
of the tetrad-frame Riemann tensor. 


2.19.2 Commutator of the covariant derivative acting on a general tensor 


The commutator of the covariant derivative is of fundamental importance because it defines what is meant 
by the field in gauge theories. 

It has been seen above that the commutator of the covariant derivative acting on a scalar defines the torsion
tensor, equation (2.57), which general relativity assumes vanishes, while the commutator of the covariant
derivative acting on a vector defines the Riemann tensor, equation (2.111). Does the commutator of the
covariant derivative acting on a general tensor introduce any other distinct tensor? No: the torsion and
Riemann tensors completely define the action of the commutator of the covariant derivative on any tensor.
Acting on a general tensor, the commutator of the covariant derivative is

$$ [D_\kappa , D_\lambda] A^{\pi\rho\ldots}_{\mu\nu\ldots} = S^\sigma_{\kappa\lambda} D_\sigma A^{\pi\rho\ldots}_{\mu\nu\ldots} + R_{\kappa\lambda\mu}{}^{\sigma} A^{\pi\rho\ldots}_{\sigma\nu\ldots} + R_{\kappa\lambda\nu}{}^{\sigma} A^{\pi\rho\ldots}_{\mu\sigma\ldots} + \ldots - R_{\kappa\lambda\sigma}{}^{\pi} A^{\sigma\rho\ldots}_{\mu\nu\ldots} - R_{\kappa\lambda\sigma}{}^{\rho} A^{\pi\sigma\ldots}_{\mu\nu\ldots} - \ldots . \tag{2.114} $$


In more abstract notation, the commutator of the covariant derivative is the operator

$$ [D_\kappa , D_\lambda] = S^\mu_{\kappa\lambda} D_\mu + \hat{R}_{\kappa\lambda} \tag{2.115} $$


where the Riemann curvature operator $\hat{R}_{\kappa\lambda}$ is an operator whose action on any tensor is specified by equation
(2.114). The action of the operator $\hat{R}_{\kappa\lambda}$ is analogous to that of the covariant derivative (2.52): there’s
a positive $R$ term for each covariant index, and a negative $R$ term for each contravariant index. The action
of $\hat{R}_{\kappa\lambda}$ on a scalar is zero, which reflects the fact that a scalar is unchanged by a Lorentz transformation.

The general expression (2.114) for the commutator of the covariant derivative reveals the meaning of the 
torsion and Riemann tensors. The torsion and Riemann tensors describe respectively the displacement and the 
Lorentz transformation experienced by an object when parallel-transported around a curve. Displacements 
and Lorentz transformations together constitute the Poincaré group, the complete group of symmetries of 
flat spacetime. 

How can an object detect a displacement when parallel-transported around a curve? If you go around 
a curve back to the same coordinate in spacetime where you began, won’t you necessarily be at the same 
position? This is a question that goes to the heart of the meaning of spacetime. To answer the question, you
have to consider how fundamental particles are able to detect position, orientation, and velocity. Classically,
particles may be structureless points, but quantum mechanically, particles possess frequency, wavelength,
spin, and (in the relativistic theory) boost, and presumably it is these properties that allow particles to
“measure” the properties of the spacetime in which they live. For example, a Dirac spinor (relativistic spin-$\tfrac{1}{2}$
particle) Lorentz transforms under the fundamental (spin-$\tfrac{1}{2}$) representation of the Lorentz group, and is
thus endowed with precisely the properties that allow it to “measure” boost and rotation, §14.10. The Dirac
wave equation shows that a Dirac spinor propagating through spacetime varies as $\sim e^{i p_\mu x^\mu}$, whose phase
encodes the displacement of the Dirac spinor. Thus a Dirac spinor could potentially detect a displacement 
through a change in its phase when parallel-transported around a curve back to the same point in spacetime. 
Since a change in phase is indistinguishable from a spatial rotation about the spin axis of the Dirac spinor, 
operationally torsion rotates particles, whence the name torsion. 


2.19.3 No torsion 


In the remainder of this Chapter, torsion will be assumed to vanish, as general relativity postulates. A 
decomposition of the Riemann tensor into torsion-free and contortion parts is deferred to §11.18. 


2.19.4 Symmetries of the Riemann tensor 


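In a locally inertial frame (necessarily, with vanishing torsion), the connection coefficients all vanish, $\Gamma_{\lambda\mu\nu} = 0$,
but their partial derivatives, which are proportional to second derivatives of the metric tensor, equation (2.63),
do not vanish. Thus in a locally inertial frame the Riemann tensor is

$$ \begin{aligned}
R_{\kappa\lambda\mu\nu}
&= \frac{\partial \Gamma_{\mu\nu\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma_{\mu\nu\kappa}}{\partial x^\lambda} \\
&= \frac{1}{2} \left( \frac{\partial^2 g_{\mu\nu}}{\partial x^\kappa \partial x^\lambda} + \frac{\partial^2 g_{\mu\lambda}}{\partial x^\kappa \partial x^\nu} - \frac{\partial^2 g_{\nu\lambda}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\nu}}{\partial x^\lambda \partial x^\kappa} - \frac{\partial^2 g_{\mu\kappa}}{\partial x^\lambda \partial x^\nu} + \frac{\partial^2 g_{\nu\kappa}}{\partial x^\lambda \partial x^\mu} \right) \\
&= \frac{1}{2} \left( \frac{\partial^2 g_{\lambda\mu}}{\partial x^\kappa \partial x^\nu} - \frac{\partial^2 g_{\lambda\nu}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\kappa\mu}}{\partial x^\lambda \partial x^\nu} + \frac{\partial^2 g_{\kappa\nu}}{\partial x^\lambda \partial x^\mu} \right) .
\end{aligned} \tag{2.116} $$


You can check that the bottom line of equation (2.116):

1. is antisymmetric in $\kappa \leftrightarrow \lambda$,
2. is antisymmetric in $\mu \leftrightarrow \nu$,
3. is symmetric in $\kappa\lambda \leftrightarrow \mu\nu$,
4. has the property that the sum of the cyclic permutations of the last three (or first three, or indeed any
three) indices vanishes

$$ R_{\kappa\lambda\mu\nu} + R_{\kappa\nu\lambda\mu} + R_{\kappa\mu\nu\lambda} = 0 . \tag{2.117} $$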


Actually, as shown in Exercise 11.6, the third, symmetric, symmetry is a consequence of the fourth, cyclic
symmetry. The first three of the four symmetries can be expressed compactly

$$ R_{\kappa\lambda\mu\nu} = R_{([\kappa\lambda][\mu\nu])} , \tag{2.118} $$

in which $[\,]$ denotes antisymmetrization and $(\,)$ symmetrization, as in

$$ A_{[\kappa\lambda]} \equiv \tfrac{1}{2} \left( A_{\kappa\lambda} - A_{\lambda\kappa} \right) , \quad A_{(\kappa\lambda)} \equiv \tfrac{1}{2} \left( A_{\kappa\lambda} + A_{\lambda\kappa} \right) . \tag{2.119} $$


The symmetries (2.118) imply that the Riemann tensor is a symmetric matrix of antisymmetric matrices. An
antisymmetric tensor is also known as a bivector, much more about which you can discover in Chapter 13
on the geometric algebra. An antisymmetric matrix, or bivector, in 4 dimensions has 6 degrees of freedom.
A symmetric matrix of bivectors is a $6 \times 6$ symmetric matrix, which has 21 degrees of freedom. The final,
cyclic symmetry of the Riemann tensor, equation (2.117), which can be abbreviated

$$ R_{\kappa[\lambda\mu\nu]} = 0 , \tag{2.120} $$


removes 1 further degree of freedom. Thus the Riemann tensor has a net 20 degrees of freedom. 
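
The counting argument generalizes to any number $n$ of dimensions, and the short sketch below (an added illustration with a hypothetical function name) carries it out step by step: count the bivector components, count the entries of a symmetric matrix of bivectors, then subtract the constraints from the cyclic symmetry. The number of cyclic constraints is taken to be the number of totally antisymmetric combinations of four indices, $\binom{n}{4}$, which equals 1 in 4 dimensions as stated in the text. The result can be compared with the standard closed form $n^2 (n^2 - 1) / 12$.

```python
from math import comb

def riemann_degrees_of_freedom(n):
    """Independent components of the torsion-free Riemann tensor in n dimensions."""
    bivector = n * (n - 1) // 2                  # antisymmetric index pairs
    symmetric_pairs = bivector * (bivector + 1) // 2  # symmetric matrix of bivectors
    cyclic_constraints = comb(n, 4)              # totally antisymmetric part
    return symmetric_pairs - cyclic_constraints

for n in range(2, 7):
    print(n, riemann_degrees_of_freedom(n), n * n * (n * n - 1) // 12)
```

For $n = 4$ the intermediate numbers are 6, 21 and 1, giving the 20 degrees of freedom quoted above.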

Although the above symmetries were derived in a locally inertial frame, the fact that the Riemann tensor 
is a tensor means that the symmetries hold in any frame. If you prefer, you can add back the products of 
connection coefficients in equation (2.112), and check that the claimed symmetries remain. 

Some of the symmetries of the Riemann tensor persist when torsion is present, and others do not. The 
relation between symmetries of the Riemann tensor and torsion is deferred to Exercises 11.4—11.6. 


2.20 Ricci tensor, Ricci scalar 


The Ricci tensor $R_{\kappa\mu}$ and Ricci scalar $R$ are the essentially unique contractions of the Riemann curvature
tensor. The Ricci tensor, the compressive part of the Riemann tensor, is

$$ R_{\kappa\mu} \equiv g^{\lambda\nu} R_{\kappa\lambda\mu\nu} \qquad \text{a coordinate tensor} . \tag{2.121} $$


If torsion vanishes as general relativity assumes, then the Ricci tensor is symmetric

$$ R_{\kappa\mu} = R_{\mu\kappa} \tag{2.122} $$


and therefore has 10 independent components.

The Ricci scalar is

$$ R \equiv g^{\kappa\mu} R_{\kappa\mu} \qquad \text{a coordinate tensor (a scalar)} . \tag{2.123} $$


2.21 Einstein tensor 


The Einstein tensor $G_{\kappa\mu}$ is defined by

$$ G_{\kappa\mu} \equiv R_{\kappa\mu} - \tfrac{1}{2} g_{\kappa\mu} R \qquad \text{a coordinate tensor} . \tag{2.124} $$


For vanishing torsion, the symmetry of the Ricci and metric tensors implies that the Einstein tensor is likewise
symmetric

$$ G_{\kappa\mu} = G_{\mu\kappa} \tag{2.125} $$


and thus has 10 independent components. 
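
The contractions that define the Ricci tensor, Ricci scalar and Einstein tensor are straightforward to carry out explicitly. The sketch below is an added illustration, not taken from the text: as test input it assumes a Riemann tensor of the constant-curvature form $R_{\kappa\lambda\mu\nu} = K ( g_{\kappa\mu} g_{\lambda\nu} - g_{\kappa\nu} g_{\lambda\mu} )$, evaluated at a point where the metric is Minkowski, with an arbitrary constant $K$. For this input the contractions give $R_{\kappa\mu} = 3 K g_{\kappa\mu}$, $R = 12 K$ and $G_{\kappa\mu} = - 3 K g_{\kappa\mu}$ in 4 dimensions. The function names are hypothetical.

```python
def minkowski(n):
    """Metric eta_{mu nu} with signature -++...+; it is its own inverse."""
    return [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def constant_curvature_riemann(g, K):
    """Assumed test input: R_{k l m v} = K (g_{km} g_{lv} - g_{kv} g_{lm})."""
    n = len(g)
    return [[[[K * (g[k][m] * g[l][v] - g[k][v] * g[l][m])
               for v in range(n)] for m in range(n)] for l in range(n)]
            for k in range(n)]

def ricci_tensor(riem, g_inv):
    """R_{k m} = g^{l v} R_{k l m v}: contract the second and fourth indices."""
    n = len(g_inv)
    return [[sum(g_inv[l][v] * riem[k][l][m][v]
                 for l in range(n) for v in range(n))
             for m in range(n)] for k in range(n)]

def ricci_scalar(ric, g_inv):
    """R = g^{k m} R_{k m}."""
    n = len(g_inv)
    return sum(g_inv[k][m] * ric[k][m] for k in range(n) for m in range(n))

def einstein_tensor(ric, g, scalar):
    """G_{k m} = R_{k m} - (1/2) g_{k m} R."""
    n = len(g)
    return [[ric[k][m] - 0.5 * g[k][m] * scalar for m in range(n)]
            for k in range(n)]

eta = minkowski(4)
K = 0.5
riem = constant_curvature_riemann(eta, K)
ric = ricci_tensor(riem, eta)
scalar = ricci_scalar(ric, eta)
G = einstein_tensor(ric, eta, scalar)
print(scalar, [G[i][i] for i in range(4)])
```

Both the Ricci and Einstein tensors come out symmetric, as they must for vanishing torsion, and each is proportional to the metric for this particularly symmetric input.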


2.22 Bianchi identities 


The Jacobi identity

$$ [D_\kappa , [D_\lambda , D_\mu]] + [D_\lambda , [D_\mu , D_\kappa]] + [D_\mu , [D_\kappa , D_\lambda]] = 0 \tag{2.126} $$

implies the Bianchi identities which, for vanishing torsion, are

$$ D_\kappa R_{\lambda\mu\nu\pi} + D_\lambda R_{\mu\kappa\nu\pi} + D_\mu R_{\kappa\lambda\nu\pi} = 0 . \tag{2.127} $$


The torsion-free Bianchi identities can be written in shorthand

$$ D_{[\kappa} R_{\lambda\mu]\nu\pi} = 0 . \tag{2.128} $$


The Bianchi identities constitute a set of differential relations between the components of the Riemann
tensor, which are distinct from the algebraic symmetries of the Riemann tensor. There are 4 ways to pick
$[\kappa\lambda\mu]$, and 6 ways to pick antisymmetric $\nu\pi$, giving $4 \times 6 = 24$ Bianchi identities, but 4 of the identities,
$D_{[\kappa} R_{\lambda\mu\nu]\pi} = 0$, are implied by the cyclic symmetry (2.120), which is a consequence of vanishing torsion.
Thus there are $24 - 4 = 20$ non-trivial torsion-free Bianchi identities on the 20 components of the torsion-free
Riemann tensor.


Exercise 2.14. Jacobi identity. Prove the Jacobi identity (2.126).


2.23 Covariant conservation of the Einstein tensor


The most important consequence of the torsion-free Bianchi identities (2.128) is obtained from the double
contraction


$g^{\kappa\nu} g^{\lambda\pi} \left( D_\kappa R_{\lambda\mu\nu\pi} + D_\lambda R_{\mu\kappa\nu\pi} + D_\mu R_{\kappa\lambda\nu\pi} \right) = - D^\kappa R_{\kappa\mu} - D^\lambda R_{\lambda\mu} + D_\mu R = 0$ , (2.129)

or equivalently

$D^\kappa G_{\kappa\mu} = 0$ , (2.130)


where $G_{\kappa\mu}$ is the Einstein tensor, equation (2.124). Equation (2.130) is a primary motivation for the form
of the Einstein equations, since it implies energy-momentum conservation, equation (2.132). It is worth
remarking that the derivation of the contracted Bianchi identities (3.7) holds in arbitrarily many spacetime
dimensions, so the factor of 1/2 multiplying the Ricci scalar $R$ in the definition (2.124) of the Einstein tensor
holds in arbitrarily many spacetime dimensions, not just 4.


2.24 Einstein equations 


Einstein’s equations are 


$G_{\kappa\mu} = 8\pi G \, T_{\kappa\mu}$    a coordinate tensor equation . (2.131)


What motivates the form of Einstein’s equations? 
1. The equation is generally covariant. 


2. For vanishing torsion, the Bianchi identities (2.128) guarantee covariant conservation of the Einstein 
tensor, equation (2.130), which in turn guarantees covariant conservation of energy-momentum, 


$D^\kappa T_{\kappa\mu} = 0$ . (2.132)


3. The Einstein tensor depends on the lowest (second) order derivatives of the metric tensor that do not 
vanish in a locally inertial frame. 
In Chapter 16, the Einstein equations will be derived from an action principle. Although Einstein derived his 
equations from considerations of theoretical elegance, the real justification for them is that they reproduce 
observation. 

Einstein’s equations (2.131) constitute a complete set of gravitational equations, generalizing Poisson’s 
equation of Newtonian gravity. However, Einstein’s equations by themselves do not constitute a closed set 
of equations: in general, other equations, such as Maxwell’s equations of electromagnetism, and equations 
describing the microphysics of the energy-momentum, must be adjoined to form a closed set. 




Exercise 2.15. Einstein tensor in 3 or more dimensions. What is the Einstein tensor in $N \geq 3$
spacetime dimensions?

Solution. The Einstein tensor must be covariantly conserved to ensure that its source, energy-momentum,
is covariantly conserved. The doubly-contracted Bianchi identities (3.7) hold as long as there are at least 3
spacetime dimensions. In $N = 2$ spacetime dimensions, there are zero Bianchi identities (2.128), since there
are zero ways of picking 3 distinct indices. Thus the expression (2.124) for the Einstein tensor holds in any
number $N \geq 3$ of spacetime dimensions. See §11.19 for general relativity in 2 spacetime dimensions.


2.25 Summary of the path from metric to the energy-momentum tensor 


1. Start by defining the metric $g_{\mu\nu}$.

2. Compute the connection coefficients $\Gamma^\kappa_{\mu\nu}$ from equation (2.63).

3. Compute the Riemann tensor $R_{\kappa\lambda\mu\nu}$ from equation (2.112).

4. Compute the Ricci tensor $R_{\kappa\mu}$ from equation (2.121), the Ricci scalar $R$ from equation (2.123), and the
Einstein tensor $G_{\kappa\mu}$ from equation (2.124).

5. The Einstein equations (2.131) then imply the energy-momentum tensor $T_{\kappa\mu}$.

The path from metric to energy-momentum tensor is straightforward to program on a computer, but 
the results are typically messy and complicated, even for fairly simple spacetimes. Inverting the path to 
recover the metric from a given energy-momentum content is typically highly non-trivial, the subject of a 
vast literature. 

The great majority of metrics $g_{\mu\nu}$ yield an energy-momentum tensor $T_{\kappa\mu}$ that cannot be achieved with
normal matter.
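The path from metric to Einstein tensor can be sketched symbolically. The example below is a minimal illustration, not the author's program: it assumes SymPy as the tool, uses the standard torsion-free formulas for the connection and Riemann tensor (assumed here to agree in sign convention with equations (2.63) and (2.112)), and the function name `einstein_from_metric` is hypothetical.

```python
import sympy as sp

def einstein_from_metric(g, x):
    """Metric g_{mu nu} (sympy Matrix) and coordinates x -> (Ricci, R, Einstein)."""
    n = len(x)
    ginv = g.inv()
    # Step 2: connection coefficients Gam[k][m][v] = Gamma^k_{m v} (torsion-free).
    Gam = [[[sum(ginv[k, l] * (sp.diff(g[l, m], x[v]) + sp.diff(g[l, v], x[m])
                               - sp.diff(g[m, v], x[l])) for l in range(n)) / 2
             for v in range(n)] for m in range(n)] for k in range(n)]

    # Step 3: Riemann tensor R^k_{l m v} (sign convention is an assumption).
    def riemann(k, l, m, v):
        return (sp.diff(Gam[k][v][l], x[m]) - sp.diff(Gam[k][m][l], x[v])
                + sum(Gam[k][m][s] * Gam[s][v][l] - Gam[k][v][s] * Gam[s][m][l]
                      for s in range(n)))

    # Step 4: Ricci tensor, Ricci scalar, and Einstein tensor.
    ricci = sp.Matrix([[sp.simplify(sum(riemann(k, l, k, v) for k in range(n)))
                        for v in range(n)] for l in range(n)])
    R = sp.simplify(sum(ginv[l, v] * ricci[l, v]
                        for l in range(n) for v in range(n)))
    G = (ricci - g * R / 2).applyfunc(sp.simplify)
    return ricci, R, G

# Example 1: a 2-sphere of radius a (illustrative choice of metric).
a, theta, phi = sp.symbols('a theta phi', positive=True)
g_sphere = sp.diag(a**2, a**2 * sp.sin(theta)**2)
_, R_sphere, G_sphere = einstein_from_metric(g_sphere, [theta, phi])

# Example 2: a spatially flat expanding metric with scale factor a(t).
t, X, Y, Z = sp.symbols('t x y z')
scale = sp.Function('a')(t)
g_flat = sp.diag(-1, scale**2, scale**2, scale**2)
_, R_flat, G_flat = einstein_from_metric(g_flat, [t, X, Y, Z])
```

With these conventions the 2-sphere has Ricci scalar $2/a^2$ and an identically vanishing Einstein tensor, while the time-time component of the Einstein tensor of the second metric is $3 (\dot a / a)^2$. Step 5 then amounts to dividing the Einstein tensor by $8\pi G$.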


2.26 Energy-momentum tensor of a perfect fluid 


The simplest non-trivial energy-momentum tensor is that of a perfect fluid. In this case $T^{\mu\nu}$ is taken to be
isotropic in the locally inertial rest frame of the fluid, taking the form


$T^{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}$ , (2.133)

where

$\rho$ is the proper mass-energy density ,
$p$ is the proper pressure . (2.134)




The expression (2.133) is valid only in the locally inertial rest frame of the fluid. An expression that is valid
in any frame is


$T^{\mu\nu} = (\rho + p) u^\mu u^\nu + p \, g^{\mu\nu}$ , (2.135)


where $u^\mu$ is the 4-velocity of the fluid. Equation (2.135) is valid because it is a tensor equation, and it is true
in the locally inertial rest frame, where $u^\mu = \{1, 0, 0, 0\}$.
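The claim that the frame-independent expression (2.135) reduces to the rest-frame form (2.133) can be checked numerically. The sketch below assumes NumPy and a locally inertial frame with inverse metric diag(-1, 1, 1, 1); the function name `perfect_fluid_T` and the numerical values of density, pressure, and velocity are illustrative assumptions only.

```python
import numpy as np

def perfect_fluid_T(rho, p, u, g_inv):
    """T^{mu nu} = (rho + p) u^mu u^nu + p g^{mu nu}, equation (2.135)."""
    u = np.asarray(u, dtype=float)
    return (rho + p) * np.outer(u, u) + p * g_inv

# Inverse Minkowski metric with signature -+++ (locally inertial frame).
eta_inv = np.diag([-1.0, 1.0, 1.0, 1.0])

rho, p = 2.0, 0.5  # illustrative values, units c = 1

# Rest frame of the fluid: u^mu = {1, 0, 0, 0}.
T_rest = perfect_fluid_T(rho, p, [1.0, 0.0, 0.0, 0.0], eta_inv)

# Frame in which the fluid moves at speed v along x: u^mu = gamma {1, v, 0, 0}.
v = 0.6
gamma = 1.0 / np.sqrt(1.0 - v**2)
T_moving = perfect_fluid_T(rho, p, [gamma, gamma * v, 0.0, 0.0], eta_inv)

# The trace g_{mu nu} T^{mu nu} = -rho + 3p is the same in both frames
# (for the Minkowski metric the components of g_{mu nu} equal those of g^{mu nu}).
trace_rest = np.sum(eta_inv * T_rest)
trace_moving = np.sum(eta_inv * T_moving)
```

In the rest frame the result is diag(ρ, p, p, p); in the moving frame the components mix, but the trace stays equal to −ρ + 3p, as expected of a tensor equation.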


2.27 Newtonian limit 


The Newtonian limit is obtained in the limit of a weak gravitational field and non-relativistic (pressureless)
matter. In Cartesian coordinates, the metric in the Newtonian limit is (see Chapter 27)

$ds^2 = - (1 + 2\Phi) dt^2 + (1 - 2\Phi) (dx^2 + dy^2 + dz^2)$ , (2.136)

in which

$\Phi(x, y, z)$ = Newtonian potential (2.137)


is a function only of the spatial coordinates $x$, $y$, $z$, not of time $t$.
For this metric, to first order in the potential $\Phi$ the only non-vanishing component of the Einstein tensor
is the time-time component


$G_{tt} = 2 \nabla^2 \Phi$ , (2.138)


where $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$ is the usual 3-dimensional Laplacian operator. This component (2.138)
of the Einstein tensor plugged into Einstein’s equations (2.131) implies Poisson’s equation (2.4).
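The step from (2.138) to Poisson’s equation can be spelled out as follows. This sketch assumes that, for pressureless matter at rest, the only significant component of the energy-momentum tensor is $T_{tt} = \rho$ to lowest order, and that equation (2.4) has the usual form $\nabla^2 \Phi = 4\pi G \rho$; neither assumption is stated explicitly in this section.

```latex
\begin{align}
G_{tt} &= 8\pi G \, T_{tt}
  && \text{(time-time component of (2.131))} \\
2 \nabla^2 \Phi &= 8\pi G \, \rho
  && \text{(using (2.138) and } T_{tt} = \rho \text{ to lowest order)} \\
\nabla^2 \Phi &= 4\pi G \, \rho
  && \text{(Poisson's equation)}
\end{align}
```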


Exercise 2.16. Special and general relativistic corrections for clocks on satellites. The metric just
above the surface of the Earth is well-approximated by


$ds^2 = - (1 + 2\Phi) dt^2 + (1 - 2\Phi) dr^2 + r^2 (d\theta^2 + \sin^2\!\theta \, d\phi^2)$ , (2.139)

where

$\Phi(r) = - \dfrac{G M}{r}$ (2.140)


is the familiar Newtonian gravitational potential.

1. Proper time. Consider an object at fixed radius $r$, moving along the equator $\theta = \pi/2$ with constant
non-relativistic velocity $r \, d\phi/dt = v$. Compare the proper time of this object with that at rest at infinity.
[Hint: Work to first order in the potential $\Phi$. Regard $v^2$ as first order in $\Phi$. Why is that reasonable?]

2. Orbits. Consider a satellite in orbit about the Earth. The conservation of energy $E$ per unit mass,
angular momentum $L$ per unit mass, and rest mass per unit mass are expressed by (§4.8)


$u_t = -E$ ,  $u_\phi = L$ ,  $u_\mu u^\mu = -1$ . (2.141)

For equatorial orbits, $\theta = \pi/2$, show that the radial component $u^r$ of the 4-velocity satisfies


$u^r = \sqrt{2 (\Delta E - U)}$ , (2.142)


where $\Delta E$ is the energy per unit mass of the particle excluding its rest mass energy,

$\Delta E \equiv E - 1$ , (2.143)


and the effective potential $U$ is


$U \equiv \Phi + \dfrac{L^2}{2 r^2}$ . (2.144)


[Hint: Neglect air resistance. Remember to work to first order in $\Phi$. Treat $\Delta E$ and $L^2$ as first order in
$\Phi$. Why is that reasonable?]

3. Circular orbits. From the condition that the potential $U$ be an extremum, find the circular orbital
velocity $v = r \, d\phi/dt$ of a satellite at radius $r$.

4. Special and general relativistic corrections for satellites. Compare the proper time of a satellite
in circular orbit to that of a person at rest at infinity. Express your answer in the form


$\dfrac{d\tau_{\rm satellite}}{dt} - 1 = - \Phi_\oplus \left( f_{\rm GR} + f_{\rm SR} \right)$ , (2.145)

where $f_{\rm GR}$ and $f_{\rm SR}$ are the general relativistic and special relativistic corrections, and $\Phi_\oplus$ is the
dimensionless gravitational potential at the surface of the Earth,


$\Phi_\oplus = \dfrac{G M_\oplus}{c^2 R_\oplus}$ . (2.146)


What is the value of $\Phi_\oplus$ in milliseconds per year?

5. Special and general relativistic corrections for satellites vs. Earth observer. Compare the
proper time of a satellite in circular orbit to that of a person on Earth at one of the poles (so the person
has no motion from the Earth’s rotation). Express your answer in the form


$\dfrac{d\tau_{\rm satellite}}{dt} - \dfrac{d\tau_{\rm person}}{dt} = \Phi_\oplus \left( f_{\rm GR} + f_{\rm SR} \right)$ . (2.147)


At what satellite radius $r$, in units of Earth radius $R_\oplus$, do the special and general relativistic corrections
cancel?

6. Special and general relativistic corrections for ISS and GPS satellites. What are the corrections
(be careful to get the sign right!) in units of $\Phi_\oplus$, and in units of ms yr$^{-1}$, for (i) a satellite in low Earth
orbit, such as the International Space Station; (ii) a nearly geostationary satellite, such as a GPS
satellite? Google the numbers that you may need.


Exercise 2.17. Equations of motion in weak gravity. Take the metric to be the Newtonian metric (2.136)
with the Newtonian potential $\Phi(x, y, z)$ a function only of the spatial coordinates $x$, $y$, $z$, not of
time $t$, equation (2.137).

1. Confirm that the non-zero connection coefficients are (coefficients as below but with the last two indices
swapped are the same by the no-torsion condition $\Gamma^\kappa_{\mu\nu} = \Gamma^\kappa_{\nu\mu}$)


$\Gamma^t_{t\alpha} = \Gamma^\alpha_{tt} = \Gamma^\alpha_{\beta\beta} = - \Gamma^\beta_{\beta\alpha} = - \Gamma^\alpha_{\alpha\alpha} = \dfrac{\partial \Phi}{\partial x^\alpha}$    ($\alpha \neq \beta = x, y, z$) . (2.148)


[Hint: Work to linear order in $\Phi$.]
2. Consider a massive, non-relativistic particle moving with 4-velocity $u^\mu \equiv dx^\mu/d\tau = \{u^t, u^x, u^y, u^z\}$.
Show that $u_\mu u^\mu = -1$ implies that


$u^t = 1 + \tfrac{1}{2} u^2 - \Phi$ , (2.149)

whereas

$u_t = - \left( 1 + \tfrac{1}{2} u^2 + \Phi \right)$ , (2.150)

where $u \equiv [(u^x)^2 + (u^y)^2 + (u^z)^2]^{1/2}$. One of $u^t$ or $u_t$ is constant. Which one? [Hint: Work to linear
order in $\Phi$. Note that $u^2$ is of linear order in $\Phi$.]


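3. Equation of motion of a massive particle. From the geodesic equation


$\dfrac{du^\kappa}{d\tau} + \Gamma^\kappa_{\mu\nu} u^\mu u^\nu = 0$ (2.151)

show that

$\dfrac{du^\alpha}{dt} = - \dfrac{\partial \Phi}{\partial x^\alpha}$ ,    $\alpha = x, y, z$ . (2.152)


Why is it legitimate to replace $d\tau$ by $dt$? Show further that


$\dfrac{du^t}{dt} = - 2 u^\alpha \dfrac{\partial \Phi}{\partial x^\alpha}$ (2.153)

with implicit summation over $\alpha = x, y, z$. Does the result agree with what you would expect from
equation (2.149)?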

4. For a massless particle, the proper time along a geodesic is zero, and the affine parameter $\lambda$ must be
used instead of the proper time. The 4-velocity of a massless particle can be defined to be (and really
this is just the 4-momentum $p^\mu$ up to an arbitrary overall factor) $v^\mu \equiv dx^\mu/d\lambda = \{v^t, v^x, v^y, v^z\}$. Show
that $v_\mu v^\mu = 0$ implies that


$v^t = v (1 - 2\Phi)$ , (2.154)


whereas


$v_t = - v$ , (2.155)


where $v \equiv [(v^x)^2 + (v^y)^2 + (v^z)^2]^{1/2}$. One of $v^t$ or $v_t$ is constant. Which one?

5. Equation of motion of a massless particle. From the geodesic equation


$\dfrac{dv^\kappa}{d\lambda} + \Gamma^\kappa_{\mu\nu} v^\mu v^\nu = 0$ (2.156)

show that the spatial components $\boldsymbol{v} \equiv \{v^x, v^y, v^z\}$ satisfy

$\dfrac{d\boldsymbol{v}}{d\lambda} = 2 \, \boldsymbol{v} \times (\boldsymbol{v} \times \boldsymbol{\nabla} \Phi)$ , (2.157)


where boldface symbols represent 3D vectors, and in particular $\boldsymbol{\nabla} \Phi$ is the spatial 3D gradient $\boldsymbol{\nabla} \Phi \equiv
\partial\Phi/\partial x^\alpha = \{\partial\Phi/\partial x, \partial\Phi/\partial y, \partial\Phi/\partial z\}$.

Interpret your answer, equation (2.157). In what ways does this equation for the acceleration of photons
differ from the equation governing the acceleration of massive particles? [Hint: Without loss of generality,
the affine parameter can be normalized so that the photon speed is one, $v = 1$, so that $\boldsymbol{v}$ is a unit vector
representing the direction of the photon.]

Consider an observer who happens to be at rest in the Newtonian metric, so that $u^x = u^y = u^z = 0$.
Argue that the energy of a photon observed by this observer, relative to an observer at rest at zero
potential, is


$- u^\mu v_\mu = 1 - \Phi$ . (2.158)


Does the observed photon have higher or lower energy in a deeper potential well?


Exercise 2.18. Deflection of light by the Sun.

1. Consider light that passes by a spherical mass $M$ sufficiently far away that the potential $\Phi$ is always
weak. The potential at distance $r$ from the spherical mass can be approximated by the Newtonian
potential


$\Phi = - \dfrac{G M}{r}$ . (2.159)


Approximate the unperturbed path of light past the mass as a straight line. The plan is to calculate
the deflection as a perturbation to the straight line (physicists call this the Born approximation). For
definiteness, take the light to be moving in the $z$-direction, offset by a constant amount $y$ away from
the mass in the $y$-direction (so $y$ is the impact parameter, or periapsis). Argue that equation (2.157)
becomes


$\dfrac{dv^y}{d\lambda} = v^z \dfrac{dv^y}{dz} = - 2 (v^z)^2 \dfrac{\partial \Phi}{\partial y}$ . (2.160)

Integrate this equation to show that

$\dfrac{\Delta v^y}{v^z} = - \dfrac{4 G M}{y}$ . (2.161)


Argue that this equals the deflection angle $\Delta\phi$.


2. Calculate the predicted deflection angle $\Delta\phi$ in arcseconds for light that just grazes the limb of the Sun.


Exercise 2.19. Shapiro time delay. The three classic tests of general relativity are the gravitational
redshift (Exercise 2.9), the gravitational bending of light around the Sun (Exercise 2.18), and the precession
of Mercury (Exercise 7.9). Shapiro (1964) pointed out a fourth test, that the round-trip time for a light
beam bounced off a planet or spacecraft would be lengthened slightly by the passage of the light through
the gravitational potential of the Sun. The experiment could be done with radio signals, since the Sun does
not overwhelm a radio signal passing near its limb. In Exercise 2.17 you showed that the time component of
the 4-velocity $v^\mu \equiv dx^\mu/d\lambda$ of a massless particle moving through a weak gravitational potential $\Phi$ is (units
$c = 1$)

$v^\mu = \dfrac{dx^\mu}{d\lambda} = \{v^t, \boldsymbol{v}\} = \{1 - 2\Phi, \boldsymbol{v}\}$ , (2.162)

where $\boldsymbol{v}$ is a 3-vector of unit magnitude. Equation (2.162) implies that


$\dfrac{dt}{dl} = 1 - 2\Phi$ , (2.163)


where $dl \equiv |d\boldsymbol{x}|$ is the magnitude of the 3-vector interval $d\boldsymbol{x}$. The Shapiro time delay comes from the $2\Phi$
correction.


Figure 2.6 A person on Earth sends out a radio signal that passes by the Sun, bounces off the planet Venus, and 
returns to Earth. 


1. Time delay. The potential $\Phi$ at distance $r$ from the Sun is

$\Phi = - \dfrac{G M_\odot}{r}$ . (2.164)


Assume that the path of the light can be well-approximated as a straight line, as illustrated in Figure 2.6.
Show that the round-trip time $\Delta t$ is, with units of $c$ restored,

$\Delta t = \dfrac{2}{c} (l_E + l_V) + \dfrac{4 G M_\odot}{c^3} \ln \dfrac{(r_E + l_E)(r_V + l_V)}{b^2}$ , (2.165)


where, as illustrated in Figure 2.6, $r_E$ and $r_V$ are the distances of Earth and Venus from the Sun, $b$ is the
impact parameter, and $l_E$ and $l_V$ are the distances of Earth and Venus from the point of closest approach.
The first term in equation (2.165) is the Newtonian expectation, while the last term in equation (2.165)
is the Shapiro term.

2. Shapiro time delay for the Earth-Venus-Sun system. Evaluate the Shapiro time delay, in milliseconds,
for the Earth-Venus-Sun system when the radio signal just grazes the limb of the Sun,
with $b = R_\odot$. [Hint: The Earth-Sun distance is $r_E = 1.496 \times 10^{11}$ m, while the Venus-Sun distance
is $r_V = 1.082 \times 10^{11}$ m.]

3. Change in the time delay as the planets orbit. Assume that Earth and Venus are in circular orbit
about the Sun (so $r_E$ and $r_V$ are constant). What are the derivatives $dl_E/db$ and $dl_V/db$, in terms of $l_E$,
$l_V$, and $b$? Deduce an expression for $c \, d\Delta t/db$. Identify which is the Newtonian contribution, and which
the Shapiro contribution. Among the terms in the Shapiro contribution, which one term dominates for
small impact parameters, where $b \ll r_E$ and $b \ll r_V$?

4. Relative sizes of Newtonian and Shapiro terms. From your results in part 3, calculate approximately
the relative sizes of the Newtonian and Shapiro contributions to the variation $c \, d\Delta t/db$ of the
time delay when the radio signal just grazes the limb of the Sun, $b = R_\odot$. Comment.


Exercise 2.20. Gravitational lensing. In Exercise 2.18 you found that, in the weak field limit, light
passing a spherical mass $M$ at impact parameter $y$ is deflected by angle


$\Delta\phi = - \dfrac{4 G M}{y c^2}$ . (2.166)


1. Lensing equation. Argue that the deflection angle $\Delta\phi$ is related to the angles $\alpha$ and $\beta$ illustrated in
the lensing diagram in Figure 2.7 by


$\alpha D_S = \beta D_S + \Delta\phi \, D_{LS}$ . (2.167)

Hence or otherwise obtain the “lensing equation” in the form commonly used by astronomers


$\beta = \alpha - \dfrac{\alpha_E^2}{\alpha}$ , (2.168)


where


$\alpha_E = \left( \dfrac{4 G M D_{LS}}{c^2 D_L D_S} \right)^{1/2}$ . (2.169)


Figure 2.7 Lensing diagram.


Figure 2.8 The appearance of a source lensed by a point lens. The lens in this case is a black hole, whose physical size
is the filled circle, and whose apparent (lensed) size is the surrounding unfilled circle. However, any mass, not just a
black hole, will lens a background source.


2. Solutions. Equation (2.168) has two solutions for the apparent angles $\alpha$ in terms of $\beta$. What are they?
Sketch both solutions on a lensing diagram similar to Figure 2.7.

3. Magnification. Figure 2.8 illustrates the appearance of a finite-sized source lensed by a point gravitational
lens. If the source is far from the lens, then the source redshift is unchanged by the gravitational
lensing. But the distortion changes the apparent brightness of the source by a magnification $\mu$ equal to
the ratio of the apparent area of the lensed source to that of the unlensed source. For a small source,
the ratio of areas is

$\mu = \dfrac{\alpha \, d\alpha}{\beta \, d\beta}$ . (2.170)

What is the magnification of a small source in terms of $\alpha$ and $\alpha_E$? When is the magnification largest?

4. Einstein ring around the Sun? The case $\alpha = \alpha_E$ evidently corresponds to the case where the source
is exactly behind the lens, $\beta = 0$. In this case the lensed source appears as an “Einstein ring” of light
around the lens. Could there be an Einstein ring around the Sun, as seen from Earth?

5. Einstein ring around Sgr A*. What is the maximum possible angular size of an Einstein ring around
the $4 \times 10^6 \, M_\odot$ black hole at the center of our Milky Way, 8 kpc away? Might this be observable?


More on the coordinate approach


3.1 Weyl tensor 


The trace-free, or tidal, part of the Riemann curvature tensor defines the Weyl tensor $C_{\kappa\lambda\mu\nu}$


$C_{\kappa\lambda\mu\nu} \equiv R_{\kappa\lambda\mu\nu} - \tfrac{1}{2} \left( g_{\kappa\mu} R_{\lambda\nu} - g_{\kappa\nu} R_{\lambda\mu} + g_{\lambda\nu} R_{\kappa\mu} - g_{\lambda\mu} R_{\kappa\nu} \right) + \tfrac{1}{6} \left( g_{\kappa\mu} g_{\lambda\nu} - g_{\kappa\nu} g_{\lambda\mu} \right) R$    a coordinate tensor . (3.1)


The Weyl tensor is by construction trace-free, meaning that it vanishes on contraction of any two indices,
which is true with or without torsion.

If torsion vanishes as general relativity assumes, then the Weyl tensor has 10 independent components,
which together with the 10 components of the Ricci tensor account for the 20 distinct components of the
Riemann tensor. The Weyl tensor $C_{\kappa\lambda\mu\nu}$ inherits the symmetries (2.118) of the Riemann tensor, which for
vanishing torsion are


$C_{\kappa\lambda\mu\nu} = C_{([\kappa\lambda][\mu\nu])}$ . (3.2)


Whereas the Einstein tensor $G_{\kappa\mu}$ necessarily vanishes in a region of spacetime where there is no
energy-momentum, $T_{\kappa\mu} = 0$, the Weyl tensor does not. The Weyl tensor expresses the presence of tidal gravitational
forces, and of gravitational waves.

If torsion does not vanish, then the Weyl tensor has 20 independent components, which together with the
16 components of the Ricci tensor account for the 36 distinct components of the Riemann tensor with torsion.
The 6 antisymmetric components $G_{[\kappa\mu]}$ of the Einstein tensor vanish if torsion vanishes, and likewise the 10
antisymmetric components $C_{[[\kappa\lambda][\mu\nu]]}$ of the Weyl tensor vanish if torsion vanishes. With or without torsion,
the 10 symmetric components $C_{([\kappa\lambda][\mu\nu])}$ of the Weyl tensor encode gravitational waves that propagate in
empty space.


Exercise 3.1. Weyl tensor in arbitrary dimensions. What is the Weyl tensor in $N$ spacetime dimensions?


Solution. The Weyl tensor is the trace-free part of the Riemann tensor. In $N$ spacetime dimensions it is
given by the same expression (3.1) but with different coefficients,


$C_{\kappa\lambda\mu\nu} = R_{\kappa\lambda\mu\nu} - \dfrac{1}{N-2} \left( g_{\kappa\mu} R_{\lambda\nu} - g_{\kappa\nu} R_{\lambda\mu} + g_{\lambda\nu} R_{\kappa\mu} - g_{\lambda\mu} R_{\kappa\nu} \right) + \dfrac{1}{(N-1)(N-2)} \left( g_{\kappa\mu} g_{\lambda\nu} - g_{\kappa\nu} g_{\lambda\mu} \right) R$ . (3.3)


The Weyl tensor vanishes identically in $N = 2$ and 3 spacetime dimensions.


Exercise 3.2. Number of components of the Riemann, Ricci, and Weyl tensors in arbitrary
dimensions. How many components do the Riemann, Ricci, and Weyl tensors have in $N$ spacetime dimensions?

Solution. The number of components depends on the total number $N$ of spacetime dimensions, regardless
of how many of those dimensions are timelike or spacelike. With torsion, the Riemann tensor is a matrix of
bivectors. If torsion vanishes, the cyclic symmetry (2.120) imposes $\tfrac{1}{6} N^2 (N-1)(N-2)$ conditions. Thus the
number of components of the Riemann tensor with and without torsion is


Riemann torsion-full: $\left[ \tfrac{1}{2} N (N-1) \right]^2$ , (3.4a)
Riemann torsion-free: $\tfrac{1}{12} (N+1) N^2 (N-1)$ . (3.4b)


The Ricci tensor is the trace-full part of the Riemann tensor. In $N \geq 3$ spacetime dimensions, the Ricci
tensor with torsion is a matrix of vectors, and without torsion is a symmetric matrix of vectors. Thus the
number of components of the Ricci tensor with and without torsion is

Ricci torsion-full: $N^2$ , (3.5a)
Ricci torsion-free: $\tfrac{1}{2} (N+1) N$ . (3.5b)

The Weyl tensor is the trace-free part of the Riemann tensor. The number of Weyl components is the
difference between the number of Riemann and Ricci components, which with and without torsion is, in
$N \geq 3$ spacetime dimensions,

Weyl torsion-full: $\tfrac{1}{4} (N+1) N^2 (N-3)$ , (3.6a)
Weyl torsion-free: $\tfrac{1}{12} (N+2)(N+1) N (N-3)$ . (3.6b)

Equations (3.5) and (3.6) hold only for $N \geq 3$. For $N = 2$, the Riemann tensor has 1 component, the Ricci
tensor 1 component, and the Weyl tensor 0 components, equation (11.92).
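The counting in this solution is easy to tabulate. The short sketch below builds the counts from the steps described above (bivectors, cyclic conditions, and the Riemann minus Ricci difference) and compares them with the closed forms (3.4) and (3.6); the function name `tensor_component_counts` and the dictionary keys are hypothetical labels chosen for illustration.

```python
def tensor_component_counts(N):
    """Independent components of the Riemann, Ricci and Weyl tensors for N >= 3."""
    if N < 3:
        raise ValueError("equations (3.5) and (3.6) hold only for N >= 3")
    bivectors = N * (N - 1) // 2                 # components of one bivector
    cyclic = N * N * (N - 1) * (N - 2) // 6      # conditions from (2.120)
    riemann_full = bivectors ** 2                # matrix of bivectors, (3.4a)
    riemann_free = riemann_full - cyclic         # torsion-free, (3.4b)
    ricci_full = N * N                           # (3.5a)
    ricci_free = N * (N + 1) // 2                # (3.5b)
    return {
        "riemann_torsion_full": riemann_full,
        "riemann_torsion_free": riemann_free,
        "ricci_torsion_full": ricci_full,
        "ricci_torsion_free": ricci_free,
        "weyl_torsion_full": riemann_full - ricci_full,   # (3.6a)
        "weyl_torsion_free": riemann_free - ricci_free,   # (3.6b)
    }

def closed_forms(N):
    """Closed-form expressions (3.4b), (3.6a), (3.6b) for comparison."""
    return {
        "riemann_torsion_free": (N + 1) * N * N * (N - 1) // 12,
        "weyl_torsion_full": (N + 1) * N * N * (N - 3) // 4,
        "weyl_torsion_free": (N + 2) * (N + 1) * N * (N - 3) // 12,
    }

counts_4d = tensor_component_counts(4)
```

For $N = 4$ this reproduces the numbers quoted in §3.1: 36 and 20 Riemann components, 16 and 10 Ricci components, and 20 and 10 Weyl components, with and without torsion respectively.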


3.2 Evolution equations for the Weyl tensor, and gravitational waves 


This section shows how the evolution equations for the Weyl tensor resemble Maxwell’s equations for the 
electromagnetic field, and how the Weyl tensor encodes gravitational waves. In this section, torsion is taken 
to vanish, as general relativity assumes. 

Contracted on one index, the torsion-free Bianchi identities (2.127) are


$D_{[\kappa} R_{\lambda\mu]\nu}{}^{\kappa} = D_\kappa R_{\lambda\mu\nu}{}^{\kappa} + D_\lambda R_{\mu\nu} - D_\mu R_{\lambda\nu} = 0$ . (3.7)

In 4-dimensional spacetime, there are 20 such independent contracted identities, consisting of 4 trace identities
obtained by contracting over $\lambda\nu$, and 16 trace-free identities. Since this is the same as the number of
independent torsion-free Bianchi identities, it follows that the contracted Bianchi identities (3.7) are equivalent
to the full set of Bianchi identities (2.128). An explicit expression for the Bianchi identities in terms of
the contracted Bianchi identities is, in 4-dimensions (in 5 or higher dimensions there are additional terms),

DieRyyy” = (18 bp dg doa + 9 6707.55 a) DwRpo” (4D spacetime) . (3.8) 


If the Riemann tensor is separated into its trace (Ricci) and traceless (Weyl) parts, equation (3.1), then the
contracted Bianchi identities (3.7) become the Weyl evolution equations


$D^\kappa C_{\kappa\lambda\mu\nu} = J_{\lambda\mu\nu}$ , (3.9)

where $J_{\lambda\mu\nu}$ is the Weyl current

$J_{\lambda\mu\nu} \equiv \tfrac{1}{2} \left( D_\mu G_{\lambda\nu} - D_\nu G_{\lambda\mu} \right) - \tfrac{1}{6} \left( g_{\lambda\nu} D_\mu G - g_{\lambda\mu} D_\nu G \right)$ . (3.10)


The Weyl evolution equations (3.9) can be regarded as the gravitational analogue of Maxwell’s equations of
electromagnetism.

The Weyl current $J_{\lambda\mu\nu}$ is a vector of bivectors, which would suggest that it has $4 \times 6 = 24$ components,
but it loses 4 of those components because of the cyclic identity (2.117), valid for vanishing torsion, which
implies the cyclic symmetry


$J_{[\lambda\mu\nu]} = 0$ . (3.11)


Thus the torsion-free Weyl current $J_{\lambda\mu\nu}$ has 20 independent components, in agreement with the above
assertion that there are 20 independent torsion-free contracted Bianchi identities. Since the Weyl tensor is
traceless, contracting the Weyl evolution equations (3.9) on $\lambda\mu$ yields zero on the left hand side, so that the
contracted Weyl current satisfies


$J^{\lambda}{}_{\lambda\nu} = 0$ . (3.12)


This doubly-contracted Bianchi identity, which is the same as equation (2.130), enforces conservation of
energy-momentum. Unlike the cyclic symmetry (3.11), which follows from the cyclic symmetry of the Riemann
tensor and is not a differential condition on the Riemann tensor, equations (3.12) constitute a
non-trivial set of 4 differential conditions on the Einstein tensor. Besides the algebraic relations (3.11) and (3.12),
the Weyl current satisfies 6 differential identities comprising the conservation law


$D^\lambda J_{\lambda\mu\nu} = 0$ (3.13)


in view of equation (3.9) and the antisymmetry of $C_{\kappa\lambda\mu\nu}$ with respect to the indices $\kappa\lambda$. The Weyl current
conservation law (3.13) follows from the form (3.10) of the Weyl current, coupled with covariant conservation
of the Einstein tensor, equation (2.130), so does not impose any additional non-trivial conditions on the
Riemann tensor. The Weyl current conservation law (3.13) is the gravitational analogue of the conservation
law for electric current that follows from Maxwell’s equations.

Whereas the Einstein equations relating the Einstein tensor to the energy-momentum tensor are postulated
equations of general relativity, the evolution equations (3.9) for the Weyl tensor, and the equations enforcing
covariant conservation of the Einstein tensor, follow mathematically from the Bianchi identities, and do not
represent additional assumptions of the theory.


Exercise 3.3. Number of Bianchi identities. Confirm the counting of degrees of freedom. 


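Exercise 3.4. Wave equation for the Riemann and Weyl tensors. From the torsion-free Bianchi
identities (2.128) and (3.7), show that the torsion-free Riemann tensor satisfies the covariant wave equation


$\Box R_{\kappa\lambda\mu\nu} = D_\kappa D_\mu R_{\lambda\nu} - D_\kappa D_\nu R_{\lambda\mu} + D_\lambda D_\nu R_{\kappa\mu} - D_\lambda D_\mu R_{\kappa\nu}$ , (3.14)


where $\Box$ is the D’Alembertian operator, the 4-dimensional wave operator


$\Box \equiv D^\pi D_\pi$ . (3.15)


Show that contracting equation (3.14) with $g^{\lambda\nu}$ yields the identity $\Box R_{\kappa\mu} = \Box R_{\kappa\mu}$. Conclude that the wave
equation (3.14) is non-trivial only for the trace-free part of the Riemann tensor, the Weyl tensor $C_{\kappa\lambda\mu\nu}$.
Show that the wave equation for the Weyl tensor is


$\begin{aligned}
\Box C_{\kappa\lambda\mu\nu} = {} & (D_\kappa D_\mu - \tfrac{1}{2} g_{\kappa\mu} \Box) R_{\lambda\nu} - (D_\kappa D_\nu - \tfrac{1}{2} g_{\kappa\nu} \Box) R_{\lambda\mu} \\
& + (D_\lambda D_\nu - \tfrac{1}{2} g_{\lambda\nu} \Box) R_{\kappa\mu} - (D_\lambda D_\mu - \tfrac{1}{2} g_{\lambda\mu} \Box) R_{\kappa\nu} \\
& + \tfrac{1}{6} ( g_{\kappa\mu} g_{\lambda\nu} - g_{\kappa\nu} g_{\lambda\mu} ) \Box R
\end{aligned}$ . (3.16)


Conclude that in a vacuum, where $R_{\kappa\mu} = 0$,


$\Box C_{\kappa\lambda\mu\nu} = 0$ . (3.17)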


3.3 Geodesic deviation 


This section on geodesic deviation is included not because the equation of geodesic deviation is crucial to 
everyday calculations in general relativity, but rather for two reasons. First, the equation offers insight into 
the physical meaning of the Riemann tensor. Second, the derivation of the equation offers a fine illustration 
of the fact that in general relativity, whenever you take differences at infinitesimally separated points in 
space or time, you should always take covariant differences.

Consider two objects that are free-falling along two infinitesimally separated geodesics. In flat space the 
acceleration between the two objects would be zero, but in curved space the curvature induces a finite 
acceleration between the two objects. This is how an observer can measure curvature, at least in principle: 
set up an ensemble of objects initially at rest a small distance away from the observer in the observer’s 
locally inertial frame, and watch how the objects begin to move. The equation (3.24) that describes this 
acceleration between objects an infinitesimal distance apart is called the equation of geodesic deviation. 
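As a concrete sketch of this measurement, consider the Newtonian limit. Assume the weak-field metric (2.136), objects initially at rest so that $u^\nu = \{1, 0, 0, 0\}$ to lowest order, and the standard linearized result $R_{\beta t \alpha t} = \partial^2 \Phi / \partial x^\alpha \partial x^\beta$ (quoted here as an assumption, not derived in this section). Under those assumptions the equation of geodesic deviation (3.24) reduces to the familiar Newtonian tidal acceleration:

```latex
\begin{align}
\frac{D^2 \delta x_\mu}{D\tau^2} + R_{\kappa\lambda\mu\nu} \, \delta x^\kappa u^\lambda u^\nu &= 0
  && \text{(equation (3.24))} \\
\frac{d^2 \delta x_\alpha}{dt^2} + R_{\beta t \alpha t} \, \delta x^\beta &= 0
  && \text{(} u^\nu = \{1, 0, 0, 0\} \text{, lowest order in } \Phi \text{)} \\
\frac{d^2 \delta x_\alpha}{dt^2} &= - \frac{\partial^2 \Phi}{\partial x^\alpha \partial x^\beta} \, \delta x^\beta
  && \text{(Newtonian tidal acceleration)}
\end{align}
```

For a point mass, $\Phi = -GM/r$, the last line gives stretching along the radial direction and squeezing in the transverse directions.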
The covariant difference in the velocities of two objects an infinitesimal distance $\delta x^\mu$ apart is


$\dfrac{D \delta x^\mu}{D\tau} = \delta u^\mu$ . (3.18)


In general relativity, the ordinary difference between vectors at two points a small interval apart is not
a physically meaningful thing, because the frames of reference at the two points are different. The only
physically meaningful difference is the covariant difference, which is the difference in the two vectors
parallel-transported across the gap between them. It is only this covariant difference that is independent of the frame
of reference. On the left hand side of equation (3.18), the proper time derivative must be the covariant proper
time derivative, $D/D\tau = u^\lambda D_\lambda$. On the right hand side of equation (3.18), the difference in the 4-velocity
at two points $\delta x^\kappa$ apart must be the covariant difference $\delta = \delta x^\kappa D_\kappa$. Thus equation (3.18) means explicitly
the covariant equation


$u^\lambda D_\lambda \delta x^\mu = \delta x^\kappa D_\kappa u^\mu$ . (3.19)


To derive the equation of geodesic deviation, first vary the geodesic equation $D u_\mu / D\tau = 0$ (the index $\mu$ is
put downstairs so that the final equation (3.24) looks cosmetically better, but of course since everything is
covariant the $\mu$ index could just as well be put upstairs everywhere):


$\begin{aligned}
0 &= \delta \dfrac{D u_\mu}{D\tau} \\
  &= \delta x^\kappa D_\kappa ( u^\lambda D_\lambda u_\mu ) \\
  &= \delta u^\lambda D_\lambda u_\mu + \delta x^\kappa u^\lambda D_\kappa D_\lambda u_\mu
\end{aligned}$ . (3.20)


On the second line, the covariant difference $\delta$ between quantities a small distance $\delta x^\kappa$ apart has been set
equal to $\delta x^\kappa D_\kappa$, while $D/D\tau$ has been set equal to the covariant time derivative $u^\lambda D_\lambda$ along the geodesic.
On the last line, $\delta x^\kappa D_\kappa u^\lambda$ has been replaced by $\delta u^\lambda$. Next, consider the covariant acceleration of the interval
$\delta x_\mu$, which is the covariant proper time derivative of the covariant velocity difference $\delta u_\mu$:


$\begin{aligned}
\dfrac{D^2 \delta x_\mu}{D\tau^2} &= \dfrac{D \delta u_\mu}{D\tau} \\
  &= u^\lambda D_\lambda ( \delta x^\kappa D_\kappa u_\mu ) \\
  &= \delta u^\kappa D_\kappa u_\mu + \delta x^\kappa u^\lambda D_\lambda D_\kappa u_\mu
\end{aligned}$ . (3.21)


As in the previous equation (3.20), on the second line $D/D\tau$ has been set equal to $u^\lambda D_\lambda$, while $\delta$ has been
set equal to $\delta x^\kappa D_\kappa$. On the last line, $u^\lambda D_\lambda \delta x^\kappa$ has been set equal to $\delta u^\kappa$, equation (3.19). Subtracting (3.20)
from (3.21) gives

$\dfrac{D^2 \delta x_\mu}{D\tau^2} = \delta x^\kappa u^\lambda [D_\lambda, D_\kappa] u_\mu$ , (3.22)


or equivalently

$\dfrac{D^2 \delta x_\mu}{D\tau^2} + S^\nu_{\kappa\lambda} \, \delta x^\kappa u^\lambda D_\nu u_\mu + R_{\kappa\lambda\mu\nu} \, \delta x^\kappa u^\lambda u^\nu = 0$ . (3.23)

If torsion vanishes as general relativity assumes, then


$\dfrac{D^2 \delta x_\mu}{D\tau^2} + R_{\kappa\lambda\mu\nu} \, \delta x^\kappa u^\lambda u^\nu = 0$ , (3.24)


which is the desired equation of geodesic deviation.


Action principle for point particles


This Chapter describes the action principle for point particles in a prescribed gravitational field. The action 
principle provides a powerful way to obtain equations of motion for particles in a given spacetime, such as 
a black hole, or a cosmological spacetime. An action principle for the gravitational field itself is deferred to 
Chapter 16, after development of the tetrad formalism in Chapter 11. 

Hamilton’s principle of least action postulates that any dynamical system is characterized by a scalar 
action S, which has the property that when the system evolves from one specified state to another, the path 
by which it gets between the two states is such as to minimize the action. The action need not be a global 
minimum, just a local minimum with respect to small variations in the path between fixed initial and final 
states. 

That nature appears to respect a principle of such simplicity and power is quite remarkable, and a deep 
mystery. But it works, and in modern physics, the principle of least action has become a basic building block 
with which physicists construct theories. 

From a practical perspective, the principle of least action, in either Lagrangian or Hamiltonian form, 
provides the most powerful way to solve equations of motion. For example, integrals of motion associated 
with symmetries of the spacetime emerge automatically in the Lagrangian or Hamiltonian formalisms. 


4.1 Principle of least action for point particles 


The path of a point particle through spacetime is specified by its coordinates $x^\mu(\lambda)$ as a function of some
arbitrary parameter $\lambda$. In non-relativistic mechanics it is usual to take the parameter $\lambda$ to be the time $t$, and
the path of a particle through space is then specified by three spatial coordinates $x^\alpha(t)$. In relativity however
it is more natural to treat the time and space coordinates on an equal footing, and to regard the path of a
particle as being specified by four spacetime coordinates $x^\mu(\lambda)$ as a function of an arbitrary parameter $\lambda$, as
illustrated in Figure 4.1. The parameter $\lambda$ is simply a differentiable parameter that labels points along the
path, and has no physical significance (for example, it is not necessarily an affine parameter).

The path of a system of $N$ point particles through spacetime is specified by $4N$ coordinates $x^\mu(\lambda)$. The
action principle postulates that, for a system of $N$ point particles, the action $S$ is an integral of a Lagrangian
$L(x^\mu, dx^\mu/d\lambda)$ which is a function of the $4N$ coordinates $x^\mu(\lambda)$ together with the $4N$ velocities $dx^\mu/d\lambda$ with
respect to the arbitrary parameter $\lambda$. The action from an initial state at $\lambda_i$ to a final state at $\lambda_f$ is thus

$$ S = \int_{\lambda_i}^{\lambda_f} L \, d\lambda . \tag{4.1} $$

Figure 4.1 The action principle considers various paths through spacetime between fixed initial and final conditions,
and chooses that path that minimizes the action.


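The principle of least action demands that the actual path taken by the system between given initial and
final coordinates $x^\mu_i$ and $x^\mu_f$ is such as to minimize the action. Thus the variation $\delta S$ of the action must be
zero under any change $\delta x^\mu$ in the path, subject to the constraint that the coordinates at the endpoints are
fixed, $\delta x^\mu_i = 0$ and $\delta x^\mu_f = 0$,

$$ \delta S = \int_{\lambda_i}^{\lambda_f} \left( \frac{\partial L}{\partial x^\mu} \, \delta x^\mu + \frac{\partial L}{\partial (dx^\mu/d\lambda)} \, \delta (dx^\mu/d\lambda) \right) d\lambda = 0 . \tag{4.2} $$

Linearity of the derivative,

$$ \frac{d}{d\lambda} \left( x^\mu + \delta x^\mu \right) = \frac{dx^\mu}{d\lambda} + \frac{d(\delta x^\mu)}{d\lambda} , \tag{4.3} $$

shows that the change in the velocity along the path equals the velocity of the change, $\delta(dx^\mu/d\lambda) =
d(\delta x^\mu)/d\lambda$. Integrating the second term in the integrand of equation (4.2) by parts yields

$$ \delta S = \left[ \frac{\partial L}{\partial (dx^\mu/d\lambda)} \, \delta x^\mu \right]_{\lambda_i}^{\lambda_f} + \int_{\lambda_i}^{\lambda_f} \left( \frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda} \frac{\partial L}{\partial (dx^\mu/d\lambda)} \right) \delta x^\mu \, d\lambda = 0 . \tag{4.4} $$

The surface term in equation (4.4) vanishes, since by hypothesis the coordinates are held fixed at the
endpoints, so $\delta x^\mu = 0$ at the endpoints. Therefore the integral in equation (4.4) must vanish. Indeed least
action requires the integral to vanish for all possible variations $\delta x^\mu$ in the path. The only way this can happen
is that the integrand must be identically zero. The result is the Euler-Lagrange equations of motion

$$ \frac{d}{d\lambda} \frac{\partial L}{\partial (dx^\mu/d\lambda)} - \frac{\partial L}{\partial x^\mu} = 0 . \tag{4.5} $$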


It might seem that the Euler-Lagrange equations (4.5) are inadequately specified, since they depend on
some arbitrary unknown parameter $\lambda$. But in fact the Euler-Lagrange equations are the same regardless of
the choice of $\lambda$. An example of the arbitrariness of $\lambda$ will be seen in §4.3. Since $\lambda$ can be chosen arbitrarily,
it is common to choose it in some convenient fashion. For a massive particle, $\lambda$ can be taken equal to the
proper time $\tau$ of the particle. For a massless particle, whose proper time never progresses, $\lambda$ can be taken
equal to an affine parameter.


Concept question 4.1. Redundant time coordinates? How can it be possible to treat the time
coordinate $t$ for each particle as an independent coordinate? Isn’t the time coordinate $t$ the same for all $N$
particles? Answer. Different particles follow different trajectories in spacetime. One is free to choose $t(\lambda)$
to be a different function of the parameter $\lambda$ for each particle, in the same way that the spatial coordinate
$x^\alpha(\lambda)$ may be a different function for each particle.


4.2 Generalized momentum 


The left hand side of the Euler-Lagrange equations of motion (4.5) involves the partial derivative of the
Lagrangian with respect to the velocity $dx^\mu/d\lambda$. This quantity plays a fundamental role in the Hamiltonian
formulation of the action principle, §4.10, and is called the generalized momentum $\pi_\mu$ conjugate to the
coordinate $x^\mu$,

$$ \pi_\mu \equiv \frac{\partial L}{\partial (dx^\mu/d\lambda)} . \tag{4.6} $$


4.3 Lagrangian for a test particle 


According to the principle of equivalence, a test particle in a gravitating system moves along a geodesic, a
straight line relative to local free-falling frames. A geodesic is the shortest distance between two points. In
relativity this translates, for a massive particle, into the longest proper time between two points. The proper
time along any path is $d\tau = \sqrt{-ds^2} = \sqrt{-g_{\mu\nu} \, dx^\mu dx^\nu}$. Thus the action $S_m$ of a test particle of constant rest
mass $m$ in a gravitating system is

$$ S_m = -m \int_{\lambda_i}^{\lambda_f} d\tau = -m \int_{\lambda_i}^{\lambda_f} \sqrt{-g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}} \, d\lambda . \tag{4.7} $$

The factor of rest mass $m$ brings the action, which has units of angular momentum, to standard normalization.
The overall minus sign comes from the fact that the action is a minimum whereas the proper time is a
maximum along the path. The action principle requires that the Lagrangian $L(x^\mu, dx^\mu/d\lambda)$ be written as a
function of the coordinates $x^\mu$ and velocities $dx^\mu/d\lambda$, and it is seen that the integrand in the last expression
of equation (4.7) has the desired form, the metric $g_{\mu\nu}$ being considered a given function of the coordinates.
Thus the Lagrangian $L_m$ of a test particle of mass $m$ is

$$ L_m = -m \sqrt{-g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}} . \tag{4.8} $$


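The partial derivatives that go in the Euler-Lagrange equations (4.5) are then

$$ \frac{\partial L_m}{\partial (dx^\kappa/d\lambda)} = m \, \frac{g_{\kappa\nu} \dfrac{dx^\nu}{d\lambda}}{\sqrt{-g_{\pi\rho} (dx^\pi/d\lambda)(dx^\rho/d\lambda)}} , \tag{4.9a} $$

$$ \frac{\partial L_m}{\partial x^\kappa} = m \, \frac{\dfrac{1}{2} \dfrac{\partial g_{\mu\nu}}{\partial x^\kappa} \dfrac{dx^\mu}{d\lambda} \dfrac{dx^\nu}{d\lambda}}{\sqrt{-g_{\pi\rho} (dx^\pi/d\lambda)(dx^\rho/d\lambda)}} . \tag{4.9b} $$

The denominators in the expressions (4.9) for the partial derivatives of the Lagrangian are
$\sqrt{-g_{\pi\rho} (dx^\pi/d\lambda)(dx^\rho/d\lambda)} = d\tau/d\lambda$. It was not legitimate to make this substitution before taking the partial
derivatives, since the Euler-Lagrange equations require that the Lagrangian be expressed in terms of $x^\mu$ and
$dx^\mu/d\lambda$, but it is fine to make the substitution now that the partial derivatives have been obtained. The
partial derivatives (4.9) thus simplify to

$$ \frac{\partial L_m}{\partial (dx^\kappa/d\lambda)} = m \, g_{\kappa\nu} \frac{dx^\nu}{d\lambda} \frac{d\lambda}{d\tau} = m \, g_{\kappa\nu} u^\nu = m u_\kappa , \tag{4.10a} $$

$$ \frac{\partial L_m}{\partial x^\kappa} = m \, \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\kappa} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} \frac{d\lambda}{d\tau} = m \, \Gamma_{\mu\nu\kappa} u^\mu u^\nu \frac{d\tau}{d\lambda} , \tag{4.10b} $$

in which $u^\mu \equiv dx^\mu/d\tau$ is the usual 4-velocity, and the derivative of the metric has been replaced by connections
in accordance with equation (2.59). The generalized momentum $\pi_\kappa$, equation (4.6), of the test particle
coincides with its ordinary momentum $p_\kappa$:

$$ \pi_\kappa = p_\kappa \equiv m u_\kappa . \tag{4.11} $$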
The resulting Euler-Lagrange equations of motion (4.5) are

$$ \frac{d (m u_\kappa)}{d\lambda} = m \, \Gamma_{\mu\nu\kappa} u^\mu u^\nu \frac{d\tau}{d\lambda} . \tag{4.12} $$

As remarked in §4.1, the choice of the arbitrary parameter $\lambda$ has no effect on the equations of motion. With
a factor of $m \, d\tau/d\lambda$ cancelled, equation (4.12) becomes

$$ \frac{du_\kappa}{d\tau} = \Gamma_{\mu\nu\kappa} u^\mu u^\nu . \tag{4.13} $$

Splitting the connection $\Gamma_{\mu\nu\kappa}$ into its torsion-free part $\mathring{\Gamma}_{\mu\nu\kappa}$ and the contortion $K_{\mu\nu\kappa}$, equation (2.64), gives

$$ \frac{du_\kappa}{d\tau} = \left( \mathring{\Gamma}_{\mu\nu\kappa} + K_{\mu\nu\kappa} \right) u^\mu u^\nu = \mathring{\Gamma}_{\mu\kappa\nu} u^\mu u^\nu , \tag{4.14} $$

where the last step follows from the symmetry of the torsion-free connection $\mathring{\Gamma}_{\mu\nu\kappa}$ in its last two indices,
and the antisymmetry of the contortion tensor $K_{\mu\nu\kappa}$ in its first two indices. With or without torsion,
equation (4.14) yields the torsion-free geodesic equation of motion,

$$ \frac{D u_\kappa}{D\tau} = 0 . \tag{4.15} $$

Equation (4.15) shows that the presence of torsion does not affect the geodesic motion of particles.
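The statement at the start of this section, that a geodesic of a massive particle is the path of longest proper time between two events, can be illustrated numerically. The following is a minimal sketch, assuming flat 1+1 dimensional spacetime with $c = 1$; the function names `proper_time` and `bent_path` and the sinusoidal trial paths are illustrative choices, not part of the text. The action (4.7) along each path is $-m$ times the proper time computed here.

```python
import math

def proper_time(path):
    """Proper time along a piecewise-linear timelike path.

    path is a list of (t, x) events in flat 1+1 spacetime with c = 1.
    Each segment contributes sqrt(dt^2 - dx^2), that is sqrt(-ds^2).
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(path, path[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval = dt * dt - dx * dx      # -ds^2 for signature -+
        if interval <= 0.0:
            raise ValueError("segment is not timelike")
        total += math.sqrt(interval)
    return total

def bent_path(amplitude, n=200, duration=10.0):
    """Hypothetical trial path from (0, 0) to (duration, 0).

    The spatial excursion is a half sine wave, so the endpoints are fixed
    for every amplitude; amplitude = 0 is the straight (geodesic) path.
    """
    return [(duration * k / n, amplitude * math.sin(math.pi * k / n))
            for k in range(n + 1)]

tau_straight = proper_time(bent_path(0.0))   # geodesic between the endpoints
tau_small = proper_time(bent_path(0.5))      # mildly perturbed path
tau_large = proper_time(bent_path(2.0))      # strongly perturbed path
```

Because the sum runs over path segments rather than over parameter steps, relabelling the points along the path leaves the result unchanged, which mirrors the reparametrization independence noted in §4.1.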


Concept question 4.2. Throw a clock up in the air. 

1. This question is posed by Rovelli (2007). Standing on the surface of the Earth, you throw a clock up in 
the air, and catch it. Which clock shows more time elapsed, the one you threw up in the air, or the one 
on your wrist? 

2. Suppose you throw the clock so hard that it goes around the Moon. Which clock shows more time 
elapsed? 


4.4 Massless test particle 


The equation of motion for a massless test particle is obtained from that for a massive particle in the limit of
zero mass, $m \to 0$. The proper time $\tau$ along the path of a massless particle is zero, but an affine parameter
$\lambda \equiv \tau/m$ proportional to proper time can be defined, equation (2.93), which remains finite in the limit
$m \to 0$. In terms of the affine parameter $\lambda$, the momentum $p^\mu$ of a particle can be written

$$ p^\mu = m u^\mu = m \frac{dx^\mu}{d\tau} = \frac{dx^\mu}{d\lambda} , \tag{4.16} $$

and the equation of motion (4.15) becomes

$$ \frac{D p_\kappa}{D\lambda} = 0 , \tag{4.17} $$

which works for massless as well as massive particles.
The action for a test particle in terms of the affine parameter $\lambda$ defined by equation (2.93) is

$$ S = -m^2 \int_{\lambda_i}^{\lambda_f} d\lambda , \tag{4.18} $$

which vanishes for $m \to 0$. One might be worried that the action seemingly vanishes identically for a massless
particle. An alternative nice action is given below, equation (4.30), that vanishes in the massless limit only
after the equations of motion are imposed.


Concept question 4.3. Conventional Lagrangian. In the conventional Lagrangian approach, the
parameter $\lambda$ is set equal to the time coordinate $t$, and the Lagrangian $L(t, x^\alpha, dx^\alpha/dt)$ of a system of $N$ particles
is considered to be a function of the time $t$, the $3N$ spatial coordinates $x^\alpha$, and the $3N$ spatial velocities
$dx^\alpha/dt$. Compare the conventional and covariant Lagrangian approaches for a point particle. Answer. The
Euler-Lagrange equations in the conventional Lagrangian approach are

$$ \frac{d}{dt} \frac{\partial L}{\partial (dx^\alpha/dt)} - \frac{\partial L}{\partial x^\alpha} = 0 . \tag{4.19} $$

For a point particle, the Euler-Lagrange equations (4.19) yield the spatial components of the geodesic
equation of motion (4.17),

$$ \frac{D p_\alpha}{D\lambda} = 0 . \tag{4.20} $$

What about the time component of the geodesic equation of motion? The geodesic equation for the time
component is a consequence of the geodesic equations for the spatial components, coupled with conservation
of rest mass $m$,

$$ p^0 \frac{D p_0}{D\lambda} = \frac{1}{2} \frac{D (p^0 p_0)}{D\lambda} = - \frac{1}{2} \frac{D (p^\alpha p_\alpha + m^2)}{D\lambda} = - p^\alpha \frac{D p_\alpha}{D\lambda} = 0 . \tag{4.21} $$

Put another way, the covariant Lagrangian approach applied to a point particle enforces conservation of the
rest mass $m$ of the particle, a conservation law that the conventional Lagrangian approach simply assumes.
Invariance of the action with respect to reparametrization of $\lambda$ implies conservation of rest mass.


4.5 Effective Lagrangian for a test particle 


A drawback of the test particle Lagrangian (4.8) is that it involves a square root. This proves to be problematic
for various reasons, among which is that it is an obstacle to deriving a satisfactory super-Hamiltonian, §4.12.
This section describes an alternative approach that gets rid of the square root, making the test particle
Lagrangian quadratic in velocities $dx^\mu/d\lambda$, equation (4.25).

After equations of motion are imposed, the Lagrangian (4.8) for a test particle of constant rest mass $m$ is

$$ L_m = -m \frac{d\tau}{d\lambda} . \tag{4.22} $$

If the parameter $\lambda$ is chosen such that $d\tau/d\lambda$ is constant,

$$ \frac{d\tau}{d\lambda} = \mbox{constant} , \tag{4.23} $$

so that the Lagrangian $L_m$ is constant after equations of motion are imposed, then the Euler-Lagrange
equations of motion (4.5) are unchanged if the Lagrangian is replaced by any function of it,

$$ L'_m = f(L_m) . \tag{4.24} $$

A convenient choice of alternative Lagrangian $L'_m$, also called an effective Lagrangian, is

$$ L'_m = - \frac{L_m^2}{2 m^2} = \frac{1}{2} g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} . \tag{4.25} $$

For the effective Lagrangian (4.25), the partial derivatives (4.9) are

$$ \frac{\partial L'_m}{\partial (dx^\kappa/d\lambda)} = g_{\kappa\nu} \frac{dx^\nu}{d\lambda} , \tag{4.26a} $$

$$ \frac{\partial L'_m}{\partial x^\kappa} = \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\kappa} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} = \Gamma_{\mu\nu\kappa} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} . \tag{4.26b} $$

The Euler-Lagrange equations of motion (4.5) are then

$$ \frac{d}{d\lambda} \left( g_{\kappa\nu} \frac{dx^\nu}{d\lambda} \right) = \Gamma_{\mu\nu\kappa} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} . \tag{4.27} $$

Equations (4.27) are valid subject to the condition (4.23), which asserts that $d\lambda \propto d\tau$. The constant of
proportionality does not affect the equations of motion (4.27), which thus reproduce the earlier equations of
motion in either of the forms (4.15) or (4.17).
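The claim made with equation (4.24), that replacing $L_m$ by a function $f(L_m)$ leaves the equations of motion unchanged when $L_m$ is constant along the path, follows from the chain rule. A sketch of the step, writing $v^\mu$ as shorthand for $dx^\mu/d\lambda$ (an abbreviation introduced here, not used in the text):

```latex
\begin{align*}
\frac{d}{d\lambda} \frac{\partial f(L_m)}{\partial v^\mu}
  - \frac{\partial f(L_m)}{\partial x^\mu}
&= f'(L_m) \left( \frac{d}{d\lambda} \frac{\partial L_m}{\partial v^\mu}
  - \frac{\partial L_m}{\partial x^\mu} \right)
  + f''(L_m) \, \frac{dL_m}{d\lambda} \, \frac{\partial L_m}{\partial v^\mu} .
\end{align*}
```

The second term on the right vanishes when $dL_m/d\lambda = 0$, which is what conditions (4.22) and (4.23) guarantee after the equations of motion are imposed. For the choice (4.25), $f'(L_m) = -L_m/m^2 = (1/m) \, d\tau/d\lambda$, a constant overall factor that drops out of the equations of motion.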

If the test particle is moving in a prescribed gravitational field and there are no other fields, then the
equations of motion are unchanged by the normalization of the effective Lagrangian $L'_m$. But if there are other
fields that affect the particle’s motion, such as an electromagnetic field, §4.7, then the effective Lagrangian
$L'_m$ must be normalized correctly if it is to continue to recover the correct equations of motion. The correct
normalization is such that the generalized momentum of the test particle, defined by equation (4.26a), equals
its ordinary momentum $p_\kappa$, in agreement with equation (4.11),

$$ g_{\kappa\nu} \frac{dx^\nu}{d\lambda} = p_\kappa = g_{\kappa\nu} \, m \frac{dx^\nu}{d\tau} . \tag{4.28} $$

This requires that the constant in equation (4.23) must equal the rest mass $m$,

$$ \frac{d\tau}{d\lambda} = m . \tag{4.29} $$

This is just the definition of the affine parameter $\lambda$, equation (2.93). Thus the $\lambda$ in the definition (4.25) of
the effective Lagrangian $L'_m$ should be interpreted as the affine parameter.

Notice that the value of the effective Lagrangian $L'_m$ after condition (4.29) is applied (after equations of
motion are imposed) is $-m^2/2$, which is half the value of the original Lagrangian $L_m$ (4.8).


4.6 Nice Lagrangian for a test particle 


The effective Lagrangian (4.25) has the advantage that it does not involve a square root, but this advantage
was achieved at the expense of imposing the condition (4.29) ad hoc after the equations of motion are
derived. It is possible to retain the advantage of a Lagrangian quadratic in velocities, but get rid of the ad
hoc condition, by modifying the Lagrangian so that the ad hoc condition essentially emerges as an equation
of motion. I call the resulting Lagrangian (4.31) the “nice” Lagrangian.

As seen in §4.1, the equations of motion are independent of the choice of the arbitrary parameter $\lambda$
that labels the path of the particle between its fixed endpoints. The equations of motion are said to be
reparametrization independent. Introduce, therefore, a scale factor $a(\lambda)$, an arbitrary function of $\lambda$,
that rescales the parameter $\lambda$, and let the action for a test particle of mass $m$ be

$$ S_m = \int \frac{1}{2} \left( g_{\mu\nu} \frac{dx^\mu}{a \, d\lambda} \frac{dx^\nu}{a \, d\lambda} - m^2 \right) a \, d\lambda , \tag{4.30} $$

with nice Lagrangian

$$ L_m = \frac{a}{2} \left( g_{\mu\nu} \frac{dx^\mu}{a \, d\lambda} \frac{dx^\nu}{a \, d\lambda} - m^2 \right) . \tag{4.31} $$

Variation of the action (4.30) with respect to $x^\mu$ and $dx^\mu/d\lambda$ yields the Euler-Lagrange equations in the form

$$ \frac{d}{a \, d\lambda} \left( g_{\kappa\nu} \frac{dx^\nu}{a \, d\lambda} \right) = \Gamma_{\mu\nu\kappa} \frac{dx^\mu}{a \, d\lambda} \frac{dx^\nu}{a \, d\lambda} . \tag{4.32} $$

Variation of the action (4.30) with respect to the parameter $a$ gives

$$ \delta S_m = \int \frac{1}{2} \left( - g_{\mu\nu} \frac{dx^\mu}{a \, d\lambda} \frac{dx^\nu}{a \, d\lambda} - m^2 \right) \delta a \, d\lambda , \tag{4.33} $$

and requiring that this be an extremum imposes

$$ g_{\mu\nu} \frac{dx^\mu}{a \, d\lambda} \frac{dx^\nu}{a \, d\lambda} = -m^2 . \tag{4.34} $$

Equation (4.34) is equivalent to

$$ a \, d\lambda = \frac{d\tau}{m} , \tag{4.35} $$

where the sign has been taken positive without loss of generality. Substituting equation (4.35) into the
equations of motion (4.32) recovers the usual equations of motion (4.15).

Condition (4.35) substituted into the action (4.30) recovers the standard test particle action (4.7) with 
the correct sign and normalization. 
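The substitution can be spelled out in two lines. This is a sketch of the intermediate step only, using condition (4.35) and the normalization $g_{\mu\nu} u^\mu u^\nu = -1$ of the 4-velocity $u^\mu = dx^\mu/d\tau$:

```latex
\begin{align*}
g_{\mu\nu} \frac{dx^\mu}{a \, d\lambda} \frac{dx^\nu}{a \, d\lambda}
  &= m^2 \, g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = -m^2 , \\
S_m &= \int \frac{1}{2} \left( -m^2 - m^2 \right) a \, d\lambda
  = -m^2 \int a \, d\lambda = -m \int d\tau .
\end{align*}
```

The last expression is the action (4.7). The two contributions $-m^2/2$ in the integrand are equal only after condition (4.34) is imposed, which is why the nice action vanishes in the massless limit only on the equations of motion.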


4.7 Action for a charged test particle in an electromagnetic field 


The equations of motion for a test particle of charge $q$ in a prescribed gravitational and electromagnetic
field can be obtained by adding to the test particle action $S_m$ an interaction action $S_q$ that characterizes the
interaction between the charge and the electromagnetic field,

$$ S = S_m + S_q . \tag{4.36} $$

In flat (Minkowski) space, experiment shows that the required equation of motion is the classical Lorentz
force law (4.45). The Lorentz force law is recovered with the interaction action

$$ S_q = q \int_{\lambda_i}^{\lambda_f} A_\mu \, dx^\mu = q \int_{\lambda_i}^{\lambda_f} A_\mu \frac{dx^\mu}{d\lambda} \, d\lambda , \tag{4.37} $$

where $A_\mu$ is the electromagnetic 4-vector potential. The interaction Lagrangian $L_q$ corresponding to the
action (4.37) is

$$ L_q = q A_\mu \frac{dx^\mu}{d\lambda} . \tag{4.38} $$

If the electromagnetic potential $A_\mu$ is taken to be a prescribed function of the coordinates $x^\mu$ along the
path of the particle, then the Lagrangian $L_q$ (4.38) is a function of coordinates $x^\mu$ and velocities $dx^\mu/d\lambda$
as required by the action principle. The partial derivatives of the interaction Lagrangian $L_q$ with respect to
velocities and coordinates are

$$ \frac{\partial L_q}{\partial (dx^\kappa/d\lambda)} = q A_\kappa , \tag{4.39a} $$

$$ \frac{\partial L_q}{\partial x^\kappa} = q \frac{\partial A_\mu}{\partial x^\kappa} \frac{dx^\mu}{d\lambda} = q \frac{\partial A_\mu}{\partial x^\kappa} u^\mu \frac{d\tau}{d\lambda} . \tag{4.39b} $$
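The generalized momentum $\pi_\kappa$, equation (4.6), of the test particle of mass $m$ and charge $q$ in the
electromagnetic field of potential $A_\mu$ is, from equations (4.10a) and (4.39a),

$$ \pi_\kappa \equiv \frac{\partial (L_m + L_q)}{\partial (dx^\kappa/d\lambda)} = m u_\kappa + q A_\kappa . \tag{4.40} $$

Applied to the Lagrangian $L = L_m + L_q$, the Euler-Lagrange equations (4.5) are

$$ \frac{d}{d\lambda} \left( m u_\kappa + q A_\kappa \right) = \left( m \, \Gamma_{\mu\nu\kappa} u^\mu u^\nu + q \frac{\partial A_\mu}{\partial x^\kappa} u^\mu \right) \frac{d\tau}{d\lambda} , \tag{4.41} $$

which rearranges to

$$ \frac{d (m u_\kappa)}{d\tau} = m \, \Gamma_{\mu\nu\kappa} u^\mu u^\nu + q F_{\kappa\mu} u^\mu , \tag{4.42} $$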


where the antisymmetric electromagnetic field tensor $F_{\kappa\mu}$ is defined to be the torsion-free covariant curl of
the electromagnetic potential $A_\mu$,

$$ F_{\kappa\mu} \equiv \frac{\partial A_\mu}{\partial x^\kappa} - \frac{\partial A_\kappa}{\partial x^\mu} . \tag{4.43} $$

The definition (4.43) of the electromagnetic field holds even in the presence of torsion (see §16.5). Splitting
the connection in equation (4.42) into its torsion-free part and the contortion, as done previously in
equation (4.14), yields the Lorentz force law for a test particle of mass $m$ and charge $q$ moving in a prescribed
gravitational and electromagnetic field, with or without torsion,

$$ \frac{D (m u_\kappa)}{D\tau} = q F_{\kappa\mu} u^\mu . \tag{4.44} $$

Equation (4.44), which involves the torsion-free covariant derivative $D/D\tau$, shows that the Lorentz force law
is unaffected by the presence of torsion.

In flat (Minkowski) space, the spatial components of equation (4.44) reduce to the classical special
relativistic Lorentz force law

$$ \frac{d\boldsymbol{p}}{dt} = q \left( \boldsymbol{E} + \boldsymbol{v} \times \boldsymbol{B} \right) . \tag{4.45} $$

In equation (4.45), $\boldsymbol{p}$ is the 3-momentum and $\boldsymbol{v}$ is the 3-velocity, related to the 4-momentum and 4-velocity
by $p^k = \{p^t, \boldsymbol{p}\} = m u^k = m u^t \{1, \boldsymbol{v}\}$ (note that $d/dt = (1/u^t) \, d/d\tau$). In flat space, the components of the
electric and magnetic fields $\boldsymbol{E} = \{E_x, E_y, E_z\}$ and $\boldsymbol{B} = \{B_x, B_y, B_z\}$ are related to the electromagnetic field
tensor $F_{mn}$ by (the signs in the expression (4.46) are arranged precisely so as to agree with the classical
law (4.45))

$$ F_{mn} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix} , \quad F^{mn} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix} . \tag{4.46} $$


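If the electromagnetic 4-potential $A^m$ is written in terms of a classical electric potential $\phi$ and magnetic
3-vector potential $\boldsymbol{A} = \{A_x, A_y, A_z\}$,

$$ A^m = \{\phi, \boldsymbol{A}\} , \tag{4.47} $$

then in flat space equation (4.43) reduces to the traditional relations for the electric and magnetic fields $\boldsymbol{E}$
and $\boldsymbol{B}$ in terms of the potentials $\phi$ and $\boldsymbol{A}$,

$$ \boldsymbol{E} = - \boldsymbol{\nabla} \phi - \frac{\partial \boldsymbol{A}}{\partial t} , \quad \boldsymbol{B} = \boldsymbol{\nabla} \times \boldsymbol{A} , \tag{4.48} $$

where $\boldsymbol{\nabla} \equiv \{\partial/\partial x, \partial/\partial y, \partial/\partial z\}$ is the spatial 3-gradient.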
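The sign conventions of the field tensor (4.46) can be checked numerically against the classical law (4.45). The sketch below assumes flat space with an orthonormal frame, so that covariant and contravariant spatial components coincide; the function names and the sample field values are illustrative and not taken from the text. Since $u^n = u^t \{1, \boldsymbol{v}\}$ and $d/dt = (1/u^t) \, d/d\tau$, the factor $u^t$ cancels and $dp_a/dt = q F_{an} \{1, \boldsymbol{v}\}^n$.

```python
def field_tensor(E, B):
    """Covariant field tensor F_mn of equation (4.46), index order t, x, y, z."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]

def cross(a, b):
    """Ordinary 3-vector cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def force_from_tensor(q, E, B, v):
    """Spatial components of q F_an u^n / u^t, from equation (4.44) in flat space."""
    F = field_tensor(E, B)
    u_over_ut = [1.0, v[0], v[1], v[2]]        # u^n / u^t = {1, v}
    return [q * sum(F[a][n] * u_over_ut[n] for n in range(4))
            for a in (1, 2, 3)]

def lorentz_force(q, E, B, v):
    """Classical Lorentz force q (E + v x B), equation (4.45)."""
    v_cross_B = cross(v, B)
    return [q * (E[i] + v_cross_B[i]) for i in range(3)]
```

Calling both functions with the same charge, fields, and velocity should give the same three components up to rounding, which is the agreement that the signs in (4.46) are arranged to produce.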


4.8 Symmetries and constants of motion 


If a spacetime possesses a symmetry of some kind, then a test particle moving in that spacetime possesses
an associated constant of motion. The Lagrangian formalism makes it transparent how to relate symmetries
to constants of motion.

Suppose that the Lagrangian of a particle has some spacetime symmetry, such as time translation
symmetry, or spatial translation symmetry, or rotational symmetry. In a suitable coordinate system, the symmetry
is expressed by the condition that the Lagrangian $L$ is independent of some coordinate, call it $\xi$. In the case
of time translation symmetry, for example, the coordinate would be a suitable time coordinate $t$. Coordinate
independence requires that the metric $g_{\mu\nu}$, along with any other field, such as an electromagnetic field, that
may affect the particle’s motion, is independent of the coordinate $\xi$. Then the Euler-Lagrange equations
of motion (4.5) imply that the derivative of the covariant $\xi$-component $\pi_\xi$ of the conjugate momentum of
the particle vanishes along the trajectory of the particle,

$$ \frac{d\pi_\xi}{d\lambda} = 0 . \tag{4.49} $$

Thus the covariant momentum $\pi_\xi$ is a constant of motion,

$$ \pi_\xi = \mbox{constant} . \tag{4.50} $$


4.9 Conformal symmetries 


Sometimes the Lagrangian possesses a weaker kind of symmetry, called conformal symmetry, in which
the Lagrangian $L$ depends on a coordinate $\xi$ only through an overall scaling of the Lagrangian,

$$ L = e^{2\xi} \tilde{L} , \tag{4.51} $$

where the conformal Lagrangian $\tilde{L}$ is independent of $\xi$. The factor $e^{2\xi}$ is called a conformal factor. The
Euler-Lagrange equation of motion (4.5) for the conformal coordinate $\xi$ is then

$$ \frac{d\pi_\xi}{d\lambda} = \frac{\partial L}{\partial \xi} = 2 L . \tag{4.52} $$

As an example, consider a test particle moving in a spacetime with conformally symmetric metric

$$ g_{\mu\nu} = e^{2\xi} \tilde{g}_{\mu\nu} , \tag{4.53} $$

where the conformal metric $\tilde{g}_{\mu\nu}$ is independent of the coordinate $\xi$. The effective Lagrangian $L'_m$ of the test
particle is given by equation (4.25). The equation of motion (4.52) becomes

$$ \frac{dp_\xi}{d\lambda} = 2 L'_m = -m^2 . \tag{4.54} $$


If the test particle is massive, $m \neq 0$, then equation (4.54) integrates to

$$ p_\xi = -m \tau , \tag{4.55} $$

where a constant of integration has been absorbed, without loss of generality, into a shift of the zero point
of the proper time $\tau$ of the particle. If the test particle is massless, $m = 0$, then equation (4.54) implies that

$$ p_\xi = \mbox{constant} . \tag{4.56} $$


Exercise 4.4. Geodesics in Rindler space. The Rindler line-element (2.103) can be written

$$ ds^2 = e^{2\xi} \left( - d\alpha^2 + d\xi^2 \right) + dy^2 + dz^2 , \tag{4.57} $$

where the Rindler coordinates $\alpha$ and $\xi$ are related to Minkowski coordinates $t$ and $x$ by

$$ t = e^{\xi} \sinh\alpha , \quad x = e^{\xi} \cosh\alpha . \tag{4.58} $$

What are the constants of motion of a test particle? Integrate the Euler-Lagrange equations of motion.
Solution. The Rindler metric is independent of the coordinates $\alpha$, $y$, and $z$. The three corresponding
constants of motion are

$$ p_\alpha , \quad p_y , \quad p_z . \tag{4.59} $$

A fourth integral of motion follows from conservation of rest mass

$$ p_\mu p^\mu = -m^2 . \tag{4.60} $$


Figure 4.2 Rindler wedge of Minkowski space. Purple and blue lines are lines of constant Rindler time $\alpha$ and constant
Rindler spatial coordinate $\xi$ respectively. The grid of lines is equally spaced by 0.2 in each of $\alpha$ and $\xi$. The Rindler
coordinates $\alpha$ and $\xi$, each extending over the interval $(-\infty, \infty)$, cover only the $x > |t|$ quadrant of Minkowski space.
The fact that the Rindler metric is conformally Minkowski in $\alpha$ and $\xi$ (the line-element is proportional to $-d\alpha^2 + d\xi^2$,
equation (4.57)) shows up in the fact that small areal elements of the $\alpha$-$\xi$ grid are rhombi with null (45°) diagonals.
The straight black line is a representative geodesic. The solid dot marks the point where the geodesic goes through
$\{\alpha_0, \xi_0\}$. Open circles mark $\alpha = \mp\infty$, where the geodesic passes through the null boundaries $t = \mp x$ of the Rindler
wedge.

Equation (4.60) rearranges to give

$$ e^{2\xi} \mu^2 + \left( e^{2\xi} \frac{d\xi}{d\lambda} \right)^2 = p_\alpha^2 , \tag{4.61} $$

where $\mu$ is the positive constant

$$ \mu \equiv \sqrt{p_y^2 + p_z^2 + m^2} . \tag{4.62} $$

Equation (4.61) integrates to give $\xi$ as a function of $\lambda$,

$$ e^{2\xi} = \frac{p_\alpha^2}{\mu^2} - \mu^2 \lambda^2 , \tag{4.63} $$

where a constant of integration has been absorbed without loss of generality into a shift of the zero point of
the affine parameter $\lambda$ along the trajectory of the particle. The coordinate $\xi$ passes through its maximum
value $\xi_0$ where $\lambda = 0$, at which point

$$ e^{\xi_0} = - \frac{p_\alpha}{\mu} , \tag{4.64} $$

the sign coming from the fact that $p_\alpha = g_{\alpha\alpha} p^\alpha = - e^{2\xi} d\alpha/d\lambda$ must be negative, since the particle must move
forward in Rindler time $\alpha$. The trajectory is illustrated in Figure 4.2; the trajectory is of course a straight
line in the parent Minkowski space.
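That the trajectory is a straight line in Minkowski space can be confirmed numerically from the closed-form solutions of this exercise, equation (4.63) for $\xi(\lambda)$ and equation (4.67) for $\alpha(\lambda)$, mapped through the coordinate relation (4.58). The following is a minimal sketch; the function names and the sample values of $p_\alpha$, $\mu$, and $\alpha_0$ are illustrative assumptions.

```python
import math

def rindler_geodesic(lam, p_alpha, mu, alpha0=0.0):
    """Rindler coordinates (alpha, xi) at affine parameter lam.

    Uses equation (4.64) for exp(xi_0), equation (4.63) for xi, and
    equation (4.67) for alpha. p_alpha must be negative and mu positive.
    """
    e_xi0 = -p_alpha / mu                          # equation (4.64)
    e_2xi = e_xi0 ** 2 - (mu * lam) ** 2           # equation (4.63)
    if e_2xi <= 0.0:
        raise ValueError("lam lies outside the Rindler wedge")
    xi = 0.5 * math.log(e_2xi)
    alpha = alpha0 + 0.5 * math.log((e_xi0 + mu * lam) / (e_xi0 - mu * lam))
    return alpha, xi

def to_minkowski(alpha, xi):
    """Minkowski coordinates (t, x) from Rindler (alpha, xi), equation (4.58)."""
    return math.exp(xi) * math.sinh(alpha), math.exp(xi) * math.cosh(alpha)
```

Working through the hyperbolic identities by hand gives $t = e^{\xi_0} \sinh\alpha_0 + \mu\lambda \cosh\alpha_0$ and $x = e^{\xi_0} \cosh\alpha_0 + \mu\lambda \sinh\alpha_0$, both linear in $\lambda$, so evaluating the two functions at several values of $\lambda$ should reproduce these straight-line expressions up to rounding.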
The evolution equation (4.63) for $\xi(\lambda)$ can be derived alternatively from the Euler-Lagrange equation for
$\xi$,

$$ \frac{dp_\xi}{d\lambda} = -\mu^2 . \tag{4.65} $$

The Euler-Lagrange equation (4.65) integrates to

$$ p_\xi = -\mu^2 \lambda , \tag{4.66} $$

where a constant of integration has again been absorbed into a shift of the zero point of the affine parameter
$\lambda$ (this choice is consistent with the previous one). Given that $p_\xi = g_{\xi\xi} p^\xi = e^{2\xi} d\xi/d\lambda$, equation (4.66)
integrates to yield the same result (4.63), the constant of integration being established by the rest-mass
relation (4.60).

The evolution of Rindler time $\alpha$ along the particle’s trajectory follows from integrating $p_\alpha = g_{\alpha\alpha} p^\alpha =
- e^{2\xi} d\alpha/d\lambda$, which gives

$$ \alpha - \alpha_0 = \frac{1}{2} \ln \left( \frac{e^{\xi_0} + \mu\lambda}{e^{\xi_0} - \mu\lambda} \right) , \tag{4.67} $$

where $\alpha_0$ is the value of $\alpha$ for $\lambda = 0$, where $\xi$ takes its maximum $\xi_0$. The Rindler time coordinate $\alpha$ varies
between limits $\mp\infty$ at $\lambda = \mp e^{\xi_0}/\mu$.


4.10 (Super-)Hamiltonian


The Lagrangian approach characterizes the paths of particles through spacetime in terms of their $4N$
coordinates $x^\mu$ and corresponding velocities $dx^\mu/d\lambda$ along those paths. The Hamiltonian approach on the other
hand characterizes the paths of particles through spacetime in terms of $4N$ coordinates $x^\mu$ and the $4N$
generalized momenta $\pi_\mu$, which are treated as independent from the coordinates. In the Hamiltonian approach,
the Hamiltonian $H(x^\mu, \pi_\mu)$ is considered to be a function of coordinates and generalized momenta, and
the action is minimized with respect to independent variations of those coordinates and momenta. In the
Hamiltonian approach, the coordinates and momenta are treated essentially on an equal footing.

The Hamiltonian $H$ can be defined in terms of the Lagrangian $L$ by

$$ H \equiv \pi_\mu \frac{dx^\mu}{d\lambda} - L . \tag{4.68} $$

Here, as previously in §4.1, the parameter $\lambda$ is to be regarded as an arbitrary parameter that labels the
path of the system through the $8N$-dimensional phase space of coordinates and momenta of the $N$ particles.
Misner, Thorne, and Wheeler (1973) call the Hamiltonian (4.68) the super-Hamiltonian, to distinguish
it from the conventional Hamiltonian, equation (4.74), where the parameter $\lambda$ is taken equal to the time
coordinate $t$. Here however the super-Hamiltonian (4.68) is simply referred to as the Hamiltonian, for brevity.

In terms of the Hamiltonian (4.68), the action (4.1) is

$$ S = \int_{\lambda_i}^{\lambda_f} \left( \pi_\mu \frac{dx^\mu}{d\lambda} - H \right) d\lambda . \tag{4.69} $$

In accordance with Hamilton’s principle of least action, the action must be varied with respect to the
coordinates and momenta along the path. The variation of the first term in the integrand of equation (4.69)
can be written

$$ \delta \left( \pi_\mu \frac{dx^\mu}{d\lambda} \right) = \delta\pi_\mu \frac{dx^\mu}{d\lambda} + \pi_\mu \frac{d (\delta x^\mu)}{d\lambda} = \delta\pi_\mu \frac{dx^\mu}{d\lambda} + \frac{d (\pi_\mu \delta x^\mu)}{d\lambda} - \frac{d\pi_\mu}{d\lambda} \delta x^\mu . \tag{4.70} $$


The middle term on the right hand side of equation (4.70) yields a surface term on integration. Thus the
variation of the action is

$$ \delta S = \left[ \pi_\mu \delta x^\mu \right]_{\lambda_i}^{\lambda_f} + \int_{\lambda_i}^{\lambda_f} \left\{ - \left( \frac{d\pi_\mu}{d\lambda} + \frac{\partial H}{\partial x^\mu} \right) \delta x^\mu + \left( \frac{dx^\mu}{d\lambda} - \frac{\partial H}{\partial \pi_\mu} \right) \delta \pi_\mu \right\} d\lambda , \tag{4.71} $$

which takes into account that the Hamiltonian is to be considered a function $H(x^\mu, \pi_\mu)$ of coordinates and
momenta. The principle of least action requires that the action is a minimum with respect to variations of
the coordinates and momenta along the paths of particles, the coordinates and momenta at the endpoints
$\lambda_i$ and $\lambda_f$ of the integration being held fixed. Since the coordinates are fixed at the endpoints, $\delta x^\mu = 0$, the
surface term in equation (4.71) vanishes. Minimization of the action with respect to arbitrary independent
variations of the coordinates and momenta then yields Hamilton’s equations of motion

$$ \frac{dx^\mu}{d\lambda} = \frac{\partial H}{\partial \pi_\mu} , \quad \frac{d\pi_\mu}{d\lambda} = - \frac{\partial H}{\partial x^\mu} . \tag{4.72} $$


4.11 Conventional Hamiltonian


The conventional Hamiltonian of classical mechanics is not the same as the super-Hamiltonian (4.68). In the
conventional approach, the parameter $\lambda$ is set equal to the time coordinate $t$. The Lagrangian is taken to be
a function $L(t, x^\alpha, dx^\alpha/dt)$ of time $t$ and of the $3N$ spatial coordinates $x^\alpha$ and $3N$ spatial velocities $dx^\alpha/dt$.
The generalized momenta are defined to be, analogously to (4.6),

$$ \pi_\alpha \equiv \frac{\partial L}{\partial (dx^\alpha/dt)} . \tag{4.73} $$

The conventional Hamiltonian is taken to be a function $H(t, x^\alpha, \pi_\alpha)$ of time $t$ and of the $3N$ spatial
coordinates $x^\alpha$ and corresponding $3N$ generalized momenta $\pi_\alpha$. The conventional Hamiltonian is related to the
conventional Lagrangian by

$$ H \equiv \pi_\alpha \frac{dx^\alpha}{dt} - L . \tag{4.74} $$

Hamilton’s equations in the conventional approach are

$$ \frac{dx^\alpha}{dt} = \frac{\partial H}{\partial \pi_\alpha} , \quad \frac{d\pi_\alpha}{dt} = - \frac{\partial H}{\partial x^\alpha} . \tag{4.75} $$

The advantage of the super-Hamiltonian (4.68) over the conventional Hamiltonian (4.74) in general
relativity will become apparent in the sections following.


4.12 Conventional Hamiltonian for a test particle


The test-particle Lagrangian (4.8) is

$$ L_m = -m \sqrt{-g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}} . \tag{4.76} $$

The corresponding test-particle Hamiltonian is supposedly given by equation (4.68). However, one runs into
a difficulty. The Hamiltonian is supposed to be expressed in terms of coordinates $x^\mu$ and momenta $p_\mu$. But
the expression (4.68) for the Hamiltonian depends on the arbitrary parameter $\lambda$, whereas as seen in §4.3 the
coordinates $x^\mu$ and momenta $p_\mu$ are (before the least action principle is applied) independent of the choice
of $\lambda$. For example, the square of the momentum (4.11) derived from the Lagrangian (4.8) is $p_\mu p^\mu = -m^2$,
which is independent of the choice of $\lambda$. There is no way to express the Hamiltonian in the prescribed form
without imposing some additional constraint on $\lambda$. Two ways to achieve this are described in the next two
sections, §4.13 and §4.14.

A third approach is to revert to the conventional approach of fixing the arbitrary parameter $\lambda$ equal to
coordinate time $t$. This choice eliminates the time coordinate and corresponding generalized momentum as
parameters to be determined by the least action principle. It also breaks manifest covariance, by singling out
the time coordinate for special treatment. For simplicity, consider flat space, where the metric is Minkowski
$\eta_{mn}$. The Lagrangian (4.76) becomes

$$ L_m = -m \sqrt{-\eta_{mn} \frac{dx^m}{dt} \frac{dx^n}{dt}} = -m \sqrt{1 - v^2} , \tag{4.77} $$


where $v \equiv \sqrt{\eta_{ab} v^a v^b}$ is the magnitude of the 3-velocity $v^a$,

$$ v^a \equiv \frac{dx^a}{dt} . \tag{4.78} $$

The generalized momentum $\pi_a$ defined by (4.73) equals the ordinary momentum $p_a$,

$$ \pi_a = p_a = \frac{m v_a}{\sqrt{1 - v^2}} . \tag{4.79} $$

The Hamiltonian (4.74) is

$$ H = \frac{m}{\sqrt{1 - v^2}} . \tag{4.80} $$

Expressed in terms of the spatial momenta $p_a$, the Hamiltonian is

$$ H = \sqrt{p^2 + m^2} , \tag{4.81} $$

where $p \equiv \sqrt{\eta^{ab} p_a p_b}$ is the magnitude of the 3-momentum $p_a$. Hamilton’s equations (4.75) are

$$ \frac{dx^a}{dt} = \frac{p^a}{\sqrt{p^2 + m^2}} , \quad \frac{dp_a}{dt} = 0 . \tag{4.82} $$

The Hamiltonian (4.81) can be recognized as the energy of the particle, or minus the covariant time
component of the 4-momentum,

$$ H = -p_t . \tag{4.83} $$
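The flat-space relations (4.79) to (4.82) can be checked numerically. The sketch below is illustrative only: the function names are assumptions, and the first of Hamilton's equations (4.82) is evaluated with a central finite difference rather than analytically, so the recovered velocity agrees with the input only to the accuracy of that approximation.

```python
import math

def momentum_from_velocity(v, m):
    """Spatial momentum p_a = m v_a / sqrt(1 - v^2), equation (4.79)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(va * va for va in v))
    return [m * gamma * va for va in v]

def hamiltonian(p, m):
    """Conventional Hamiltonian H = sqrt(p^2 + m^2), equation (4.81)."""
    return math.sqrt(sum(pa * pa for pa in p) + m * m)

def velocity_from_hamiltonian(p, m, h=1e-6):
    """dx^a/dt = dH/dp_a, estimated by a central difference in each p_a."""
    velocity = []
    for a in range(len(p)):
        up = list(p)
        down = list(p)
        up[a] += h
        down[a] -= h
        velocity.append((hamiltonian(up, m) - hamiltonian(down, m)) / (2.0 * h))
    return velocity
```

Starting from a 3-velocity with $v < 1$, the value of `hamiltonian(momentum_from_velocity(v, m), m)` should equal $m/\sqrt{1 - v^2}$ as in equation (4.80), and `velocity_from_hamiltonian` should return the original velocity.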


A similar, more complicated, analysis in curved space leads to the same conclusion, that the conventional
Hamiltonian $H$ is minus the covariant time component of the 4-momentum,

$$ H = -p_t . \tag{4.84} $$

The expression for the Hamiltonian in terms of spatial coordinates $x^\alpha$ and momenta $p_\alpha$ can be inferred from
conservation of rest mass,

$$ g^{\mu\nu} p_\mu p_\nu + m^2 = 0 . \tag{4.85} $$

Explicitly, the conventional Hamiltonian is

$$ H = \frac{1}{-g^{tt}} \left( - g^{t\alpha} p_\alpha + \sqrt{\left( g^{t\alpha} g^{t\beta} - g^{tt} g^{\alpha\beta} \right) p_\alpha p_\beta - g^{tt} m^2} \right) . \tag{4.86} $$

In the presence of an electromagnetic field, replace the momenta $p_t$ and $p_\alpha$ in equation (4.86) by $p_\mu =
\pi_\mu - q A_\mu$, and set the Hamiltonian equal to $-\pi_t$,

$$ H = -\pi_t . \tag{4.87} $$

The super-Hamiltonians (4.90) and (4.96) derived in the next two sections are more elegant than the
conventional Hamiltonian (4.86). All lead to the same equations of motion, but the super-Hamiltonian
exhibits general covariance more clearly.


4.13 Effective (super-)Hamiltonian for a test particle with electromagnetism 


In the effective approach, the condition (4.29) on the parameter $\lambda$ is applied after equations of motion are
derived. The effective test-particle Lagrangian (4.25), coupled to electromagnetism, is

$$ L = L_m + L_q = \frac{1}{2} g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} + q A_\mu \frac{dx^\mu}{d\lambda} , \tag{4.88} $$

where the metric $g_{\mu\nu}$ and electromagnetic potential $A_\mu$ are considered to be given functions of the coordinates
$x^\mu$. The corresponding generalized momentum (4.6) is

$$ \pi_\mu = g_{\mu\nu} \frac{dx^\nu}{d\lambda} + q A_\mu . \tag{4.89} $$

The (super-)Hamiltonian (4.68) expressed in terms of coordinates $x^\mu$ and momenta $\pi_\mu$ as required is

$$ H = \frac{1}{2} g^{\mu\nu} (\pi_\mu - q A_\mu)(\pi_\nu - q A_\nu) . \tag{4.90} $$

Hamilton’s equations (4.72) are

$$ \frac{dx^\mu}{d\lambda} = p^\mu , \quad \frac{dp_\kappa}{d\lambda} = \Gamma_{\mu\nu\kappa} p^\mu p^\nu + q F_{\kappa\mu} p^\mu , \tag{4.91} $$

where $p_\mu$ is defined by

$$ p_\mu \equiv \pi_\mu - q A_\mu . \tag{4.92} $$

The equations of motion (4.91) having been derived from the Hamiltonian (4.90), the parameter $\lambda$ is set
equal to the affine parameter in accordance with condition (4.29). In particular, the first of equations (4.91)
together with condition (4.29) implies that $p^\mu = m \, dx^\mu/d\tau$, as it should be. The equations of motion (4.91)
thus reproduce the equations (4.42) derived in the Lagrangian approach. The value of the Hamiltonian (4.90)
after the equations of motion and condition (4.29) are imposed is constant,

$$ H = - \frac{m^2}{2} . \tag{4.93} $$
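The constancy of the Hamiltonian (4.90) along a trajectory can be illustrated by integrating Hamilton's equations (4.72) numerically. The sketch below makes several simplifying assumptions that are not in the text: flat 1+1 dimensional spacetime, a uniform electric field $E$ along $x$ described by the potential $A_t = E x$, $A_x = 0$ (so that $F_{tx} = -E$, consistent with (4.46)), a particle starting at rest at the origin, and a fixed-step fourth-order Runge-Kutta integrator. All function names are illustrative.

```python
import math

def super_hamiltonian(state, q, E):
    """H = (1/2) eta^{mu nu} p_mu p_nu with p_mu = pi_mu - q A_mu, equation (4.90)."""
    t, x, pi_t, pi_x = state
    p_t = pi_t - q * E * x          # A_t = E x, A_x = 0 (assumed potential)
    return 0.5 * (-p_t * p_t + pi_x * pi_x)

def derivs(state, q, E):
    """Right hand sides of Hamilton's equations (4.72) for this Hamiltonian."""
    t, x, pi_t, pi_x = state
    p_t = pi_t - q * E * x
    return (-p_t,                   # dt/dlam    =  dH/dpi_t
            pi_x,                   # dx/dlam    =  dH/dpi_x
            0.0,                    # dpi_t/dlam = -dH/dt
            -q * E * p_t)           # dpi_x/dlam = -dH/dx

def rk4_step(state, h, q, E):
    """One classical Runge-Kutta step of size h in the affine parameter."""
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = derivs(state, q, E)
    k2 = derivs(shifted(state, k1, h / 2.0), q, E)
    k3 = derivs(shifted(state, k2, h / 2.0), q, E)
    k4 = derivs(shifted(state, k3, h), q, E)
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(m, q, E, lam_end, steps):
    """Integrate from rest at the origin: p_t = -m, p_x = 0 at lam = 0."""
    state = (0.0, 0.0, -m, 0.0)
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(state, h, q, E)
    return state
```

For these initial conditions the exact solution is hyperbolic motion, $p^t = m \cosh(qE\lambda)$ and $p^x = m \sinh(qE\lambda)$, and `super_hamiltonian` evaluated on the integrated state should stay close to $-m^2/2$, in line with equation (4.93).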


Recall that the super-Hamiltonian H is a scalar, associated with rest mass, to be distinguished from the 
conventional Hamiltonian, which is the time component of a vector, associated with energy. The minus sign 
in equation (4.93) is associated with the choice of metric signature $-{+}{+}{+}$, where scalar products of timelike
quantities are negative. The negative Hamiltonian (4.93) signifies that the particle is propagating along a
timelike direction. If the particle is massless, m = 0, then the Hamiltonian is zero (after equations of motion 
are imposed), signifying that the particle is propagating along a null direction. 
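
The constancy of the super-Hamiltonian can be checked numerically. The following is a minimal sketch, not the author's method, under assumptions chosen here for illustration: Minkowski spacetime, a uniform magnetic field along z described by the potential $A_y = B x$, and illustrative values of the charge, field, and mass. It integrates Hamilton's equations for the Hamiltonian (4.90) with a fourth-order Runge-Kutta step and compares $H$ before and after; the helper names (`potential`, `rhs`, `rk4_step`) are hypothetical.

```python
import math

Q, B, M = 1.0, 0.5, 1.0          # assumed charge, magnetic field strength, rest mass
ETA = (-1.0, 1.0, 1.0, 1.0)      # diagonal of the Minkowski metric, signature -+++


def potential(x):
    """A_mu for a uniform magnetic field along z: only A_y = B x is non-zero."""
    return (0.0, 0.0, B * x[1], 0.0)


def hamiltonian(x, pi):
    """Effective super-Hamiltonian (4.90) in Minkowski spacetime."""
    A = potential(x)
    return 0.5 * sum(ETA[k] * (pi[k] - Q * A[k]) ** 2 for k in range(4))


def rhs(state):
    """Hamilton's equations: dx^mu/dlambda = dH/dpi_mu, dpi_mu/dlambda = -dH/dx^mu."""
    x, pi = state[:4], state[4:]
    A = potential(x)
    p_up = [ETA[k] * (pi[k] - Q * A[k]) for k in range(4)]   # p^mu
    # Only A_y depends on position (through x), so only pi_x changes:
    # dpi_x/dlambda = q p^y dA_y/dx = q p^y B.
    dpi = [0.0, Q * p_up[2] * B, 0.0, 0.0]
    return p_up + dpi


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h in lambda."""
    def shifted(slope, scale):
        return [s + scale * k for s, k in zip(state, slope)]
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


# Initial data on the mass shell: spatial momentum p_x, energy from p^mu p_mu = -m^2.
p_x = 0.3
energy = math.sqrt(M ** 2 + p_x ** 2)
state = [0.0, 0.0, 0.0, 0.0, -energy, p_x, 0.0, 0.0]   # (t, x, y, z, pi_t, pi_x, pi_y, pi_z)

h_initial = hamiltonian(state[:4], state[4:])
for _ in range(2000):
    state = rk4_step(state, 0.005)
h_final = hamiltonian(state[:4], state[4:])
print(h_initial, h_final)
```

With initial data placed on the mass shell, both values should be close to $-m^2/2$, in line with equation (4.93); the residual difference reflects only the integrator's truncation and rounding error.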
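4.14 Nice (super-)Hamiltonian for a test particle with electromagnetism


The nice test-particle Lagrangian (4.31), coupled to electromagnetism, is

$$ L = \frac{1}{2} \left( \frac{1}{\alpha} g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} - \alpha m^2 \right) + q A_\mu \frac{dx^\mu}{d\lambda} . \tag{4.94} $$

The corresponding generalized momentum (4.6) is

$$ \pi_\mu = g_{\mu\nu} \frac{dx^\nu}{\alpha \, d\lambda} + q A_\mu . \tag{4.95} $$

The associated nice (super-)Hamiltonian (4.68) expressed in terms of coordinates $x^\mu$ and momenta $\pi_\mu$ as
required is

$$ H = \frac{\alpha}{2} \left[ g^{\mu\nu} (\pi_\mu - q A_\mu)(\pi_\nu - q A_\nu) + m^2 \right] . \tag{4.96} $$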
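The nice Hamiltonian $H$, equation (4.96), depends on the auxiliary scale factor $\alpha$ as well as on $x^\mu$ and $\pi_\mu$,
and the action must be varied with respect to all of these to obtain all the equations of motion. Compared
to the variation (4.71), the variation of the action contains an additional term proportional to $\delta\alpha$:

$$ \delta S = \left[ \pi_\mu \delta x^\mu \right]_{\lambda_{\rm i}}^{\lambda_{\rm f}} + \int_{\lambda_{\rm i}}^{\lambda_{\rm f}} \left\{ - \left( \frac{d\pi_\mu}{d\lambda} + \frac{\partial H}{\partial x^\mu} \right) \delta x^\mu + \left( \frac{dx^\mu}{d\lambda} - \frac{\partial H}{\partial \pi_\mu} \right) \delta\pi_\mu - \frac{\partial H}{\partial\alpha} \delta\alpha \right\} d\lambda . \tag{4.97} $$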


Requiring that the variation (4.97) of the action vanish under arbitrary variations of the coordinates $x^\mu$ and
momenta $\pi_\mu$ yields Hamilton’s equations (4.72), which here are

$$ \frac{dx^\mu}{\alpha \, d\lambda} = p^\mu , \qquad \frac{dp_\kappa}{\alpha \, d\lambda} = \Gamma_{\mu\kappa\nu} p^\mu p^\nu + q F_{\kappa\mu} p^\mu , \tag{4.98} $$

with $p_\mu$ defined by

$$ p_\mu \equiv \pi_\mu - q A_\mu . \tag{4.99} $$

The condition (4.103) found below, substituted into the first of Hamilton’s equations (4.98), implies that $p^\mu$
coincides with the ordinary momentum $p^\mu = m \, dx^\mu / d\tau$, as it should. Requiring that the variation (4.97)
of the action vanish under arbitrary variation of the parameter $\alpha$ yields the additional equation of motion

$$ \frac{\partial H}{\partial\alpha} = 0 . \tag{4.100} $$

The additional equation of motion (4.100) applied to the Hamiltonian (4.96) implies that

$$ g^{\mu\nu} (\pi_\mu - q A_\mu)(\pi_\nu - q A_\nu) = - m^2 . \tag{4.101} $$


From the first of the equations of motion (4.98) along with the definition (4.99), equation (4.101) is the same
as

$$ g_{\mu\nu} \frac{dx^\mu}{\alpha \, d\lambda} \frac{dx^\nu}{\alpha \, d\lambda} = - m^2 , \tag{4.102} $$

which in turn is equivalent to

$$ \alpha \, d\lambda = \frac{d\tau}{m} , \tag{4.103} $$


recovering equation (4.35) derived using the Lagrangian formalism. Inserting the condition (4.103) into 
Hamilton’s equations (4.98) recovers the equations of motion (4.42) for a test particle in a prescribed gravi- 
tational and electromagnetic field. The value of the Hamiltonian (4.96) after the equation of motion (4.101) 
is imposed is zero, 


$$ H = 0 . \tag{4.104} $$
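
The step from the Lagrangian (4.94) to the Hamiltonian (4.96) can be written out explicitly. The following is a sketch that assumes the definition (4.68) has the usual Legendre form $H \equiv \pi_\mu \, dx^\mu/d\lambda - L$, which is consistent with equations (4.106) and (4.107) but is not reproduced in this section:

```latex
\begin{align*}
H &= \pi_\mu \frac{dx^\mu}{d\lambda} - L \\
  &= \left( \frac{1}{\alpha} g_{\mu\nu} \frac{dx^\nu}{d\lambda} + q A_\mu \right) \frac{dx^\mu}{d\lambda}
     - \frac{1}{2\alpha} g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}
     + \frac{\alpha}{2} m^2 - q A_\mu \frac{dx^\mu}{d\lambda} \\
  &= \frac{1}{2\alpha} g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} + \frac{\alpha}{2} m^2 \\
  &= \frac{\alpha}{2} \left[ g^{\mu\nu} (\pi_\mu - q A_\mu)(\pi_\nu - q A_\nu) + m^2 \right] .
\end{align*}
```

The last line uses equation (4.95) in the inverted form $dx^\mu/d\lambda = \alpha \, g^{\mu\nu} (\pi_\nu - q A_\nu)$. The electromagnetic terms cancel, so the potential enters the Hamiltonian only through the combination $\pi_\mu - q A_\mu$.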


4.15 Derivatives of the action 


Besides being a scalar whose minimum value between fixed endpoints defines the path between those points, 
the action S can also be treated as a function of its endpoints along the actual path. Along the actual path, 
the equations of motion are satisfied, so the integral in the variation (4.4) or (4.71) of the action vanishes 
identically. The surface term in the variation (4.4) or (4.71) then implies that $\delta S = \pi_\mu \delta x^\mu$. This means that
the partial derivatives of the action with respect to the coordinates are equal to the generalized momenta,

$$ \frac{\partial S}{\partial x^\mu} = \pi_\mu . \tag{4.105} $$

This is the basis of the Hamilton-Jacobi method for solving equations of motion, §4.16.

By definition, the total derivative of the action $S$ with respect to the arbitrary parameter $\lambda$ along the
actual path equals the Lagrangian $L$. In addition to being a function of the coordinates $x^\mu$ along the actual
path, the action may also be an explicit function $S(\lambda, x^\mu)$ of the parameter $\lambda$. The total derivative of the
action along the path may thus be expressed

$$ \frac{dS}{d\lambda} = L = \frac{\partial S}{\partial\lambda} + \frac{\partial S}{\partial x^\mu} \frac{dx^\mu}{d\lambda} . \tag{4.106} $$

Comparing equation (4.106) to the definition (4.68) of the Hamiltonian shows that the partial derivative of
the action with respect to the parameter $\lambda$ is minus the Hamiltonian,

$$ \frac{\partial S}{\partial\lambda} = - H . \tag{4.107} $$

In the conventional approach where the parameter $\lambda$ is fixed equal to the time coordinate $t$, equations (4.105)
and (4.107) together show that

$$ \frac{\partial S}{\partial t} = \pi_t = - H , \tag{4.108} $$


in agreement with equation (4.87). In the super-Hamiltonian approach, the Hamiltonian $H$ is constant, equal
to $-m^2/2$ in the effective approach, equation (4.93), and equal to zero in the nice approach, equation (4.104).

Concept question 4.5. Action vanishes along a null geodesic, but its gradient does not. How can
it be that the gradient of the action $p_\mu = \partial S / \partial x^\mu$ is non-zero along a null geodesic, yet the variation of the
action $dS = - m \, d\tau$ is identically zero along the same null geodesic? Answer. This has to do with the fact
that a vector can be finite yet null,

$$ \frac{dS}{d\lambda} = \frac{dx^\mu}{d\lambda} \frac{\partial S}{\partial x^\mu} = p^\mu p_\mu = - m^2 = 0 \quad \text{for } m = 0 . \tag{4.109} $$


4.16 Hamilton-Jacobi equation 


The Hamilton-Jacobi equation provides a powerful way to solve equations of motion. The Hamilton-Jacobi 
equation proves to be separable in the Kerr-Newman geometry for an ideal rotating black hole, Chapter 23. 
The hypothesis that the Hamilton-Jacobi equation be separable provides one way to derive the Kerr-Newman 
line-element, Chapter 22, and to discover other separable spacetimes. 

The Hamilton-Jacobi equation is obtained by writing down the expression for the Hamiltonian $H$ in terms
of coordinates $x^\mu$ and generalized momenta $\pi_\mu$, and replacing the Hamiltonian $H$ by $-\partial S/\partial\lambda$ in accordance
with equation (4.107), and the generalized momenta $\pi_\mu$ by $\partial S/\partial x^\mu$ in accordance with equation (4.105).

For the effective Hamiltonian (4.90), the resulting Hamilton-Jacobi equation is

$$ - \frac{\partial S}{\partial\lambda} = \frac{1}{2} g^{\mu\nu} \left( \frac{\partial S}{\partial x^\mu} - q A_\mu \right) \left( \frac{\partial S}{\partial x^\nu} - q A_\nu \right) , \tag{4.110} $$

whose left hand side is $-m^2/2$, equation (4.93). For the nice Hamiltonian (4.96), the resulting Hamilton-Jacobi
equation is

$$ - \frac{\partial S}{\alpha \, \partial\lambda} = \frac{1}{2} \left[ g^{\mu\nu} \left( \frac{\partial S}{\partial x^\mu} - q A_\mu \right) \left( \frac{\partial S}{\partial x^\nu} - q A_\nu \right) + m^2 \right] , \tag{4.111} $$

whose left hand side is zero, equation (4.104). The Hamilton-Jacobi equations (4.110) and (4.111) agree, as
they should. The Hamilton-Jacobi equation (4.110) or (4.111) is a partial differential equation for the action
$S(\lambda, x^\mu)$. In spacetimes with sufficient symmetry, such as Kerr-Newman, the partial differential equation can
be solved by separation of variables. This will be done in §22.3.
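
A minimal worked example, not taken from the text, may help fix ideas: a free particle in Minkowski spacetime, assuming $g_{\mu\nu} = \eta_{\mu\nu}$ and $A_\mu = 0$. A trial action linear in the coordinates and in the parameter, with constant coefficients $p_\mu$, separates trivially, and solves the effective Hamilton-Jacobi equation (4.110) provided that the constants satisfy the mass-shell condition:

```latex
\begin{align*}
S(\lambda, x^\mu) &= p_\mu x^\mu + \tfrac{1}{2} m^2 \lambda , \\
\frac{\partial S}{\partial x^\mu} &= p_\mu , \qquad
- \frac{\partial S}{\partial \lambda} = - \tfrac{1}{2} m^2 , \\
\tfrac{1}{2} \eta^{\mu\nu} \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu}
  &= \tfrac{1}{2} \eta^{\mu\nu} p_\mu p_\nu , \\
\text{so (4.110) holds if and only if} \quad \eta^{\mu\nu} p_\mu p_\nu &= - m^2 .
\end{align*}
```

In this example the gradient of $S$ reproduces the constant momenta, equation (4.105), and the coefficient of $\lambda$ reproduces $H = -m^2/2$, equation (4.93).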


4.17 Canonical transformations 


The Lagrangian equations of motion (4.5) take the same form regardless of the choice of coordinates $x^\mu$ of
the underlying spacetime. This expresses general covariance: the form of the Lagrangian equations of motion
is unchanged by general coordinate transformations.

Coordinate transformations also preserve Hamilton’s equations of motion (4.72). But the Hamiltonian
formalism allows a wider range of transformations that preserve the form of Hamilton’s equations.
Transformations of the coordinates and momenta that preserve Hamilton’s equations are called canonical
transformations. The construction of canonical transformations is addressed in §4.17.1.

The wide range of possible canonical transformations means that the coordinates and momenta lose much
of their original meaning as actual spacetime coordinates and momenta of particles. For example, there is
a canonical transformation (4.117) that simply exchanges coordinates and their conjugate momenta. It is
common therefore to refer to general systems of coordinates and momenta that satisfy Hamilton’s equations
as generalized coordinates and generalized momenta, and to denote them by $q^\mu$ and $p_\mu$,

$$ q^\mu , \quad p_\mu . \tag{4.112} $$


4.17.1 Construction of canonical transformations 


Consider a canonical transformation of coordinates and momenta

$$ \{ q^\mu , p_\mu \} \rightarrow \{ q'^\mu(q, p) , \, p'_\mu(q, p) \} . \tag{4.113} $$

By definition of canonical transformation, both the original and transformed sets of coordinates and momenta
satisfy Hamilton’s equations.

For the equations of motion to take Hamiltonian form, the original and transformed actions $S$ and $S'$ must
take the form

$$ S = \int_{\lambda_{\rm i}}^{\lambda_{\rm f}} p_\mu \, dq^\mu - H \, d\lambda , \qquad S' = \int_{\lambda_{\rm i}}^{\lambda_{\rm f}} p'_\mu \, dq'^\mu - H' \, d\lambda . \tag{4.114} $$

One way for the original and transformed coordinates and momenta to yield equivalent equations of motion
is that the integrands of the actions differ by the total derivative $dF$ of some function $F$,

$$ dF = p_\mu \, dq^\mu - p'_\mu \, dq'^\mu - (H - H') \, d\lambda . \tag{4.115} $$

When the actions $S$ and $S'$ are varied, the difference in the variations is the difference in the variation of $F$
between the initial and final points $\lambda_{\rm i}$ and $\lambda_{\rm f}$, which vanishes provided that whatever $F$ depends on is held
fixed on the initial and final points,

$$ \delta S - \delta S' = \left[ \delta F \right]_{\lambda_{\rm i}}^{\lambda_{\rm f}} = 0 . \tag{4.116} $$

Because the variations of the actions are the same, the resulting equations of motion are equivalent. The
function $F$ is called the generator of the canonical transformation between the original and transformed
coordinates.

Given any function $F(\lambda, q, q')$, equation (4.115) determines $p_\mu$, $-p'_\mu$, and $-(H - H')$ as partial derivatives of $F$
with respect to $q^\mu$, $q'^\mu$, and $\lambda$. For example, the function $F = \sum_\mu q^\mu q'^\mu$ generates a canonical transformation
that simply trades coordinates and momenta,

$$ p_\mu = \frac{\partial F}{\partial q^\mu} = q'^\mu , \qquad p'_\mu = - \frac{\partial F}{\partial q'^\mu} = - q^\mu . \tag{4.117} $$


The generating function $F(\lambda, q, q')$ depends on $q^\mu$ and $q'^\mu$. Other generating functions depending on either
of $q^\mu$ or $p_\mu$, and either of $q'^\mu$ or $p'_\mu$, are obtained by subtracting $p_\mu q^\mu$ and/or adding $p'_\mu q'^\mu$ to $F$. For example,
equation (4.115) can be rearranged as

$$ dG = p_\mu \, dq^\mu + q'^\mu \, dp'_\mu - (H - H') \, d\lambda , \tag{4.118} $$

where $G \equiv F + p'_\mu q'^\mu$ is now some function $G(\lambda, q, p')$. For example, the function $G(q, p') = \sum_\mu f^\mu(q) \, p'_\mu$,
in which $f^\mu(q)$ is some function of the coordinates $q^\mu$ but not of the momenta $p_\mu$, generates the canonical
transformation

$$ p_\mu = \frac{\partial G}{\partial q^\mu} = \frac{\partial f^\nu}{\partial q^\mu} \, p'_\nu , \qquad q'^\mu = \frac{\partial G}{\partial p'_\mu} = f^\mu(q) . \tag{4.119} $$

This is just a coordinate transformation $q^\mu \rightarrow q'^\mu = f^\mu(q)$.

If the generator of a canonical transformation does not depend on the parameter $\lambda$, then the Hamiltonians
are the same in the original and transformed systems,

$$ H(q^\mu, p_\mu) = H'(q'^\mu, p'_\mu) . \tag{4.120} $$


In the super-Hamiltonian approach, where the parameter $\lambda$ is arbitrary, the Hamiltonian is without loss of
generality independent of $\lambda$, and there is no physical significance to canonical transformations generated by
functions that depend on $\lambda$. The super-Hamiltonian $H(q^\mu, p_\mu)$ is then a scalar, invariant with respect to
canonical transformations that do not depend explicitly on $\lambda$. This contrasts with the conventional
Hamiltonian approach, where the parameter is set equal to the coordinate time $t$, and the conventional
Hamiltonian is the time component of a 4-vector, which varies under canonical transformations generated by
functions that depend on time $t$.
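
The coordinate transformation (4.119) can be tested numerically for a single degree of freedom. The sketch below is illustrative only: the map `f` is a hypothetical choice, and the test of canonicity used is the fundamental Poisson bracket $[q', p'] = 1$ of §4.19.1, evaluated by central finite differences.

```python
def f(q):
    """Hypothetical coordinate transformation q -> q' = f(q); chosen only for illustration."""
    return q ** 3 + q


def df(q):
    """Derivative f'(q), which is positive everywhere, so the map is invertible."""
    return 3 * q ** 2 + 1


def transform(q, p):
    """Canonical transformation generated by G(q, p') = f(q) p', equation (4.119)."""
    q_new = f(q)          # q' = dG/dp' = f(q)
    p_new = p / df(q)     # p = dG/dq = f'(q) p'  implies  p' = p / f'(q)
    return q_new, p_new


def bracket_of_new_pair(q, p, h=1e-5):
    """[q', p'] = dq'/dq dp'/dp - dq'/dp dp'/dq, by central differences."""
    dq_dq = (transform(q + h, p)[0] - transform(q - h, p)[0]) / (2 * h)
    dq_dp = (transform(q, p + h)[0] - transform(q, p - h)[0]) / (2 * h)
    dp_dq = (transform(q + h, p)[1] - transform(q - h, p)[1]) / (2 * h)
    dp_dp = (transform(q, p + h)[1] - transform(q, p - h)[1]) / (2 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq


bracket = bracket_of_new_pair(0.7, -1.3)
print(bracket)
```

The momentum rescales by $1/f'(q)$ exactly so as to compensate the stretching of the coordinate, which is why the bracket comes out equal to one up to finite-difference error.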


4.17.2 Evolution is a canonical transformation 


The evolution of the system from some initial hypersurface $\lambda = 0$ to some final hypersurface $\lambda$ is itself a
canonical transformation. This is evident from the fact that Hamilton’s equations (4.72) hold for any value of
the parameter $\lambda$, so in particular Hamilton’s equations are unchanged when initial coordinates and momenta
$q^\mu(0)$ and $p_\mu(0)$ are replaced by evolved values $q^\mu(\lambda)$ and $p_\mu(\lambda)$,

$$ q^\mu(0) \rightarrow q'^\mu = q^\mu(\lambda) , \qquad p_\mu(0) \rightarrow p'_\mu = p_\mu(\lambda) . \tag{4.121} $$

The action varies by the total derivative $dS = p_\mu \, dq^\mu - H \, d\lambda$ along the actual path of the system,
equation (4.106), so the initial and evolved actions differ by a total derivative, equation (4.115),

$$ dF = p_\mu(0) \, dq^\mu(0) - p_\mu(\lambda) \, dq^\mu(\lambda) - [H(0) - H(\lambda)] \, d\lambda = dS(0) - dS(\lambda) . \tag{4.122} $$

Thus the canonical transformation from an initial $\lambda = 0$ to a final $\lambda$ is generated by the difference in the
actions along the actual path of the system,

$$ F = S(0) - S(\lambda) . \tag{4.123} $$
4.18 Symplectic structure


The generalized coordinates $q^\mu$ and momenta $p_\mu$ of a dynamical system of particles have a geometrical
structure that transcends the geometrical structure of the underlying spacetime manifold. For $N$ coordinates $q^\mu$
and $N$ momenta $p_\mu$, the geometrical structure is a $2N$-dimensional manifold called a symplectic manifold.
A symplectic manifold is also called phase space, and the coordinates $\{q^\mu, p_\mu\}$ of the manifold are called
phase-space coordinates.

A central property of a symplectic manifold is that the Hamiltonian dynamics define a scalar product with
antisymmetric symplectic metric $\omega_{ij}$. Let $z^i$ with $i = 1, ..., 2N$ denote the combined set of $2N$ generalized
coordinates and momenta $\{q^\mu, p_\mu\}$,

$$ \{ z^1 , ..., z^N , z^{N+1} , ..., z^{2N} \} \equiv \{ q^1 , ..., q^N , p_1 , ..., p_N \} . \tag{4.124} $$


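Hamilton’s equations (4.72) can be written

$$ \frac{dz^i}{d\lambda} = \omega^{ij} \frac{\partial H}{\partial z^j} , \tag{4.125} $$

where $\omega^{ij}$ is the antisymmetric symplectic metric (actually the inverse symplectic metric)

$$ \omega^{ij} \equiv \delta_{i+N,j} - \delta_{i,j+N} = \begin{cases} 1 & \text{if } z^i = q^\mu \text{ and } z^j = p_\mu , \\ -1 & \text{if } z^i = p_\mu \text{ and } z^j = q^\mu , \\ 0 & \text{otherwise} . \end{cases} \tag{4.126} $$

As a matrix, the symplectic metric $\omega^{ij}$ is the $2N \times 2N$ matrix

$$ \omega^{ij} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} , \tag{4.127} $$

where $1$ denotes the $N \times N$ unit matrix. Inverting the inverse symplectic metric $\omega^{ij}$ yields the symplectic
metric $\omega_{ij}$, which is the same matrix but flipped in sign,

$$ \omega_{ij} = - \omega^{ij} = \omega^{ji} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} . \tag{4.128} $$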


Let $z'^i$ be another set of generalized coordinates and momenta satisfying Hamilton’s equations with the same
Hamiltonian $H$,

$$ \frac{dz'^i}{d\lambda} = \omega^{ij} \frac{\partial H}{\partial z'^j} . \tag{4.129} $$

It is being assumed here that the Hamiltonian $H$ does not depend explicitly on the parameter $\lambda$. In the
super-Hamiltonian approach, there is no loss of generality in taking the Hamiltonian $H$ to be independent of $\lambda$,
since the parameter $\lambda$ is arbitrary, without physical significance. The important point about equation (4.129)
is that the symplectic metric $\omega^{ij}$ is the same regardless of the choice of phase-space coordinates. Under a
canonical transformation $z^i \rightarrow z'^i(z)$ of generalized coordinates and momenta, $dz'^i/d\lambda$ transforms as

$$ \frac{dz'^i}{d\lambda} = \frac{\partial z'^i}{\partial z^k} \frac{dz^k}{d\lambda} = \frac{\partial z'^i}{\partial z^k} \, \omega^{kl} \frac{\partial H}{\partial z^l} = \frac{\partial z'^i}{\partial z^k} \, \omega^{kl} \frac{\partial z'^j}{\partial z^l} \frac{\partial H}{\partial z'^j} . \tag{4.130} $$

Comparing equations (4.129) and (4.130) shows that the symplectic matrix $\omega^{ij}$ is invariant under a canonical
transformation,

$$ \omega^{ij} = \frac{\partial z'^i}{\partial z^k} \, \omega^{kl} \frac{\partial z'^j}{\partial z^l} . \tag{4.131} $$

Equation (4.131) can be expressed as the invariance under canonical transformations of

$$ \omega^{ij} \frac{\partial}{\partial z^i} \frac{\partial}{\partial z^j} = \omega^{ij} \frac{\partial}{\partial z'^i} \frac{\partial}{\partial z'^j} . \tag{4.132} $$

Equivalently,

$$ \omega_{ij} \, dz^i dz^j = \omega_{ij} \, dz'^i dz'^j . \tag{4.133} $$


The invariance of the symplectic metric $\omega_{ij}$ under canonical transformations can be thought of as analogous
to the invariance of the Minkowski metric $\eta_{mn}$ under Lorentz transformations. But whereas the Minkowski
metric $\eta_{mn}$ is symmetric, the symplectic metric $\omega_{ij}$ is antisymmetric.
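
For a linear transformation the Jacobian $\partial z'^i / \partial z^k$ is a constant matrix, and the invariance condition (4.131) becomes a matrix identity that is easy to test. The sketch below is an illustration under assumptions made here: $N = 2$, integer-valued Jacobians so that the comparison is exact, and hypothetical helper names.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def symplectic_metric(n):
    """Inverse symplectic metric omega^{ij} of equation (4.127) for N = n."""
    w = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        w[i][i + n] = 1
        w[i + n][i] = -1
    return w


def is_canonical(jacobian):
    """Check equation (4.131): (dz'/dz) omega (dz'/dz)^T == omega."""
    w = symplectic_metric(len(jacobian) // 2)
    return matmul(matmul(jacobian, w), transpose(jacobian)) == w


# Phase-space ordering z = (q^1, q^2, p_1, p_2).
# Exchange of coordinates and momenta, equation (4.117): q' = p, p' = -q.
exchange = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
# Shear p'_1 = p_1 + q^1, leaving the coordinates unchanged.
shear = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 1]]
# Rescaling the coordinates by 2 without rescaling the momenta is not canonical.
scaling = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

for name, jac in [("exchange", exchange), ("shear", shear), ("scaling", scaling)]:
    print(name, is_canonical(jac))
```

The exchange and the shear satisfy the condition, while the one-sided rescaling does not; rescaling the momenta by the inverse factor would restore it, as in the coordinate transformation (4.119).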


4.19 Symplectic scalar product and Poisson brackets 


Let $f(z^i)$ and $g(z^i)$ be two functions of phase-space coordinates $z^i$. Their tangent vectors in the phase space
are $\partial f / \partial z^i$ and $\partial g / \partial z^i$. The symplectic scalar product of the tangent vectors defines the Poisson bracket
of the two functions $f$ and $g$,

$$ [f, g] \equiv \omega^{ij} \frac{\partial f}{\partial z^i} \frac{\partial g}{\partial z^j} = \frac{\partial f}{\partial q^\mu} \frac{\partial g}{\partial p_\mu} - \frac{\partial f}{\partial p_\mu} \frac{\partial g}{\partial q^\mu} . \tag{4.134} $$

The invariance (4.132) of the symplectic metric implies that the Poisson bracket is a scalar, invariant under
canonical transformations of the phase-space coordinates $z^i$. The Poisson bracket is antisymmetric thanks
to the antisymmetry of the symplectic metric $\omega^{ij}$,

$$ [f, g] = - [g, f] . \tag{4.135} $$


4.19.1 Poisson brackets of phase-space coordinates 


The Poisson brackets of the phase-space coordinates and momenta themselves satisfy

$$ [z^i, z^j] = \omega^{ij} . \tag{4.136} $$

Explicitly in terms of the generalized coordinates and momenta $q^\mu$ and $p_\mu$,

$$ [q^\mu, p_\nu] = \delta^\mu_\nu , \qquad [q^\mu, q^\nu] = 0 , \qquad [p_\mu, p_\nu] = 0 . \tag{4.137} $$

Reinterpreting equations (4.137) as operator equations provides a path from classical to quantum mechanics.
4.20 (Super-)Hamiltonian as a generator of evolution


The Poisson bracket of a function $f(z^i)$ with the Hamiltonian $H$ is

$$ [f, H] = \frac{\partial f}{\partial q^\mu} \frac{\partial H}{\partial p_\mu} - \frac{\partial f}{\partial p_\mu} \frac{\partial H}{\partial q^\mu} . \tag{4.138} $$

Inserting Hamilton’s equations (4.72) implies

$$ [f, H] = \frac{\partial f}{\partial q^\mu} \frac{dq^\mu}{d\lambda} + \frac{\partial f}{\partial p_\mu} \frac{dp_\mu}{d\lambda} = \frac{df}{d\lambda} . \tag{4.139} $$

That is, the evolution of a function $f(q^\mu, p_\mu)$ of generalized coordinates and momenta is its Poisson bracket
with the Hamiltonian $H$,

$$ \frac{df}{d\lambda} = [f, H] . \tag{4.140} $$

Equation (4.140) shows that the (super-)Hamiltonian defined by equation (4.68) can be interpreted as
generating the evolution of the system.

The same derivation holds in the conventional case where $\lambda$ is taken to be time $t$, but generically the
function $f(t, q^\alpha, p_\alpha)$ and conventional Hamiltonian $H(t, q^\alpha, p_\alpha)$ must be allowed to be explicit functions of
time $t$ as well as of generalized spatial coordinates and momenta $q^\alpha$ and $p_\alpha$. Equation (4.140) becomes in
the conventional case

$$ \frac{df}{dt} = \frac{\partial f}{\partial t} + [f, H] . \tag{4.141} $$


4.21 Infinitesimal canonical transformations 


A canonical transformation generated by $G = q^\mu p'_\mu$ is the identity transformation, since it leaves the
coordinates and momenta unchanged. Consider a canonical transformation with generator infinitesimally shifted
from the identity transformation, with $\epsilon$ an infinitesimal parameter,

$$ G = q^\mu p'_\mu + \epsilon \, g(q, p') . \tag{4.142} $$

The resulting canonical transformation is, from equation (4.119),

$$ q'^\mu = \frac{\partial G}{\partial p'_\mu} = q^\mu + \epsilon \frac{\partial g}{\partial p'_\mu} , \qquad p_\mu = \frac{\partial G}{\partial q^\mu} = p'_\mu + \epsilon \frac{\partial g}{\partial q^\mu} . \tag{4.143} $$

Because $\epsilon$ is infinitesimal, the term $\epsilon \, \partial g / \partial p'_\mu$ can be replaced by $\epsilon \, \partial g / \partial p_\mu$ to linear order, yielding

$$ q'^\mu = q^\mu + \epsilon \frac{\partial g}{\partial p_\mu} , \qquad p'_\mu = p_\mu - \epsilon \frac{\partial g}{\partial q^\mu} . \tag{4.144} $$

Equations (4.144) imply that the changes $\delta p_\mu$ and $\delta q^\mu$ in the coordinates and momenta under an infinitesimal
canonical transformation (4.142) are their Poisson brackets with $g$,

$$ \delta p_\mu = \epsilon \, [p_\mu, g] , \qquad \delta q^\mu = \epsilon \, [q^\mu, g] . \tag{4.145} $$

As a particular example, the evolution of the system under an infinitesimal change $\delta\lambda$ in the parameter $\lambda$
is, in accordance with the evolutionary equation (4.140), generated by a canonical transformation with $g$ in
equation (4.142) set equal to the Hamiltonian $H$,

$$ \delta p_\mu = \delta\lambda \, [p_\mu, H] , \qquad \delta q^\mu = \delta\lambda \, [q^\mu, H] . \tag{4.146} $$


4.22 Constancy of phase-space volume under canonical transformations 


The invariance of the symplectic metric under canonical transformations implies the invariance of phase-space
volume under canonical transformations.

The volume $V$ of a region of $2N$-dimensional phase space is

$$ V \equiv \int dV \equiv \int dz^1 \cdots dz^{2N} = \int dq^1 \cdots dq^N \, dp_1 \cdots dp_N , \tag{4.147} $$

integrated over the region. Under a canonical transformation $z^i \rightarrow z'^i(z)$ of phase-space coordinates, the
phase-space volume element $dV$ transforms by the Jacobian of the transformation, which is the determinant
$|\partial z'^i / \partial z^j|$,

$$ dV' = \left| \frac{\partial z'^i}{\partial z^j} \right| dV . \tag{4.148} $$

But equation (4.131) implies that

$$ |\omega^{ij}| = \left| \frac{\partial z'^i}{\partial z^k} \right| \, |\omega^{kl}| \, \left| \frac{\partial z'^j}{\partial z^l} \right| , \tag{4.149} $$

so the Jacobian must be 1 in absolute magnitude,

$$ \left| \frac{\partial z'^i}{\partial z^j} \right| = \pm 1 . \tag{4.150} $$

If the canonical transformation can be obtained by a continuous transformation from the identity, then the
Jacobian must equal 1. As a particular case, the Jacobian equals 1 for the canonical transformation generated
by evolution, §4.22.1, since evolution is continuous from initial to final conditions.


4.22.1 Constancy of phase-space volume under evolution 


Since evolution is a canonical transformation, §4.17.2 and §4.21, phase-space volume V is preserved under 
evolution of the system. Each phase-space point inside the volume V evolves according to the equations 
of motion. As the system of points evolves, the region distorts, but the magnitude of the volume V of the 
region remains constant. The constancy of phase-space volume as it evolves was proved explicitly in 1871 by 
Boltzmann, who later referred to the result as “Liouville’s theorem” since the proof was based in part on a 
mathematical theorem proved by Liouville (see Nolte, 2010). 
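
The distortion of a region at constant volume can be seen in a small numerical experiment. The sketch below is illustrative and makes several assumptions not in the text: one degree of freedom, a pendulum Hamiltonian $H = p^2/2 - \cos q$, a disc of initial conditions approximated by a polygon of boundary points, and a Runge-Kutta integrator standing in for the exact flow. The enclosed area is computed with the shoelace formula before and after evolution.

```python
import math


def rhs(q, p):
    """Hamilton's equations for the assumed pendulum Hamiltonian H = p^2/2 - cos q."""
    return p, -math.sin(q)


def rk4_step(q, p, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1q, k1p = rhs(q, p)
    k2q, k2p = rhs(q + h / 2 * k1q, p + h / 2 * k1p)
    k3q, k3p = rhs(q + h / 2 * k2q, p + h / 2 * k2p)
    k4q, k4p = rhs(q + h * k3q, p + h * k3p)
    return (q + h / 6 * (k1q + 2 * k2q + 2 * k3q + k4q),
            p + h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p))


def polygon_area(points):
    """Shoelace formula for the area enclosed by an ordered list of (q, p) points."""
    total = 0.0
    for (q0, p0), (q1, p1) in zip(points, points[1:] + points[:1]):
        total += q0 * p1 - q1 * p0
    return abs(total) / 2


# Boundary of a disc of radius 0.5 centred on (q, p) = (1, 0).
n_points, radius = 1000, 0.5
boundary = [(1.0 + radius * math.cos(2 * math.pi * k / n_points),
             radius * math.sin(2 * math.pi * k / n_points)) for k in range(n_points)]

area_initial = polygon_area(boundary)
for _ in range(150):                       # evolve every boundary point to lambda = 3
    boundary = [rk4_step(q, p, 0.02) for q, p in boundary]
area_final = polygon_area(boundary)
print(area_initial, area_final)
```

Because the pendulum's period depends on amplitude, the disc shears into an elongated shape, yet the two printed areas should agree to within the discretization error of the polygon and the integrator.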
4.23 Poisson algebra of integrals of motion


A function $f(z^i)$ of the generalized coordinates and momenta is said to be an integral of motion if it is
constant as the system evolves. In view of equation (4.140), a function $f(z^i)$ is an integral of motion if and
only if its Poisson bracket with the Hamiltonian vanishes,

$$ [f, H] = 0 . \tag{4.151} $$

As a particular example, the antisymmetry of the Poisson bracket implies that the Poisson bracket of the
Hamiltonian with itself is zero,

$$ [H, H] = 0 , \tag{4.152} $$

so the Hamiltonian $H$ is itself a constant of motion. The super-Hamiltonian $H$ is a constant of motion in
general, while the conventional Hamiltonian $H$ is constant provided that it does not depend explicitly on
time $t$.

Suppose that $f(z^i)$ and $g(z^i)$ are both integrals of motion. Then their Poisson bracket with each other is
also an integral of motion,

$$ [[f, g], H] = - [[g, H], f] - [[H, f], g] = 0 , \tag{4.153} $$

the first equality of which expresses the Jacobi identity, and the last equality of which follows because the
Poisson bracket of each of $f$ and $g$ with the Hamiltonian $H$ vanishes. The Poisson bracket of two integrals
of motion $f$ and $g$ may or may not yield a further distinct integral of motion. A set of linearly independent
integrals of motion whose Poisson brackets close forms a Lie algebra called a Poisson algebra.
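
A familiar instance of a Poisson algebra, offered here as an illustration rather than as an example from the text, is the angular momentum of a particle in three spatial dimensions with a rotationally symmetric conventional Hamiltonian. The sketch below assumes an isotropic oscillator Hamiltonian and evaluates the bracket (4.134) by central finite differences; the function names are hypothetical.

```python
def poisson_bracket(f, g, h=1e-4):
    """Return the function [f, g] of equation (4.134), using central differences."""
    def partial(func, z, i):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        return (func(up) - func(down)) / (2 * h)

    def bracket(z):
        n = len(z) // 2   # first n entries are coordinates, last n are momenta
        return sum(partial(f, z, i) * partial(g, z, i + n)
                   - partial(f, z, i + n) * partial(g, z, i) for i in range(n))
    return bracket


# Phase-space point z = (x, y, z, p_x, p_y, p_z).
def L_x(z):
    return z[1] * z[5] - z[2] * z[4]


def L_y(z):
    return z[2] * z[3] - z[0] * z[5]


def L_z(z):
    return z[0] * z[4] - z[1] * z[3]


def H(z):
    """Assumed isotropic oscillator Hamiltonian, (p^2 + r^2)/2."""
    return 0.5 * sum(c * c for c in z)


z0 = [0.3, -1.2, 0.8, 0.5, 0.1, -0.7]
closure_residual = poisson_bracket(L_x, L_y)(z0) - L_z(z0)   # [L_x, L_y] - L_z
conserved = poisson_bracket(L_z, H)(z0)                      # [L_z, H]
nested = poisson_bracket(poisson_bracket(L_x, L_y), H)(z0)   # [[L_x, L_y], H]
print(closure_residual, conserved, nested)
```

All three printed numbers should vanish to within finite-difference rounding: the bracket of two components of angular momentum returns the third, so the three integrals close into a Lie algebra, and the nested bracket vanishes as equation (4.153) requires.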


Concept question 4.6. How many integrals of motion can there be? How many distinct integrals
of motion can there be in a dynamical system described by $N$ coordinates and $N$ momenta? A distinct
integral of motion is one that cannot be expressed as a function of the other integrals of motion (this is more
stringent than the condition that the integrals be linearly independent). Answer. The dynamical motion of
the system is described by a 1-dimensional line in a $2N$-dimensional phase-space manifold consisting of the $N$
coordinates and $N$ momenta. Any constant of motion $f(x^\mu, \pi_\mu)$ defines a $(2N{-}1)$-dimensional submanifold
of the phase-space manifold. A 1-dimensional line can be the intersection of no more than $2N{-}1$ distinct
such submanifolds, so there can be at most $2N{-}1$ distinct constants of motion. In the super-Hamiltonian
formulation, the phase space of a single particle in 4 spacetime dimensions is 8-dimensional, and there are
at most 7 distinct integrals of motion. A particle moving along a straight line in Minkowski space provides
an example of a system with a full set of 7 integrals of motion: 4 integrals constitute the covariant
energy-momentum 4-vector $p_m$, and a further 3 integrals of motion comprise $x^a - v^a t = x^a(0)$ where $v^a \equiv p^a / p^t$ is
the velocity, and $x^a(0)$ is the origin of the line at $t = 0$. In the conventional Hamiltonian formulation, the
phase space of a single particle is 6-dimensional, and there are at most 5 distinct integrals of motion. The
apparent discrepancy in the number of integrals occurs because in the super-Hamiltonian formalism the time
$t$ and time component $\pi_t$ of the generalized momentum are treated as distinct variables whose equations
of motion are determined by Hamilton’s equations, whereas in the conventional Hamiltonian formalism the
arbitrary parameter $\lambda$ is set equal to the time $t$, which is therefore no longer an independent variable, and
the generalized momentum $\pi_t$, which equals minus the conventional Hamiltonian $H$, equation (4.108), is
eliminated as an independent variable by re-expressing it in terms of the spatial coordinates and momenta.


Concept Questions 


1. What evidence do astronomers currently accept as indicating the presence of a black hole in a system?

2. Why can astronomers measure the masses of supermassive black holes only in relatively nearby galaxies?

3. To what extent (with what accuracy) are real black holes in our Universe described by the no-hair
theorem?

4. Does the no-hair theorem apply inside a black hole?

5. Black holes lose their hair on a light-crossing time. How long is a light-crossing time for a typical
stellar-sized or supermassive astronomical black hole?

6. Relativists say that the metric is $g_{\mu\nu}$, but they also say that the metric is $ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu$. How can
both statements be correct?

7. The Schwarzschild geometry is said to describe the geometry of spacetime outside the surface of the
Sun or Earth. But the Schwarzschild geometry supposedly describes non-rotating masses, whereas the
Sun and Earth are rotating. If the Sun or Earth collapsed to a black hole conserving their mass $M$ and
angular momentum $L$, roughly what would the spin $a/M = L/M^2$ of the black hole be relative to the
maximal spin $a/M = 1$ of a Kerr black hole?

8. What happens at the horizon of a black hole?

9. As cold matter becomes denser, it goes through the stages of being solid/liquid like a planet, then
electron degenerate like a white dwarf, then neutron degenerate like a neutron star, then finally it
collapses to a black hole. Why could there not be a denser state of matter, denser than a neutron star,
that brings a star to rest inside its horizon?

10. How can an observer determine whether they are “at rest” in the Schwarzschild geometry?

11. An observer outside the horizon of a black hole never sees anything pass through the horizon, even to
the end of the Universe. Does the black hole then ever actually collapse, if no one ever sees it do so?

12. If nothing can ever get out of a black hole, how does its gravity get out?

13. Why did Einstein believe that black holes could not exist in nature?

14. In what sense is a rotating black hole “stationary” but not “static”?

15. What is a white hole? Do they exist?

16. Could the expanding Universe be a white hole?

17. Could the Universe be the interior of a black hole?
18. You know the Schwarzschild metric for a black hole. What is the corresponding metric for a white hole?

19. What is the best kind of black hole to fall into if you want to avoid being tidally torn apart?

20. Why do astronomers often assume that the inner edge of an accretion disk around a black hole occurs
at the innermost stable orbit?

21. A collapsing star of uniform density has the geometry of a collapsing Friedmann-Lemaître-Robertson-Walker
cosmology. If a spatially flat FLRW cosmology corresponds to a star that starts from zero velocity
at infinity, then to what do open or closed FLRW cosmologies correspond?

22. Your friend falls into a black hole, and you watch her image freeze and redshift at the horizon. A shell
of matter falls on to the black hole, increasing the mass of the black hole. What happens to the image
of your friend? Does it disappear, or does it remain on the horizon?

23. Is the singularity of a Reissner-Nordström black hole gravitationally attractive or repulsive?

24. If you are a charged particle, which dominates near the singularity of the Reissner-Nordström geometry,
the electrical attraction/repulsion or the gravitational attraction/repulsion?

25. Is a white hole gravitationally attractive or repulsive?

26. What happens if you fall into a white hole?

27. Which way does time go in Parallel Universes in the Reissner-Nordström geometry?

28. What does it mean that geodesics inside a black hole can have negative energy?

29. Can geodesics have negative energy outside a black hole? How about inside the ergosphere?

30. Physically, what causes mass inflation?

31. Is mass inflation likely to occur inside real astronomical black holes?

32. What happens at the X point, where the outgoing and ingoing inner horizons of the Reissner-Nordström
geometry intersect?

33. Can a particle like an electron or proton, whose charge far exceeds its mass (in geometric units), be
modelled as a Reissner-Nordström black hole?

34. Does it make sense that a person might be at rest in the Kerr-Newman geometry? How would the
Boyer-Lindquist coordinates of such a person vary along their worldline?

35. In identifying $M$ as the mass and $a$ as the angular momentum per unit mass of the black hole in the
Boyer-Lindquist metric, why is it sufficient to consider the behaviour of the metric at $r \rightarrow \infty$?

36. Does space move faster than light inside the ergosphere?

37. If space moves faster than light inside the ergosphere, why is the outer boundary of the ergosphere not
a horizon?

38. Do closed timelike curves make sense?

39. What does Carter’s fourth integral of motion $Q$ signify physically?

40. What is special about a principal null congruence?

41. Evaluated in the locally inertial frame of a principal null congruence, the spin-0 component of the Weyl
scalar of the Kerr geometry is $C = -M/(r - ia\cos\theta)^3$, which looks like the Weyl scalar $C = -M/r^3$ of the
Schwarzschild geometry but with radius $r$ replaced by the complex radius $r - ia\cos\theta$. Is there something
deep here? Can the Kerr geometry be constructed from the Schwarzschild geometry by complexifying
the radial coordinate $r$?




What’s important? 


1. Astronomical evidence suggests that stellar-sized and supermassive black holes exist ubiquitously in
nature.

2. The no-hair theorem, and when and why it applies.

3. The physical picture of black holes as regions of spacetime where space is falling faster than light.

4. A physical understanding of how the metric of a black hole relates to its physical properties.

5. Penrose (conformal) diagrams. In particular, the Penrose diagrams of the various kinds of vacuum black
hole: Schwarzschild, Reissner-Nordström, Kerr-Newman.

6. What really happens inside black holes. Collapse of a star. Mass inflation instability.
Observational Evidence for Black Holes


It is beyond the intended scope of this book to discuss the extensive and rapidly evolving observational 
evidence for black holes in any detail. However, it is useful to summarize a few facts. 

1. Observational evidence supports the idea that black holes occur ubiquitously in nature. They are not
observed directly, but reveal themselves through their effects on their surroundings. Two kinds of black
hole are observed: stellar-sized black holes in x-ray binary systems, mostly in our own Milky Way galaxy,
and supermassive black holes in Active Galactic Nuclei (AGN) found at the centres of our own and other
galaxies.

2. The primary evidence that astronomers accept as indicating the presence of a black hole is a lot of mass
compacted into a tiny space.

   a. In an x-ray binary system, if the mass of the compact object exceeds $3 \, M_\odot$, the maximum theoretical
   mass of a neutron star, then the object is considered to be a black hole. Many hundreds of x-ray
   binary systems are known in our Milky Way galaxy, but only tens of these have measured masses,
   and in about 20 the measured mass indicates a black hole (McClintock et al., 2011).

   b. Several tens of thousands of AGN have been catalogued, identified either in the radio, optical,
   or x-rays. But only in nearby galaxies can the mass of a supermassive black hole be measured
   directly. This is because it is only in nearby galaxies that the velocities of gas or stars can be
   measured sufficiently close to the nuclear centre to distinguish a regime where the velocity becomes
   constant, so that the mass can be attributed to an unresolved central point as opposed to a continuous
   distribution of stars. The masses of about 40 supermassive black holes have been measured in this
   way (Kormendy and Gebhardt, 2001). The masses range from the $4 \times 10^6 \, M_\odot$ mass of the black hole
   at the centre of the Milky Way (Ghez et al., 2008; Gillessen et al., 2009) to the $(6.6 \pm 0.4) \times 10^9 \, M_\odot$
   mass of the black hole at the centre of the M87 galaxy at the centre of the Virgo cluster at the
   centre of the Local Supercluster of galaxies (Gebhardt et al., 2011; Akiyama et al., 2019).


3. Secondary evidences for the presence of a black hole are:

   a. high luminosity;
   b. non-stellar spectrum, extending from radio to gamma-rays;
   c. rapid variability;
   d. relativistic jets.

Jets in AGN are often one-sided, and a few that are bright enough to be resolved at high angular
resolution show superluminal motion. Both evidences indicate that jets are commonly relativistic, moving
at close to the speed of light. There are a few cases of jets in x-ray binary systems, sometimes called
microquasars.

Figure 5.1 The supermassive black hole in the M87 galaxy imaged by the Event Horizon Telescope (Akiyama et al.,
2019).


4. Stellar-sized black holes are thought to be created in supernovae as the result of the core-collapse of
stars more massive than about $25\,M_\odot$ (this number depends in part on uncertain computer simulations).
Supermassive black holes are probably created initially in the same way, but they then grow by accretion 
of gas funnelled to the centre of the galaxy. The growth rates inferred from AGN luminosities are 
consistent with this picture. 


5. Long gamma-ray bursts (lasting more than about 2 seconds) are associated observationally with
supernovae. It is thought that in such bursts we are seeing the formation of a black hole. As the black
hole gulps down the huge quantity of material needed to make it, it regurgitates a relativistic jet that 
punches through the envelope of the star. If the jet happens to be pointed in our direction, then we see 
it relativistically beamed as a gamma-ray burst. 


6. Astronomical black holes present the only realistic prospect for testing general relativity in the strong
field regime, since such fields cannot be reproduced in the laboratory. At the present time the
observational tests of general relativity from astronomical black holes are at best tentative. One test is the
redshifting of 7 keV iron lines in a small number of AGN, notably MCG-6-30-15, which can be interpreted 
as being emitted by matter falling on to a rotating (Kerr) black hole. 

7. The first direct detection of gravitational waves was with the Laser Interferometer Gravitational-Wave
Observatory (LIGO) on 14 September 2015 (Abbott et al., 2016). The waveform was consistent with
the merger of two black holes of masses 29 and $36\,M_\odot$.

8. Before gravitational waves were detected directly, their existence was inferred from the gradual
speeding up of the orbit of the Hulse-Taylor binary, which consists of two neutron stars, one of which,
PSR1913+16, is a pulsar. The parameters of the orbit have been measured with exquisite precision, and 
the rate of orbital speed-up is in good agreement with the energy loss by quadrupole gravitational wave 
emission predicted by general relativity. 
Ideal Black Holes


6.1 Definition of a black hole 


What is a black hole? Doubtless you have heard the standard definition: It is a region whose gravity is so 
strong that not even light can escape. 

But why can light not escape from a black hole? A standard answer, which John Michell (1784) would 
have found familiar, is that the escape velocity exceeds the speed of light. But that answer brings to mind 
a Newtonian picture of light going up, turning around, and coming back down, that is altogether different 
from what general relativity actually predicts. 


Figure 6.1 The fish upstream can make way against the current, but the fish downstream is swept to the bottom of 
the waterfall (Art by Wildrose Hamilton). This painting appeared on the cover of the June 2008 issue of the American 
Journal of Physics (Hamilton and Lisle, 2008). A similar depiction appeared in Susskind (2003). 
A better definition of a black hole is that it is a


region where space is falling faster than light. 


Inside the horizon, light emitted outwards is carried inward by the faster-than-light inflow of space, like a 
fish trying but failing to swim up a waterfall, Figure 6.1. 

The definition may seem jarring. If space has no substance, how can it fall faster than light? It means that 
inside the horizon any locally inertial frame is compelled to fall to smaller radius as its proper time goes by. 
This fundamental fact is true regardless of the choice of coordinates. 

A similar concept of space moving arises in cosmology. Astronomers observe that the Universe is expanding.
Cosmologists find it convenient to conceptualize the expansion by saying that space itself is expanding.
For example, the picture that space expands makes it more straightforward, both conceptually and
mathematically, to deal with regions of spacetime beyond the horizon, the surface of infinite redshift, of an observer.


6.2 Ideal black hole 


The simplest kind of black hole, an ideal black hole, is one that is stationary, and electrovac outside its 
singularity. Electrovac means that the energy-momentum tensor $T_{\mu\nu}$ is zero except for the contribution
from a stationary electromagnetic field. The most important ideal black holes are those that extend to 
asymptotically flat empty space (Minkowski space) at infinity. There are ideal black hole solutions that do 
not asymptote to flat empty space, but most of these have little relevance to reality. The most important 
ideal black hole solutions that are not flat at infinity are those containing a non-zero cosmological constant. 

The next several chapters deal with ideal black holes in asymptotically flat space. The importance of ideal 
black holes stems from the no-hair theorem, discussed in the next section. The no-hair theorem has the 
consequence that, except during their initial collapse, or during a merger, real astronomical black holes are 
accurately described as ideal outside their horizons. 


6.3 No-hair theorem 


I will state and justify the no-hair theorem, but I will not prove it mathematically, since the proof is technical. 

The no-hair theorem states that a stationary black hole in asymptotically flat space is characterized by 
just three quantities: 

1. Mass M; 

2. Electric charge Q; 

3. Spin, usually parameterized by the angular momentum a per unit mass. 

The mechanism by which a black hole loses its hair is gravitational radiation. When initially formed, 
whether from the collapse of a massive star or from the merger of two black holes, a black hole will form a 
complicated, oscillating region of spacetime. But over the course of several light crossing times, the oscillations 
lose energy by gravitational radiation, and damp out, leaving a stationary black hole. 
Real astronomical black holes are not isolated, and continue to accrete (cosmic microwave background
photons, if nothing else). However, the timescale (a light crossing time) for oscillations to damp out by 
gravitational radiation is usually far shorter than the timescale for accretion, so in practice real black holes 
are extremely well described by no-hair solutions almost all of their lives. 

The physical reason that the no-hair theorem applies is that space is falling faster than light inside the 
horizon. Consequently, unlike a star, no energy can bubble up from below to replace the energy lost by 
gravitational radiation. The loss of energy by gravitational radiation brings the black hole to a state where it 
can no longer radiate gravitational energy. The properties of a no-hair black hole are characterized entirely 
by conserved quantities. 

As a corollary, the no-hair theorem does not apply from the inner horizon of a black hole inward, because
space ceases to fall superluminally inside the inner horizon. 

If there exist other absolutely conserved quantities, such as magnetic charge (magnetic monopoles), or 
various supersymmetric charges in theories where supersymmetry is not broken, then the black hole will also 
be characterized by those quantities. 

Black holes are expected not to conserve quantities such as baryon or lepton number that are thought not 
to be absolutely conserved, even though they appear to be conserved in low energy physics. 

It is legitimate to think of the process of reaching a stationary state as analogous to reaching a condition 
of thermodynamic equilibrium, in which a macroscopic system is described by a small number of parameters 
associated with the conserved quantities of the system. 
Schwarzschild Black Hole


The Schwarzschild geometry was discovered by Karl Schwarzschild in late 1915 at essentially the same 
time that Einstein was arriving at his final version of the General Theory of Relativity. Schwarzschild 
was Director of the Astrophysical Observatory in Potsdam, perhaps the foremost astronomical position in 
Germany. Despite his position, he joined the German army at the outbreak of World War I, and was serving
on the front at the time of his discovery. Sadly, Schwarzschild contracted a rare skin disease on the front. 
Returning to Berlin, he died in May 1916 at the age of 42. 

That the Schwarzschild geometry describes a collapsed object, a black hole, was not understood
by Einstein and his contemporaries. Understanding did not emerge until many decades later, in the
late 1950s. Thorne (1994) gives a delightful popular account of the history.


7.1 Schwarzschild metric 


The Schwarzschild metric was discovered first by Karl Schwarzschild (1916b), and then independently 
by Johannes Droste (1916). In a polar coordinate system $\{t, r, \theta, \phi\}$, and in geometric units $c = G = 1$, the
Schwarzschild metric is

\[
ds^2 = - \left( 1 - \frac{2M}{r} \right) dt^2 + \left( 1 - \frac{2M}{r} \right)^{-1} dr^2 + r^2 do^2 , \tag{7.1}
\]

where $do^2$ (this is the Landau & Lifshitz notation) is the metric of a unit 2-sphere,

\[
do^2 = d\theta^2 + \sin^2\!\theta \, d\phi^2 . \tag{7.2}
\]


With units restored, the time-time component $g_{tt}$ of the Schwarzschild metric is

\[
g_{tt} = - \left( 1 - \frac{2GM}{c^2 r} \right) . \tag{7.3}
\]


The Schwarzschild geometry describes the simplest kind of black hole: a black hole with mass M, but no 
electric charge, and no spin. 
The geometry describes not only a black hole, but also any empty space surrounding a spherically
symmetric mass. Thus the Schwarzschild geometry describes to a good approximation the spacetimes outside
the surfaces of the Sun and the Earth.

Comparison with the spherically symmetric Newtonian metric

\[
ds^2 = - (1 + 2\Phi) \, dt^2 + (1 - 2\Phi) \left( dr^2 + r^2 do^2 \right) \tag{7.4}
\]

with Newtonian potential

\[
\Phi(r) = - \frac{M}{r} \tag{7.5}
\]


establishes that the M in the Schwarzschild metric is to be interpreted as the mass of the black hole 
(Exercise 7.1). 

The Schwarzschild geometry is asymptotically flat, because the metric tends to the Minkowski metric in 
polar coordinates at large radius 


\[
ds^2 \to - dt^2 + dr^2 + r^2 do^2 \quad \text{as } r \to \infty . \tag{7.6}
\]


Exercise 7.1. Schwarzschild metric in isotropic form. The Schwarzschild metric (7.1) does not have 
the same form as the spherically symmetric Newtonian metric (7.4). By a suitable transformation of the 
radial coordinate r, bring the Schwarzschild metric (7.1) to the isotropic form 


\[
ds^2 = - \left( \frac{1 - M/2R}{1 + M/2R} \right)^2 dt^2 + \left( 1 + M/2R \right)^4 \left( dR^2 + R^2 do^2 \right) . \tag{7.7}
\]


What is the relation between R and r? Hence conclude that the identification (7.5) is correct, and therefore 
that M is indeed the mass of the black hole. Is the isotropic form (7.7) of the Schwarzschild metric valid 
inside the horizon? 


7.2 Stationary, static 


The Schwarzschild geometry is stationary. A spacetime is said to be stationary if and only if there exists 
a timelike coordinate t such that the metric is independent of t. In other words, the spacetime possesses 
time translation symmetry: the metric is unchanged by a time translation $t \to t + t_0$ where $t_0$ is some
constant. Evidently the Schwarzschild metric (7.1) is independent of the timelike coordinate t, and is therefore 
stationary, time translation symmetric. 

As will be found below, §7.6, the Schwarzschild time coordinate t is timelike outside the horizon, but 
spacelike inside. Some authors therefore refer to the spacetime inside the horizon of a stationary black hole 
as being homogeneous. However, I think it is less confusing to refer to time translation symmetry, which is 
a single symmetry of the spacetime, by a single name, stationarity, everywhere in the spacetime. 

The Schwarzschild geometry is also static. A spacetime is static if and only if in addition to being 
stationary with respect to a time coordinate $t$, spatial coordinates can be chosen that do not change along
the direction of the tangent vector $\boldsymbol{e}_t$. This requires that the tangent vector $\boldsymbol{e}_t$ be orthogonal to all the spatial
tangent vectors $\boldsymbol{e}_\alpha$

\[
\boldsymbol{e}_t \cdot \boldsymbol{e}_\alpha = g_{t\alpha} = 0 . \tag{7.8}
\]


The Kerr geometry for a rotating black hole is an example of a geometry that is stationary but not static. If 
time $t$ and azimuthal $\phi$ coordinates are coordinates associated with time and azimuthal symmetry, then the
scalar product $\boldsymbol{e}_t \cdot \boldsymbol{e}_\phi$ of their tangent vectors in the Kerr geometry is a non-vanishing scalar, §9.3. Physically,
in a static geometry, a system of static observers, those who are at rest in static spatial coordinates, see each 
other to remain at rest as time passes. In a non-static geometry, no such system of static observers exists. 

The Gullstrand-Painlevé metric for the Schwarzschild geometry, discussed in §7.12, is an example of a
metric that is stationary, since the metric coefficients are independent of the free-fall time $t_{\rm ff}$, but not explicitly
static. Observers at rest with respect to Gullstrand-Painlevé spatial coordinates fall into the black hole, and
do not see each other as remaining at rest as time goes by. The Schwarzschild geometry is nevertheless static
because there exist coordinates, the Schwarzschild coordinates, with respect to which the metric is explicitly
static, $g_{t\alpha} = 0$. The Schwarzschild time coordinate $t$ is thus identified as a special one: it is the unique time
coordinate with respect to which the Schwarzschild geometry is manifestly static.


7.3 Spherically symmetric 


The Schwarzschild geometry is also spherically symmetric. This is evident from the fact that the angular
part $r^2 do^2$ of the metric is the metric of a 2-sphere of radius $r$. This can be seen as follows. Consider the
metric of ordinary flat 3-dimensional Euclidean space in Cartesian coordinates $\{x, y, z\}$:

\[
ds^2 = dx^2 + dy^2 + dz^2 . \tag{7.9}
\]

Convert to polar coordinates $\{r, \theta, \phi\}$, defined so that

\begin{align}
x &= r \sin\theta \cos\phi , \tag{7.10a} \\
y &= r \sin\theta \sin\phi , \tag{7.10b} \\
z &= r \cos\theta . \tag{7.10c}
\end{align}

Substituting equations (7.10) into the Euclidean metric (7.9) gives

\[
ds^2 = dr^2 + r^2 \left( d\theta^2 + \sin^2\!\theta \, d\phi^2 \right) . \tag{7.11}
\]

Restricting to a surface $r = \text{constant}$ of constant radius then gives the metric of a 2-sphere of radius $r$

\[
ds^2 = r^2 \left( d\theta^2 + \sin^2\!\theta \, d\phi^2 \right) \tag{7.12}
\]


as claimed. 
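
The step from the Euclidean metric (7.9) to the polar form (7.11) can be spot-checked numerically. The following is a minimal Python sketch, not part of the original derivation: it maps two neighbouring points from polar to Cartesian coordinates with equations (7.10) and compares the Euclidean distance squared with the polar line-element. The function names, the sample point, and the step size are illustrative assumptions.

```python
import math

def to_cartesian(r, theta, phi):
    """Equations (7.10): polar -> Cartesian coordinates."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def ds2_cartesian(p, dp):
    """Euclidean distance squared (7.9) between the polar points p and p + dp."""
    a = to_cartesian(*p)
    b = to_cartesian(*(pi + dpi for pi, dpi in zip(p, dp)))
    return sum((bi - ai) ** 2 for ai, bi in zip(a, b))

def ds2_polar(p, dp):
    """Polar line-element (7.11), evaluated at the midpoint of the interval."""
    r = p[0] + 0.5 * dp[0]
    theta = p[1] + 0.5 * dp[1]
    dr, dtheta, dphi = dp
    return dr**2 + r**2 * (dtheta**2 + math.sin(theta)**2 * dphi**2)

point = (2.0, 1.0, 0.5)        # arbitrary sample point (r, theta, phi)
step = (1e-5, -2e-5, 3e-5)     # small coordinate displacement
print(ds2_cartesian(point, step), ds2_polar(point, step))
```

The two printed numbers should agree to many significant figures; setting the first component of `step` to zero restricts the interval to the 2-sphere of equation (7.12).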
The radius $r$ in Schwarzschild coordinates is the circumferential radius, defined such that the proper
circumference of the 2-sphere measured by observers at rest in Schwarzschild coordinates is $2\pi r$. This is a
coordinate-invariant definition of the meaning of $r$, which implies that $r$ is a scalar.


7.4 Energy-momentum tensor 


It is straightforward (especially if you use a computer algebraic manipulation program) to follow the cookbook 
summarized in §2.25 to check that the Einstein tensor that follows from the Schwarzschild metric (7.1) is 
zero. Einstein’s equations then imply that the Schwarzschild geometry has zero energy-momentum tensor. 
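
As a rough stand-in for the computer algebra check, the vanishing of the Ricci tensor (and hence of the Einstein tensor) can be verified numerically at a sample point. The sketch below is an illustration under stated assumptions, not the cookbook of §2.25: it uses the standard textbook expressions for the Christoffel symbols and the Ricci tensor, nested central finite differences, and a diagonal metric. The helper names, step sizes, and sample point are all hypothetical choices.

```python
import math

def schwarzschild(x, M=1.0):
    """Diagonal components [g_tt, g_rr, g_thth, g_phph] of metric (7.1) at x = (t, r, theta, phi)."""
    _, r, theta, _ = x
    f = 1.0 - 2.0 * M / r
    return [-f, 1.0 / f, r * r, (r * math.sin(theta)) ** 2]

def shifted(x, k, h):
    """Copy of the coordinate tuple x with coordinate k displaced by h."""
    y = list(x)
    y[k] += h
    return y

def christoffel(metric, x, h=1e-5):
    """G[a][b][c] = Gamma^a_{bc} for a diagonal metric, using central differences."""
    g = metric(x)
    # dg[k][a] = partial_k g_aa
    dg = [[(p - m) / (2.0 * h)
           for p, m in zip(metric(shifted(x, k, h)), metric(shifted(x, k, -h)))]
          for k in range(4)]
    G = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for a in range(4):
        for b in range(4):
            for c in range(4):
                s = 0.0
                if a == c:
                    s += dg[b][a]
                if a == b:
                    s += dg[c][a]
                if b == c:
                    s -= dg[a][b]
                G[a][b][c] = 0.5 * s / g[a]
    return G

def ricci(metric, x, h=1e-3):
    """R_{bd} = d_a G^a_{bd} - d_d G^a_{ba} + G^a_{ae} G^e_{bd} - G^a_{de} G^e_{ba}."""
    G = christoffel(metric, x)
    dG = []  # dG[k][a][b][c] = partial_k Gamma^a_{bc}
    for k in range(4):
        Gp = christoffel(metric, shifted(x, k, h))
        Gm = christoffel(metric, shifted(x, k, -h))
        dG.append([[[(Gp[a][b][c] - Gm[a][b][c]) / (2.0 * h) for c in range(4)]
                    for b in range(4)] for a in range(4)])
    R = [[0.0] * 4 for _ in range(4)]
    for b in range(4):
        for d in range(4):
            s = 0.0
            for a in range(4):
                s += dG[a][a][b][d] - dG[d][a][b][a]
                for e in range(4):
                    s += G[a][a][e] * G[e][b][d] - G[a][d][e] * G[e][b][a]
            R[b][d] = s
    return R

def max_abs(R):
    return max(abs(v) for row in R for v in row)

sample = (0.0, 4.0, 1.0, 0.0)   # a point outside the horizon, r = 4M with M = 1
print(max_abs(ricci(schwarzschild, sample)))
```

The printed value should be zero to within finite-difference error. Replacing one metric coefficient by something else (for example setting g_rr = 1) gives a Ricci tensor that is clearly non-zero, which is a useful sanity check that the routine is not trivially returning zero.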

If the Schwarzschild geometry is empty, should not the spacetime be flat, the Minkowski spacetime? There 
are two answers to this question. Firstly, the Schwarzschild geometry describes the geometry of empty space 
around a static spherically symmetric mass, such as the Sun or Earth. The geometry inside the spherically 
symmetric mass is described by some other metric, which connects continuously and differentiably (but not 
necessarily doubly differentiably, if the spherical object has an abrupt surface) to the Schwarzschild metric. 

The second answer is that the Schwarzschild geometry describes the geometry of a collapsed object, a 
black hole, which is singular at zero radius, r = 0, but is otherwise empty of energy-momentum. 


Exercise 7.2. Derivation of the Schwarzschild metric. There are neater and more insightful ways 
to derive it, but the Schwarzschild metric can be derived by turning a mathematical crank without the 
need for deeper conceptual understanding. Start with the assumption that the metric of a static, spherically 
symmetric object can be written in polar coordinates $\{t, r, \theta, \phi\}$ as

\[
ds^2 = - A(r) \, dt^2 + B(r) \, dr^2 + r^2 \left( d\theta^2 + \sin^2\!\theta \, d\phi^2 \right) , \tag{7.13}
\]

where $A(r)$ and $B(r)$ are some to-be-determined functions of radius $r$. Write down the components of the
metric $g_{\mu\nu}$, and deduce its inverse $g^{\mu\nu}$. Compute all the components of the coordinate connections $\Gamma_{\lambda\mu\nu}$,
equation (2.63). Of the 40 distinct connections, 9 should be non-vanishing. Compute all the components of
the Riemann tensor $R_{\kappa\lambda\mu\nu}$, equation (2.112). There should be 6 distinct non-zero components. Compute all
the components of the Ricci tensor $R_{\kappa\mu}$, equation (2.121). There should be 4 distinct non-zero components.
Now impose that the spacetime be empty, that is, the energy-momentum tensor is zero. Einstein’s equations
then demand that the Ricci tensor vanishes identically. Use the requirement that $g^{tt} R_{tt} - g^{rr} R_{rr} = 0$ to
show that $AB = 1$. Then use $g^{\theta\theta} R_{\theta\theta} = 0$ to derive the functional form of $A$. Finally, use the Newtonian limit
$- g_{tt} \approx 1 + 2\Phi$ with $\Phi = - GM/r$, valid at large radius $r$, to fix $A$.


7.5 Birkhoff’s theorem 


Birkhoff’s theorem, whose proof is deferred to Chapter 20, Exercise 20.2, states that the geometry of 
empty space surrounding a spherically symmetric matter distribution is the Schwarzschild geometry. That 
is, if the metric is of the form

\[
ds^2 = A(t,r) \, dt^2 + B(t,r) \, dt \, dr + C(t,r) \, dr^2 + D(t,r) \, do^2 , \tag{7.14}
\]

where the metric coefficients $A$, $B$, $C$, and $D$ are allowed to be arbitrary functions of $t$ and $r$, and if the
energy-momentum tensor vanishes, $T_{\mu\nu} = 0$, outside some value of the circumferential radius $r'$ defined by
$r'^2 = D$, then the geometry is necessarily Schwarzschild outside that radius.

This means that if a mass undergoes spherically symmetric pulsations, then those pulsations do not affect 
the geometry of the surrounding spacetime. This reflects the fact that there are no spherically symmetric 
gravitational waves. 


7.6 Horizon 


The horizon of the Schwarzschild geometry lies at the Schwarzschild radius $r = r_s$

\[
r_s = \frac{2GM}{c^2} , \tag{7.15}
\]

where units of $c$ and $G$ have been momentarily restored. Where does this come from? The Schwarzschild
metric shows that the scalar spacetime distance squared $ds^2$ along an interval at rest in Schwarzschild
coordinates, $dr = d\theta = d\phi = 0$, is timelike, lightlike, or spacelike depending on whether the radius is greater
than, equal to, or less than the Schwarzschild radius $r_s$:

\[
ds^2 = - \left( 1 - \frac{r_s}{r} \right) dt^2 \
\left\{
\begin{array}{ll}
< 0 & \text{if } r > r_s , \\
= 0 & \text{if } r = r_s , \\
> 0 & \text{if } r < r_s .
\end{array}
\right. \tag{7.16}
\]

Since the worldline of a massive observer must be timelike, it follows that a massive observer can remain at
rest only outside the horizon, $r > r_s$. An object at rest at the horizon, $r = r_s$, follows a null geodesic, which
is to say it is a possible worldline of a massless particle, a photon. Inside the horizon, $r < r_s$, neither massive
nor massless objects can remain at rest. To remain at rest, a particle inside the horizon would have to go
faster than light.
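
To get a feel for the numbers, equation (7.15) can be evaluated for familiar masses. The short Python sketch below is illustrative only: the SI values of $G$, $c$, and the masses of the Sun and Earth are approximate reference values supplied here as assumptions, not quantities quoted in the text, and the function names are hypothetical. The second function restates the sign test of equation (7.16).

```python
# Approximate SI constants (assumed reference values, not taken from the text)
G = 6.674e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299792458.0        # speed of light, m/s
M_SUN = 1.989e30       # mass of the Sun, kg
M_EARTH = 5.972e24     # mass of the Earth, kg

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius r_s = 2GM/c^2 of equation (7.15), in metres."""
    return 2.0 * G * mass_kg / C**2

def rest_interval_character(r, r_s):
    """Character of an interval at rest (dr = dtheta = dphi = 0), following equation (7.16)."""
    if r > r_s:
        return "timelike"
    if r == r_s:
        return "lightlike"
    return "spacelike"

for name, mass in (("Sun", M_SUN), ("Earth", M_EARTH)):
    print(name, schwarzschild_radius(mass), "m")
```

With these inputs the Schwarzschild radius of the Sun comes out at roughly 3 km and that of the Earth at just under a centimetre, far inside the actual surfaces of both bodies, which is why the exterior Schwarzschild geometry applies to them without any horizon.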

A full treatment of what is going on requires solving the geodesic equation in the Schwarzschild geometry, 
but the results may be anticipated already at this point. In effect, space is falling into the black hole. Outside 
the horizon, space is falling less than the speed of light; at the horizon space is falling at the speed of light; 
and inside the horizon, space is falling faster than light, carrying everything with it. This is why light cannot 
escape from a black hole: inside the horizon, space falls inward faster than light, carrying light inward even if 
that light is pointed radially outward. The statement that space is falling superluminally inside the horizon 
of a black hole is a coordinate-invariant statement: massive or massless particles are carried inward whatever 
their state of motion and whatever the coordinate system. 

Whereas an interval of coordinate time $t$ switches from timelike outside the horizon to spacelike inside the
horizon, an interval of coordinate radius $r$ does the opposite: it switches from spacelike to timelike:

\[
ds^2 = \left( 1 - \frac{r_s}{r} \right)^{-1} dr^2 \
\left\{
\begin{array}{ll}
> 0 & \text{if } r > r_s , \\
= \infty & \text{if } r = r_s , \\
< 0 & \text{if } r < r_s .
\end{array}
\right. \tag{7.17}
\]


It appears then that the Schwarzschild time and radial coordinates swap roles inside the horizon. Inside the 
horizon, the radial coordinate becomes timelike, meaning that it becomes a possible worldline of a massive 
observer. That is, a trajectory at fixed t and decreasing r is a possible worldline. Again this reflects the fact 
that space is falling faster than light inside the horizon. A person inside the horizon is inevitably compelled, 
as their proper time goes by, to move to smaller radial coordinate r. 


Concept question 7.3. Going forwards or backwards in time inside the horizon. Inside the horizon,
can a person go forwards or backwards in Schwarzschild time $t$? What does that mean?


7.7 Proper time 


The proper time experienced by an observer at rest in Schwarzschild coordinates, $dr = d\theta = d\phi = 0$, is

\[
d\tau = \sqrt{- ds^2} = \left( 1 - \frac{r_s}{r} \right)^{1/2} dt . \tag{7.18}
\]

For an observer at rest at infinity, $r \to \infty$, the proper time is the same as the coordinate time,

\[
d\tau \to dt \quad \text{as } r \to \infty . \tag{7.19}
\]


Among other things, this implies that the Schwarzschild time coordinate t is a scalar: not only is it the 
unique coordinate with respect to which the metric is manifestly static, but it coincides with the proper time 
of observers at rest at infinity. This coordinate-invariant definition of Schwarzschild time t implies that it is 
a scalar. 

At finite radii outside the horizon, $r > r_s$, the proper time $d\tau$ is less than the Schwarzschild time $dt$, so
the clocks of observers at rest run slower at smaller than at larger radii.

At the horizon, $r = r_s$, the proper time $d\tau$ of an observer at rest goes to zero,

\[
d\tau \to 0 \quad \text{as } r \to r_s . \tag{7.20}
\]


This reflects the fact that an object at rest at the horizon is following a null geodesic, and as such experiences 
zero proper time. 
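
The behaviour described by equations (7.18) to (7.20) can be tabulated directly. The following Python sketch is an illustration, with hypothetical function names: it expresses the radius in units of $r_s$ and computes the proper time accumulated by a static observer during one hour of Schwarzschild time.

```python
import math

def clock_rate(r_over_rs):
    """d tau / dt of equation (7.18) for an observer at rest at r = r_over_rs * r_s."""
    if r_over_rs < 1.0:
        raise ValueError("no observer can remain at rest inside the horizon")
    return math.sqrt(1.0 - 1.0 / r_over_rs)

def proper_seconds_per_coordinate_hour(r_over_rs):
    """Proper time, in seconds, elapsed on a static clock during 3600 s of Schwarzschild time."""
    return 3600.0 * clock_rate(r_over_rs)

for x in (1.0, 1.01, 1.5, 2.0, 10.0, 1000.0):
    print(x, clock_rate(x), proper_seconds_per_coordinate_hour(x))
```

The rate rises monotonically from zero at the horizon towards one at large radius, in line with the statement that clocks at rest run slower at smaller radii.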
7.8 Redshift


An observer at rest at infinity looking through a telescope at an emitter at rest at radius r sees the emitter 
redshifted by a factor 


\[
1 + z \equiv \frac{\lambda_{\rm obs}}{\lambda_{\rm em}} = \frac{\nu_{\rm em}}{\nu_{\rm obs}} = \frac{\tau_{\rm obs}}{\tau_{\rm em}} = \left( 1 - \frac{r_s}{r} \right)^{-1/2} . \tag{7.21}
\]

This is an example of the universally valid statement that photons are good clocks: the redshift factor is given
by the rate at which the emitter’s clock appears to tick relative to the observer’s own clock. Equation (7.21)
is an example of the general formula (2.101) for the redshift between two comoving (= rest) observers in a
stationary spacetime.

It should be emphasized that the redshift factor (7.21) is valid only for an observer and an emitter at rest 
in the Schwarzschild geometry. If the observer and emitter are not at rest, then additional special relativistic 
factors will fold into the redshift. 

The redshift goes to infinity for an emitter at the horizon

\[
1 + z \to \infty \quad \text{as } r \to r_s . \tag{7.22}
\]


Here the redshift tends to infinity regardless of the motion of the observer or emitter. An observer watching 
an emitter fall through the horizon will see the emitter appear to freeze at the horizon, becoming ever slower 
and more redshifted. Physically, photons emitted vertically upward at the horizon by an infaller remain at 
the horizon for ever, taking an infinite time to get out to the outside observer. 
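
The divergence of the redshift factor (7.21) near the horizon is easy to see numerically. The sketch below is illustrative and applies only to the static emitter and observer at infinity assumed in equation (7.21); the function names and the 500 nm emitted wavelength are arbitrary assumptions.

```python
import math

def redshift_factor(r_over_rs):
    """1 + z of equation (7.21): static emitter at r = r_over_rs * r_s, static observer at infinity."""
    if r_over_rs <= 1.0:
        raise ValueError("a static emitter must sit outside the horizon")
    return 1.0 / math.sqrt(1.0 - 1.0 / r_over_rs)

def observed_wavelength(emitted_wavelength, r_over_rs):
    """Wavelength seen at infinity, lambda_obs = (1 + z) * lambda_em."""
    return redshift_factor(r_over_rs) * emitted_wavelength

# Move the emitter ever closer to the horizon: r/r_s = 1 + 10^-k
for k in range(0, 9, 2):
    x = 1.0 + 10.0 ** (-k)
    print(x, redshift_factor(x), observed_wavelength(500.0, x))
```

Each hundredfold reduction in the emitter's fractional distance from the horizon multiplies the redshift factor by about ten, reflecting the inverse square root in equation (7.21).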


7.9 “Schwarzschild singularity” 


The apparent singularity in the Schwarzschild metric at the horizon $r_s$ is not a real singularity, because it
can be removed by a change of coordinates, such as to Gullstrand-Painlevé coordinates, equation (7.27).
Einstein, and other influential physicists such as Eddington, failed to appreciate this. Einstein thought that
the “Schwarzschild singularity” at $r = r_s$ marked the physical boundary of the Schwarzschild spacetime.
After all, an outside observer watching stuff fall in never sees anything beyond that boundary. 

Schwarzschild’s choice of coordinates was certainly a natural one. It was natural to search for static 
solutions, and his time coordinate t is the only one with respect to which the metric is manifestly static. 
The problem is that physically there can be no static observers inside the horizon: they must necessarily fall 
inward as time passes. The fact that Schwarzschild’s coordinate system shows an apparent singularity at the 
horizon reflects the fact that the assumption of a static spacetime necessarily breaks down at the horizon, 
where space is falling at the speed of light. 

Does stuff “actually” fall in, even though no outside observer ever sees it happen? The answer is yes: when 
a black hole forms, it does actually collapse, and when an observer falls through the horizon, they really do 
fall through the horizon. The reason that an outside observer sees everything freeze at the horizon is simply 
a light travel time effect: it takes an infinite time for light to lift off the horizon and make it to the outside 
world. 
7.10 Weyl tensor


For Schwarzschild, the Einstein tensor vanishes identically (because the spacetime is by assumption empty of 
energy-momentum). The only part of the Riemann curvature tensor that does not vanish is the Weyl tensor. 
The non-vanishing Weyl tensor says that gravitational tidal forces are present, even though the spacetime 
is empty of energy-momentum. Non-vanishing gravitational tidal forces are the signature that spacetime is 
curved. 

The covariant (all indices down) components $C_{\kappa\lambda\mu\nu}$ of the coordinate-frame Weyl tensor of the
Schwarzschild geometry, computed from equation (3.1), appear at first sight to be a mess (go ahead, compute them).
However, the mess is an artefact of looking at the tensor through the distorting lens of the coordinate basis
vectors $\boldsymbol{e}_\mu$, which are not orthonormal. After tetrads, Chapter 11, it will be found that the 10 components
of the Weyl tensor, the tidal part of the Riemann tensor, can be decomposed in any locally inertial frame
into 5 complex components of spin 0, $\pm 1$, and $\pm 2$. In a locally inertial frame whose radial direction coincides
with the radial direction of the Schwarzschild metric, all components of the Weyl tensor of the Schwarzschild
geometry vanish except the real spin-0 component. Spin 0 means that the Weyl tensor is unchanged under
a spatial rotation about the radial direction (and it is also unchanged by a Lorentz boost in the radial
direction). This spin-0 component is a coordinate-invariant scalar, the Weyl scalar $C$. The fact that the Weyl
tensor of the Schwarzschild geometry has only a single independent non-vanishing component is plausible
from the fact that the non-zero components of the coordinate-frame Weyl tensor written with two indices
up and two indices down are (no implicit summation over repeated indices)

\[
- \tfrac{1}{2} C^{tr}{}_{tr} = - \tfrac{1}{2} C^{\theta\phi}{}_{\theta\phi} = C^{t\theta}{}_{t\theta} = C^{t\phi}{}_{t\phi} = C^{r\theta}{}_{r\theta} = C^{r\phi}{}_{r\phi} = C , \tag{7.23}
\]

where $C$ is the Weyl scalar,

\[
C = - \frac{M}{r^3} . \tag{7.24}
\]


The trick of writing the 4-index Weyl tensor with 2 indices up and 2 indices down, in order to reveal a simple 
pattern, works in a simple spacetime like Schwarzschild, but fails in more complicated spacetimes. 


7.11 Singularity 
The Weyl scalar, equation (7.24), goes to infinity at zero radius,

\[
|C| \to \infty \quad \text{as } r \to 0 . \tag{7.25}
\]


The diverging Weyl tensor implies that the tidal force diverges at zero radius, signalling that there is a 
genuine singularity at zero radius in the Schwarzschild geometry. 

Figure 7.1 (Left) The light (yellow) shaded region shows the region visible to an infaller (blue) who falls radially to 
the singularity of a Schwarzschild black hole; the dark (grey) shaded region shows the region that remains invisible 
to the infaller. The invisible region has the shape of a cardioid, equation (7.62). If another infaller (purple) falls along 
a different radial direction, the two infallers not only fail to meet at the singularity, they lose causal contact with 
each other already some distance from the singularity. Since the two infallers fall to two causally disconnected places, 
the singularity cannot be a point. (Right) Same, showing the shortest causal path (red) joining the two infallers 
asymptotically near the singularity. The shortest causal path is a pair of light rays that start at the starred point, 
move in opposite azimuthal directions, and reach the infallers asymptotically near the singularity. The shortest causal 
path remains non-zero even though the spatial distance between the infallers tends to zero. Compare to Figure 23.2 
for a Kerr black hole. 


Concept question 7.4. Is the singularity of a Schwarzschild black hole a point? Is the singularity 
at the centre of the Schwarzschild geometry a point? Answer. No. Familiar experience in 3-dimensional 
space would suggest the answer is yes, but that conception is misleading. In the first place, general relativity 
fails at singularities: the locally inertial description of spacetime fails, and general relativity cannot continue 
worldlines of infallers beyond a singularity. Therefore singularities are not part of the spacetime described by 
general relativity. Presumably some other physical theory takes over at singularities, but what that theory 
is remains equivocal at the present time. In the second place, infallers who fall into a Schwarzschild black 
hole at different angular positions do not approach each other as they approach the singularity. Rather, 
the diverging tidal force near the singularity funnels each infaller along radially converging lines, effectively 
keeping the infallers isolated from each other. Moreover, the future lightcones of infallers who fall in at the 
same time t but at different angular positions cease to intersect, once they are close enough to the singularity. 
Thus the infallers not only fail to touch each other, they cease even to be able to communicate with each other 
as they approach the singularity, as illustrated in Figure 7.1. The reader may object that the Schwarzschild
metric shows that the proper angular distance between two observers separated by angle $d\phi$ is $r \, d\phi$, which
goes to zero at the singularity $r \to 0$. This objection fails because infallers approaching the singularity cease
to be able to measure angular distances, since angularly separated points cease to be causally accessible to
the infaller. The region accessible to an infaller is cusp-like near the singularity. See Exercises 7.10 and 7.11
for a more quantitative treatment of this problem.


Concept question 7.5. Separation between infallers who fall in at different times. Consider two 
infallers who free-fall radially into the black hole at the same angular position, but at different times t. What 
is the proper spatial radial separation between the two observers at the instants they hit the singularity, at 
$r \to 0$? Answer. Infinity. At the same angular position, $d\theta = d\phi = 0$, the proper radial separation is

\[
dl = \sqrt{ds^2} = \left( \frac{r_s}{r} - 1 \right)^{1/2} dt \to \infty \quad \text{as } r \to 0 . \tag{7.26}
\]


7.12 Gullstrand-Painlevé metric 


An alternative metric for the Schwarzschild geometry was discovered independently by Allvar Gullstrand and
Paul Painlevé in 1921 (Gullstrand, 1922; Painlevé, 1921). (Gullstrand has priority because his paper, though
published in 1922, was submitted in May 1921, whereas Painlevé’s paper was a write-up of a presentation
to L’Académie des Sciences in Paris in October 1921.) After tetrads, it will become clear that the standard
way in which metrics are written encodes not only a metric but also a tetrad. The Gullstrand-Painlevé
line-element (7.27) encodes a tetrad that represents locally inertial frames free-falling radially into the black hole
at the Newtonian escape velocity, Figure 7.2, although at the time no one, including Einstein, Gullstrand,
and Painlevé, understood this. Unlike Schwarzschild coordinates, there is no singularity at the horizon
in Gullstrand-Painlevé coordinates. It is striking that the mathematics was known long before physical
understanding emerged.

Figure 7.2 The Gullstrand-Painlevé metric for the Schwarzschild geometry encodes locally inertial frames (tetrads)
that free-fall radially into the black hole at the Newtonian escape velocity $\beta$, equation (7.28). The infall velocity is
less than the speed of light outside the horizon, equal to the speed of light at the horizon, and faster than light inside
the horizon. The infall velocity tends to infinity at the central singularity.

The Gullstrand-Painlevé metric is 


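$$ ds^2 = - dt_{\rm ff}^2 + ( dr - \beta\, dt_{\rm ff} )^2 + r^2 do^2 . \tag{7.27} $$

Here β is the Newtonian escape velocity (with a minus sign because space is falling inward),

$$ \beta = - \left( \frac{2M}{r} \right)^{1/2} , \tag{7.28} $$

and t_ff is the proper time experienced by an object that free falls radially inward from zero velocity at infinity.
The free-fall time t_ff is related to the Schwarzschild time coordinate t by

$$ dt_{\rm ff} = dt - \frac{\beta}{1 - \beta^2}\, dr , \tag{7.29} $$

which integrates to

$$ t_{\rm ff} = t + r_s \left( 2 \sqrt{r/r_s} + \ln \left| \frac{\sqrt{r/r_s} - 1}{\sqrt{r/r_s} + 1} \right| \right) . \tag{7.30} $$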


The time axis $\boldsymbol{e}_{t_{\rm ff}}$ in Gullstrand-Painlevé coordinates is not orthogonal to the radial axis $\boldsymbol{e}_r$, but rather is
tilted along the radial axis, $\boldsymbol{e}_{t_{\rm ff}} \cdot \boldsymbol{e}_r = g_{t_{\rm ff} r} = - \beta$.
The proper time of a person at rest in Gullstrand-Painlevé coordinates, dr = dθ = dφ = 0, is

$$ d\tau = dt_{\rm ff} \sqrt{1 - \beta^2} . \tag{7.31} $$

The horizon occurs where this proper time vanishes, which happens when the infall velocity β is the speed
of light,

$$ |\beta| = 1 . \tag{7.32} $$

According to equation (7.28), this happens at r = r_s = 2M, which is the Schwarzschild radius, as it should be.
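The relations of this section are easy to check numerically. The following is a minimal sketch, not part of the original text, assuming units G = c = 1 with r_s = 2M set to 1 by default; the function names are illustrative. It evaluates the infall velocity (7.28), the time offset (7.30), and the rest-frame time dilation (7.31), and compares a finite-difference derivative of (7.30) with the differential relation (7.29).

```python
import math

def beta(r, rs=1.0):
    """Infall velocity (7.28): beta = -(rs/r)**0.5, with rs = 2M."""
    return -math.sqrt(rs / r)

def tff_minus_t(r, rs=1.0):
    """Offset t_ff - t between free-fall and Schwarzschild time, equation (7.30)."""
    x = math.sqrt(r / rs)
    return rs * (2.0 * x + math.log(abs((x - 1.0) / (x + 1.0))))

def rest_time_dilation(r, rs=1.0):
    """d(tau)/d(t_ff) for an observer at rest, equation (7.31); real only for r >= rs."""
    return math.sqrt(1.0 - beta(r, rs) ** 2)

def check_729(r, rs=1.0, h=1e-6):
    """Return (finite-difference d(t_ff - t)/dr, -beta/(1 - beta^2)) for comparison with (7.29)."""
    numeric = (tff_minus_t(r + h, rs) - tff_minus_t(r - h, rs)) / (2.0 * h)
    b = beta(r, rs)
    return numeric, -b / (1.0 - b * b)
```

Calling `check_729(3.0)` (outside the horizon) or `check_729(0.5)` (inside) should return two nearly equal numbers, and `rest_time_dilation(1.0)` vanishes at the horizon, in line with (7.32). The functions diverge at r = r_s itself, where Schwarzschild time is singular.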


Exercise 7.6. Geodesics in the Schwarzschild geometry. The Schwarzschild metric is 


$$ ds^2 = - \Delta(r)\, dt^2 + \frac{1}{\Delta(r)}\, dr^2 + r^2 ( d\theta^2 + \sin^2\!\theta\, d\phi^2 ) , \tag{7.33} $$

where Δ(r) is the horizon function

$$ \Delta(r) = 1 - \frac{2M}{r} . \tag{7.34} $$

1. Constants of motion. Argue that, without loss of generality, the trajectory of a freely falling particle
may be taken to lie in the equatorial plane, θ = π/2. Argue that, for a massive particle, conservation of
energy per unit rest mass E, angular momentum per unit rest mass L, and rest mass per unit rest mass
implies that the 4-velocity u^μ = dx^μ/dτ satisfies

$$ u_t = - E , \tag{7.35a} $$
$$ u_\phi = L , \tag{7.35b} $$
$$ u_\mu u^\mu = - 1 . \tag{7.35c} $$

2. Effective potential. Show that the radial component u^r of the 4-velocity satisfies

$$ u^r = \pm ( E^2 - U )^{1/2} , \tag{7.36} $$

where U is the effective potential

$$ U = \left( 1 + \frac{L^2}{r^2} \right) \Delta . \tag{7.37} $$

3. Proper time in radial free-fall. What is the proper time τ for an observer to free-fall from radius
r to the singularity at zero radius, for the particular case of an observer who falls radially from rest
at infinity? [Hint: What are the energy E and angular momentum L for an observer who falls radially
starting from rest at infinity?]

4. Proper time in radial free-fall: numbers. Evaluate the proper time, in seconds, to fall from the
horizon to the singularity in the case of a black hole with the mass 4 × 10⁶ M_⊙ of the black hole at the
centre of our Galaxy, the Milky Way.

5. Circular orbits. Circular orbits occur where the effective potential U is an extremum. Find the radii
at which this occurs, as a function of angular momentum L. Solutions exist only if the absolute value
|L| of the angular momentum exceeds a certain critical value L_c. What is this critical value L_c?

6. Graph. Graph the effective potential U for values of L (i) less than, (ii) equal to, (iii) greater than the
critical value L_c. Describe physically, in words, what the possible orbital trajectories are for the various
cases. [Hint: For cases (i) and (iii), values near the critical value L_c show the distinction most clearly.]

7. Range of orbits. Identify the ranges of radii over which circular orbits are: (i) stable, (ii) unstable, (iii)
non-existent. [Hint: Stability depends on whether the extremum of the effective potential is a minimum
or a maximum. Which is which? You will find it helps to consider U as a function of 1/r rather than r.]

8. Angular momentum and energy in circular orbit. Show that the angular momentum per unit
mass for a circular orbit at radius r satisfies

$$ |L| = \frac{r}{( r/M - 3 )^{1/2}} , \tag{7.38} $$

and hence show also that the energy per unit mass in the circular orbit is

$$ E = \frac{r - 2M}{[ r ( r - 3M ) ]^{1/2}} . \tag{7.39} $$

9. Drop in orbit. There is a certain circular orbit that has the same energy as a massive particle at rest
at infinity. This is useful for starship captains to know, because it is possible to drop into this orbit
using only a small amount of energy. What is the radius of the orbit? Is it stable or unstable?

10. Orbital period. Show that the orbital period t, as measured by an observer at rest at infinity, of a
particle in circular orbit at radius r is given by Kepler’s 3rd law (remarkably, Kepler’s 3rd law remains
true even in the fully general relativistic case, as long as t is taken to be the time measured at infinity),

$$ \frac{G M t^2}{( 2\pi )^2} = r^3 . \tag{7.40} $$

[Hint: Argue that the azimuthal angle φ evolves according to dφ/dt = u^φ/u^t = LΔ/(E r²).]
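As a numerical cross-check on the formulae quoted in this exercise, the sketch below (an illustration, not the author's solution) codes the effective potential (7.37), locates its extrema, and evaluates (7.38) and (7.39). It assumes units G = c = 1; the function names are hypothetical, and the value of G M_⊙/c³ is an assumed constant not given in the text.

```python
import math

def U(r, L, M=1.0):
    """Effective potential (7.37): (1 + L^2/r^2) * Delta, with Delta = 1 - 2M/r."""
    return (1.0 + L * L / (r * r)) * (1.0 - 2.0 * M / r)

def circular_orbit_radii(L, M=1.0):
    """Radii where dU/dr = 0, the roots of M r^2 - L^2 r + 3 M L^2 = 0.
    Returns (inner, outer), or () when L^2 < 12 M^2 and no circular orbit exists."""
    disc = L ** 4 - 12.0 * M * M * L * L
    if disc < 0.0:
        return ()
    root = math.sqrt(disc)
    return ((L * L - root) / (2.0 * M), (L * L + root) / (2.0 * M))

def circular_L_E(r, M=1.0):
    """Angular momentum (7.38) and energy (7.39) per unit mass; requires r > 3M."""
    L = r / math.sqrt(r / M - 3.0)
    E = (r - 2.0 * M) / math.sqrt(r * (r - 3.0 * M))
    return L, E

def orbital_angular_velocity(r, M=1.0):
    """dphi/dt = L * Delta / (E * r^2) on a circular orbit, from the hint to part 10."""
    L, E = circular_L_E(r, M)
    return L * (1.0 - 2.0 * M / r) / (E * r * r)

def fall_time_horizon_to_singularity(mass_in_solar_masses):
    """Proper time in seconds from r = r_s to r = 0 for radial free fall from rest at infinity.
    With E = 1 and L = 0, dr/dtau = -(2M/r)**0.5, which integrates to tau = (4/3) G M / c^3."""
    GM_SUN_OVER_C3 = 4.925490947e-6  # seconds; assumed value of G*M_sun/c^3
    return (4.0 / 3.0) * GM_SUN_OVER_C3 * mass_in_solar_masses
```

For example, `circular_orbit_radii(4.0)` gives the two extrema for L = 4M, and `orbital_angular_velocity(r) ** 2 * r ** 3` should equal M for any r > 3M, which is Kepler's 3rd law (7.40) in units G = 1.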


Exercise 7.7. Null geodesics in the Schwarzschild geometry. The orbit equations (7.35) would appear 
to break down for photons, which have zero mass, hence infinite energy per unit mass E, and infinite angular
momentum per unit mass L. Another way of looking at this is that photons follow null geodesics, dτ = 0, so
that τ, which does not change, is not a very useful time coordinate for expressing the equations of motion
of photons. The difficulty is cured by introducing an affine parameter λ, equation (2.93), which functions as a
good scalar coordinate along null geodesics.

1. Constants of motion. For a massless particle, the 4-velocity v^μ = dx^μ/dλ, normalized to unit energy
at infinity, satisfies

$$ v_t = - 1 , \tag{7.41a} $$
$$ v_\phi = J , \tag{7.41b} $$
$$ v_\mu v^\mu = 0 , \tag{7.41c} $$

where J = L/E is the photon’s angular momentum per unit energy.

2. Effective potential. Show that the radial component v^r of the photon 4-velocity satisfies

$$ v^r = \pm ( 1 - V )^{1/2} , \tag{7.42} $$

where V is the effective potential

$$ V = \frac{J^2 \Delta}{r^2} . \tag{7.43} $$

3. Photon sphere. Circular orbits occur where the effective potential V is a minimum (stable orbit) or a
maximum (unstable orbit). At what radius can photons orbit in circles? Is the orbit stable or unstable?

4. Photon energy. The photon energy −v_t, equation (7.41a), is normalized to one as measured by an
observer at rest at infinity. Show that the energy of the photon measured by an observer on a trajectory
with energy E per unit mass and angular momentum per unit mass L, relative to unit energy at infinity, is

$$ \omega_{\rm obs} = - u_\mu v^\mu = \frac{1}{\Delta} \left\{ E \mp \left[ E^2 - \left( 1 + \frac{L^2}{r^2} \right) \Delta \right]^{1/2} \left( 1 - \frac{J^2 \Delta}{r^2} \right)^{1/2} \right\} - \frac{L J}{r^2} , \tag{7.44} $$

where the upper (lower) sign applies when u^r v^r is positive (negative), that is, when the observer and photon are moving
radially in the same (opposite) directions.
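A short numerical illustration of the photon effective potential (7.43) may help with part 3. This is a sketch under the assumption G = c = 1; the helper names `argmax_V` and `critical_J` are hypothetical. It locates the peak of V by ternary search and computes the value of J for which the peak just reaches V = 1, so that v^r = 0 there according to (7.42).

```python
import math

def V(r, J, M=1.0):
    """Photon effective potential (7.43): J^2 * Delta / r^2, with Delta = 1 - 2M/r."""
    return J * J * (1.0 - 2.0 * M / r) / (r * r)

def argmax_V(J=1.0, M=1.0, lo=2.0, hi=10.0, iters=200):
    """Ternary search for the maximum of V on [lo*M, hi*M], where V has a single peak."""
    a, b = lo * M, hi * M
    for _ in range(iters):
        m1 = a + (b - a) / 3.0
        m2 = b - (b - a) / 3.0
        if V(m1, J, M) < V(m2, J, M):
            a = m1  # the peak lies to the right of m1
        else:
            b = m2  # the peak lies to the left of m2
    return 0.5 * (a + b)

def critical_J(M=1.0):
    """Value of J at which the peak of V equals 1, evaluated at r = 3M."""
    r = 3.0 * M
    return r / math.sqrt(1.0 - 2.0 * M / r)
```

The search converges on r = 3M independently of J, and the extremum is a maximum of V.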
Exercise 7.8. Geodesics in the Schwarzschild geometry in 3 or more dimensions. Standard general
relativity breaks down in N = 2 spacetime dimensions, §11.19, and there are no black holes in N = 2
spacetime dimensions in the closest approximation to general relativity, Exercise 11.9 (there are however
black holes in N = 2 spacetime dimensions in extensions of general relativity). The Schwarzschild metric in
N ≥ 3 spacetime dimensions is

$$ ds^2 = - \Delta(r)\, dt^2 + \frac{1}{\Delta(r)}\, dr^2 + r^2 do^2 , \tag{7.45} $$

where do² is the metric of a unit (N−2)-sphere, and Δ(r) is the horizon function

$$ \Delta(r) = 1 - \frac{2M}{r^{N-3}} . \tag{7.46} $$

What happens when N = 3? What happens when N ≥ 5? Argue that equations (7.35)-(7.37) hold, with Δ
in the effective potential U, equation (7.37), being given by equation (7.46).
Solution. For N = 3, the horizon function (7.46) is constant, Δ = 1 − 2M. For N = 3, a coordinate
transformation to coordinates t′ = t√Δ and r′ = r/√Δ brings the Schwarzschild line-element (7.45) to

$$ ds^2 = - dt'^2 + dr'^2 + r'^2 \Delta\, d\phi^2 , \tag{7.47} $$

which is the metric of a cone, with angle 2π√Δ around a circumference. The spacetime looks flat except for
a conical vertex at r′ = 0. A mass M bends geodesics around it, but there are no bound orbits.

The condition for a circular orbit is that the effective potential be an extremum, dU/dr = 0. The boundary
between stable and unstable circular orbits occurs when the potential is a double extremum, dU/dr =
d²U/dr² = 0. With r_s defined by r_s^{N−3} = 2M, the boundary between stable and unstable circular orbits occurs at

$$ \frac{r_c}{r_s} = \left( \frac{N - 1}{5 - N} \right)^{1/(N-3)} , \quad \frac{L_c}{r_s} = \left( \frac{N - 1}{5 - N} \right)^{(5-N)/[2(N-3)]} , \tag{7.48} $$

which has real finite solutions only for 2 ≤ N ≤ 4. For N = 2, equations (7.48) do not apply. For N = 3,
equations (7.48) give r_c/r_s = e and L_c/r_s = e (where e is the exponential); but these values are really valid
not for N = 3, but rather for values of N infinitesimally close to but not equal to 3.

For N ≥ 5, there are no stable circular orbits. For N ≥ 5, the only circular orbits are unstable, which
occur for L > 1 if N = 5 or L > 0 if N ≥ 6. Besides unstable circular orbits, there are unbound geodesics,
and geodesics that fall into the black hole. The case N = 4 is the only dimension for which stable circular
orbits exist.
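The marginal-orbit formulae (7.48) can be evaluated directly. The following is a minimal sketch, assuming units in which r_s = 1 (with r_s defined so that the horizon function is Δ = 1 − r^{−(N−3)}) and treating N as a real number; the function names are illustrative.

```python
import math

def marginal_orbit(N):
    """Return (r_c/r_s, L_c/r_s) from (7.48). Valid for real N with 1 < N < 5 and N != 3."""
    base = (N - 1.0) / (5.0 - N)
    n = N - 3.0
    return base ** (1.0 / n), base ** ((5.0 - N) / (2.0 * n))

def effective_potential(r, L, N):
    """U = (1 + L^2/r^2) * Delta with Delta = 1 - r**-(N-3), in units r_s = 1."""
    return (1.0 + L * L / (r * r)) * (1.0 - r ** (-(N - 3.0)))
```

For N = 4 this returns r_c = 3 r_s = 6M and L_c = √3 r_s, the familiar innermost stable circular orbit, and for N slightly above 3 both values approach e, as stated in the solution.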


Exercise 7.9. General relativistic precession of Mercury. 
1. Conclude from Exercise 7.6 that the 4-velocity u^μ = dx^μ/dτ of a massive particle on a geodesic in the
equatorial plane of the Schwarzschild geometry satisfies

$$ u^t = \frac{E}{\Delta} , \quad u^\phi = \frac{L}{r^2} , \quad u^r = \pm \left[ E^2 - \left( 1 + \frac{L^2}{r^2} \right) \Delta \right]^{1/2} . \tag{7.49} $$

2. Letting x ≡ 1/r, show that

$$ \phi = \int \frac{L\, dx}{[ ( E^2 - 1 ) + 2 M x - L^2 x^2 + 2 M L^2 x^3 ]^{1/2}} . \tag{7.50} $$

[Hint: This is a straightforward application of equations (7.49). Do not try to solve this integral; leave
it as given above.]

3. Suppose that the orbit varies between a known periapsis r_- and apoapsis r_+. Define x_- ≡ 1/r_- and
x_+ ≡ 1/r_+ (note that r_- < r_+ so x_- > x_+). Argue that equation (7.50) must take the form

$$ \phi = \int \frac{dx}{[ ( x - x_+ )( x_- - x )( a - 2 M x ) ]^{1/2}} , \tag{7.51} $$

where

$$ a = 1 - 2 M ( x_- + x_+ ) . \tag{7.52} $$

[Hint: This is not hard, but there are two things to do. First, you have to argue that, given the assumption
that the orbit is a bounded stable orbit, there must be 3 real roots to the cubic, which must be ordered
as 0 < x_+ < x_- < a/2M < ∞. Second, you should compare the coefficients of x² and x³ in the cubic
in the integrands of (7.50) and (7.51).]

4. By the transformation

$$ x = x_+ + ( x_- - x_+ )\, y \tag{7.53} $$

bring the integral (7.51) to the form

$$ \phi = \int \frac{dy}{[ y ( 1 - y )( q - p y ) ]^{1/2}} , \tag{7.54} $$

where

$$ p = 2 M ( x_- - x_+ ) , \quad q = 1 - 2 M ( x_- + 2 x_+ ) . \tag{7.55} $$

5. Argue that the angle φ integrated around a full period, from apoapsis at y = 0 to periapsis at y = 1
and back, is

$$ \phi = \frac{4}{q^{1/2}}\, K(p/q) , \tag{7.56} $$

where K(m) is the complete elliptic integral of the first kind, one definition of which is

$$ K(m) \equiv \frac{1}{2} \int_0^1 \frac{dy}{[ y ( 1 - y )( 1 - m y ) ]^{1/2}} . \tag{7.57} $$

6. The Taylor series expansion of the complete elliptic integral is

$$ K(m) = \frac{\pi}{2} \left( 1 + \frac{m}{4} + \frac{9 m^2}{64} + \cdots \right) . \tag{7.58} $$

Argue that to linear order in mass M, the angle around a full period is

$$ \phi = 2\pi + 3 \pi M ( x_- + x_+ ) . \tag{7.59} $$

7. Calculate the predicted precession of the perihelion of the orbit of Mercury, expressing your answer in
arcseconds per century. Google the perihelion and aphelion distances of Mercury, and its orbital period.
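The chain of results (7.55)-(7.59) can be sketched numerically as follows. This is an illustration rather than the author's solution: the complete elliptic integral is evaluated with the arithmetic-geometric mean, a standard identity swapped in here for convenience, and the orbital data for Mercury and the value of G M_⊙/c² are assumed inputs that do not come from the text.

```python
import math

def complete_elliptic_K(m):
    """K(m) = pi / (2 * AGM(1, sqrt(1 - m))) for 0 <= m < 1, in the convention of (7.57)."""
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def angle_per_orbit(M, r_minus, r_plus):
    """Angle (7.56) swept over one full radial period; M in the same length units as r."""
    x_minus, x_plus = 1.0 / r_minus, 1.0 / r_plus
    p = 2.0 * M * (x_minus - x_plus)
    q = 1.0 - 2.0 * M * (x_minus + 2.0 * x_plus)
    return 4.0 / math.sqrt(q) * complete_elliptic_K(p / q)

def precession_linear(M, r_minus, r_plus):
    """Excess over 2*pi to first order in M, from (7.59)."""
    return 3.0 * math.pi * M * (1.0 / r_minus + 1.0 / r_plus)

# Assumed input values (not from the text).
M_SUN_METRES = 1476.625   # G*M_sun/c^2 in metres
R_PERIHELION = 4.6001e10  # Mercury perihelion distance in metres
R_APHELION = 6.9817e10    # Mercury aphelion distance in metres
PERIOD_DAYS = 87.969      # Mercury orbital period in days

def mercury_arcsec_per_century():
    """Perihelion advance in arcseconds per Julian century, using the linear formula (7.59)."""
    per_orbit = precession_linear(M_SUN_METRES, R_PERIHELION, R_APHELION)
    orbits_per_century = 36525.0 / PERIOD_DAYS
    return math.degrees(per_orbit * orbits_per_century) * 3600.0
```

With these inputs the result comes out close to 43 arcseconds per century, and `angle_per_orbit(...) - 2 * math.pi` agrees with `precession_linear(...)` to high accuracy because M/r is tiny for Mercury.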


Exercise 7.10. A body cannot remain rigid as it approaches the Schwarzschild singularity. You 
have already found from Exercise 7.6 that the azimuthal angle φ at radius r of a particle of rest mass m on
a geodesic with energy E and azimuthal angular momentum L in the equatorial plane of the Schwarzschild
geometry satisfies

$$ \phi = \int \frac{L\, dr}{\sqrt{ ( E^2 - m^2 \Delta )\, r^4 - L^2 r^2 \Delta }} . \tag{7.60} $$

1. Define J ≡ L/E to be the angular momentum per unit energy. Argue that for photons, which are
massless,

$$ \phi = \int \frac{J\, dr}{\sqrt{ r^4 - J^2 r^2 \Delta }} . \tag{7.61} $$

2. Argue that inside the horizon (Δ < 0) the largest possible rate of change dφ/dr of the azimuthal angle
φ with respect to radius r occurs for J → ∞.

Figure 7.3 The arrowed lines, which are initially parallel, represent the worldtube of a body that remains as rigid as
possible (having constant cross-sectional radius h) as it falls to the singularity at the centre of a Schwarzschild black
hole. (The blow-up at right shows some details.) The dashed (purple) line shows geodesics with the maximum possible
angular motion inside the horizon, namely null geodesics with infinite angular momentum per unit energy, J = ∞.
Since the walls of the infalling body cannot exceed the speed of light, their horizontal motion near the singularity is
bounded by that of J = ∞ null geodesics, as illustrated. The diagram gives the impression that the different (left and
right) sides of the worldtube encounter each other at the singularity, but this is false. The left side of the tube can
send a signal to the right side only as long as the two sides are connected by a J = ∞ null geodesic. The dashed line,
marked with filled dots where the signal is emitted by the left side and observed by the right side, is the last such
geodesic connecting the two sides: inside this dashed line the left side can no longer influence causally the right side.

3. Show that a null geodesic with J = ∞ in a Schwarzschild black hole satisfies

$$ r = r_s \sin^2 ( \phi / 2 ) . \tag{7.62} $$

Equation (7.62) is the equation of a cardioid, illustrated by the dashed purple lines in Figure 7.3.

4. Parameterize the J = ∞ null geodesic satisfying equation (7.62) by {x, y} = {r cos φ, r sin φ}. Show that

$$ \frac{dy}{dx} = \tan ( 3 \phi / 2 ) . \tag{7.63} $$

Sketch the situation geometrically. Conclude that the radius h of a cylinder whose centre falls radially
must satisfy h < r sin(φ/2) in order that the walls of the cylinder not exceed the speed of light.
Equivalently, conclude that a cylinder of radius h can remain rigid only down to a radius r satisfying

$$ h < \frac{r^{3/2}}{r_s^{1/2}} . \tag{7.64} $$

5. Do the parts of a body that falls into a Schwarzschild black hole encounter each other at the singularity?

Solution. See Figure 7.3. The answer to part 5 is no, parts of a body that fall into a Schwarzschild black hole
do not encounter each other at the singularity. Indeed, as illustrated in Figure 7.3, parts of a body cease to
be in causal contact (cease to be able to influence each other) once they are close enough to the singularity.
From the perspective of an infaller inside the horizon, the closest they ever see any point an angle φ away is
at the edge of their past light cone, along the J = ∞ null geodesic.
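Parts 3 and 4 lend themselves to a quick numerical check. The sketch below, with hypothetical function names and units r_s = 1 by default, evaluates the cardioid (7.62), estimates the slope dy/dx by a centred finite difference for comparison with (7.63), and evaluates the rigidity bound (7.64).

```python
import math

def cardioid_r(phi, rs=1.0):
    """Radius along the J = infinity null geodesic, equation (7.62)."""
    return rs * math.sin(0.5 * phi) ** 2

def slope_dy_dx(phi, rs=1.0, h=1e-6):
    """Finite-difference slope dy/dx of the curve {x, y} = {r cos(phi), r sin(phi)}."""
    def xy(p):
        r = cardioid_r(p, rs)
        return r * math.cos(p), r * math.sin(p)
    (x1, y1), (x2, y2) = xy(phi - h), xy(phi + h)
    return (y2 - y1) / (x2 - x1)

def max_rigid_radius(r, rs=1.0):
    """Upper bound (7.64) on the cross-sectional radius h at radius r."""
    return r ** 1.5 / math.sqrt(rs)
```

At any angle φ away from points where dx/dφ vanishes, `slope_dy_dx(phi)` should agree with `math.tan(1.5 * phi)`, and `max_rigid_radius(cardioid_r(phi))` reproduces r sin(φ/2), which is how (7.64) follows from the bound h < r sin(φ/2).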


Exercise 7.11. Causal distance between infallers near the singularity. The proper distance between 
two infallers who fall along different radial directions goes to zero at the singularity, but the causal distance 
between the two, the shortest causal path joining them, does not go to zero. The shortest causal path is the 
red line illustrated in the right panel of Figure 7.1, a pair of null geodesics each with the maximum possible
angular momentum, J = ∞. A measure of distance along a null geodesic is the affine distance λ. Calculate
the affine distance along the shortest causal path between infallers approaching the singularity.

Solution. Normalized to a frame at rest at infinity, the affine distance λ along a null geodesic is obtained
by integrating dλ/dφ = r²/J, equation (7.41b), or equivalently dr/dλ = v^r from equation (7.42), giving

$$ \lambda = \frac{1}{J} \int r^2\, d\phi = \int \frac{dr}{\sqrt{ 1 - J^2 \Delta / r^2 }} . \tag{7.65} $$

Normalized to the frame of an observer, the affine distance λ_obs is

$$ \lambda_{\rm obs} = \omega_{\rm obs} \lambda , \tag{7.66} $$

where ω_obs is the observed energy (7.44) of the photon relative to that at infinity. The observed affine distance
to an object coincides with proper distance to it measured by the observer in their immediate locally inertial
vicinity. The shortest causal path joining infallers near the singularity is realised by a pair of photons emitted
in opposite directions with maximum angular momentum, J = ∞, from a point half way (in angle) between
the infallers, illustrated by the red line in the right panel of Figure 7.1. The causal path has two symmetrically
equal parts, each following the path of a cardioid, equation (7.62). If the angular separation between the two
infallers near the singularity is 2φ, then the observed affine distance along the shortest causal path is 2λ_obs,
twice the affine distance along each individual null segment,

$$ \lambda_{\rm obs} = \frac{\omega_{\rm obs}}{J} \int_0^\phi r^2\, d\phi = \frac{\omega_{\rm obs} r_s^2}{J} \int_0^\phi \sin^4 ( \phi / 2 )\, d\phi = \frac{\omega_{\rm obs} r_s^2}{J} \left( \tfrac{3}{8} \phi - \tfrac{1}{2} \sin \phi + \tfrac{1}{16} \sin 2\phi \right) . \tag{7.67} $$

The φ-dependent factor in parentheses on the right hand side of equation (7.67) is ≈ φ⁵/80 at small
separations φ. The observed energy ω_obs depends on the position and motion of the observer. Radially-falling
observers (L = 0) near the singularity watching J = ∞ null geodesics see photon energy, from equation (7.44),

$$ \omega_{\rm obs} = \frac{J}{r_{\rm obs}} , \tag{7.68} $$

so the factor on the right hand side of the expression (7.67) for the observed affine distance is

$$ \frac{\omega_{\rm obs} r_s^2}{J} = \frac{r_s^2}{r_{\rm obs}} , \tag{7.69} $$

which diverges at the singularity, r_obs → 0. The divergence is a symptom of the failure of general relativity,
the cessation of the existence of locally inertial frames, at the singularity. Notwithstanding the divergence,
the robust conclusion is that the causal distance between two infallers does not go to zero at the singular
surface.
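The angular integral in (7.67) and its small-angle limit can be verified numerically. This is a minimal sketch with illustrative function names, assuming units r_s = 1 by default: the closed form is compared against a composite Simpson estimate, and the last function combines (7.67) with the prefactor (7.69).

```python
import math

def angular_factor(phi):
    """Closed form of the integral of sin^4(u/2) du from 0 to phi, as in (7.67)."""
    return 3.0 * phi / 8.0 - 0.5 * math.sin(phi) + math.sin(2.0 * phi) / 16.0

def angular_factor_numeric(phi, n=2000):
    """Composite Simpson estimate of the same integral (n must be even)."""
    h = phi / n
    total = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.sin(0.5 * i * h) ** 4
    return total * h / 3.0

def observed_affine_distance(phi, r_obs, rs=1.0):
    """2 * lambda_obs for radial infallers separated by angle 2*phi, using (7.67) and (7.69)."""
    return 2.0 * (rs * rs / r_obs) * angular_factor(phi)
```

For small φ the closed form approaches φ⁵/80, and for fixed φ the observed affine distance grows without bound as r_obs → 0, which is the divergence discussed in the solution.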


Exercise 7.12. Maximum transverse velocity of a light signal inside the horizon. Again consider 
two infallers who free-fall radially along radial paths at different angular positions. The maximum transverse 
velocity with which they can send signals to each other is, once again, along J = ∞ null geodesics. Show
that this maximum transverse velocity is

$$ r \frac{d\phi}{dt_{\rm ff}} = \left( 1 - \frac{r}{r_s} \right)^{1/2} . \tag{7.70} $$

The maximum transverse velocity is always less than the speed of light, but tends to the speed of light at
the singularity.

Solution. The relation between the radius r and angle φ along a J = ∞ null geodesic is given by equation (7.62).
The relation between radius r and proper time t_ff for a radial free-faller follows from dr/dt_ff = β
in the Gullstrand-Painlevé metric (7.27).


7.13 Embedding diagram 


An embedding diagram is a visual aid to understanding geometry. It is a depiction of a lower dimensional 
geometry in a higher dimension. A classic example is the illustration of the geometry of a 2-sphere embedded 
in 3-dimensional space, Figure 2.2. 
Figure 7.4 Embedding diagram of the Schwarzschild geometry. The horizontal axis is the circumferential radius r, in
units of the Schwarzschild radius. The 2-dimensional surface represents the 3-dimensional
Schwarzschild geometry at a fixed instant of Schwarzschild time t. Each circle represents a sphere, of proper circumference
2πr, as measured by observers at rest in the geometry. The proper radial distance measured by observers at rest
is stretched in the radial direction, as shown in the diagram. The stretching is infinite at the horizon, so the spatial
geometry there looks like a vertical cliff. Radial lines in the Schwarzschild geometry are spacelike outside the horizon,
but timelike inside the horizon.


Figure 7.4 shows an embedding diagram of the spatial Schwarzschild geometry at a fixed instant of Schwarzschild
time t. To the polar coordinates r, θ, φ of the 3D Schwarzschild spatial geometry, adjoin a fourth spatial
coordinate w. The metric of 4D Euclidean space in the coordinates w, r, θ, φ, is

$$ dl^2 = dw^2 + dr^2 + r^2 do^2 . \tag{7.71} $$

The spatial Schwarzschild geometry is represented by a 3D surface embedded in the 4D Euclidean geometry,
such that the proper distance dl in the radial direction satisfies equation (7.17), that is

$$ dl^2 = dw^2 + dr^2 = \frac{dr^2}{1 - r_s / r} . \tag{7.72} $$

Equation (7.72) rearranges to

$$ dw = \frac{dr}{\sqrt{ r / r_s - 1 }} , \tag{7.73} $$

which integrates to

$$ w = 2 r_s \sqrt{ \frac{r}{r_s} - 1 } . \tag{7.74} $$

The embedded Schwarzschild surface has the shape of a square root, infinitely steep at the horizon r = r_s,
as illustrated by Figure 7.4.

Inside the horizon, proper radial distances change to being timelike, dl² < 0, equation (7.17). Here the
Schwarzschild geometry at fixed Schwarzschild time t (which is a spacelike coordinate inside the horizon)
can be embedded in a 4D Minkowski space in which the fourth coordinate w is timelike,

$$ dl^2 = - dw^2 + dr^2 + r^2 do^2 . \tag{7.75} $$

The embedded surface inside the horizon satisfies

$$ w = - 2 r_s \sqrt{ 1 - \frac{r}{r_s} } , \tag{7.76} $$

with a minus sign chosen so that the coordinate w is negative inside the horizon, whereas it is positive
outside the horizon. The two embeddings (7.74) and (7.76) can be patched together at the horizon (though
not doubly differentiably), as illustrated in Figure 7.4.
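The embedding construction can be illustrated with a few lines of code. The sketch below uses hypothetical function names and units r_s = 1 by default. It evaluates the embedding height from (7.74) and (7.76), then recovers dl²/dr² from a finite-difference dw/dr for comparison with 1/(1 − r_s/r) in (7.72).

```python
import math

def embed_w(r, rs=1.0):
    """Embedding coordinate w: (7.74) outside the horizon, (7.76) inside."""
    if r >= rs:
        return 2.0 * rs * math.sqrt(r / rs - 1.0)
    return -2.0 * rs * math.sqrt(1.0 - r / rs)

def dl2_over_dr2(r, rs=1.0, h=1e-6):
    """dl^2/dr^2 implied by the embedding: 1 + (dw/dr)^2 outside (w spacelike),
    1 - (dw/dr)^2 inside (w timelike). Not valid within h of the horizon."""
    dw_dr = (embed_w(r + h, rs) - embed_w(r - h, rs)) / (2.0 * h)
    sign = 1.0 if r > rs else -1.0
    return 1.0 + sign * dw_dr ** 2
```

Outside the horizon the result is positive and grows without bound as r → r_s, the "vertical cliff" of Figure 7.4; inside it is negative, reflecting that radial distances there are timelike.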

It should be emphasized that the embedding diagram of the Schwarzschild geometry at fixed Schwarzschild 
time t has a limited physical meaning. Fixing the time t means choosing a certain hypersurface through the 
geometry. Other choices of hypersurface will yield different embedding diagrams. For example, the Gullstrand-Painlevé
metric (7.27) is spatially flat at fixed free-fall time t_ff, so in that case the embedding diagram would
simply illustrate flat space, with no funny business at the horizon.


7.14 Schwarzschild spacetime diagram 


In general relativity as in special relativity, a spacetime diagram is a plot of space versus time. 

Figure 7.5 shows a spacetime diagram of the Schwarzschild geometry. In this diagram, Schwarzschild time 
t increases vertically upward, while circumferential radius r increases horizontally. 

The more or less diagonal lines in Figure 7.5 are outgoing and infalling radial null geodesics. The radial 
null geodesics are not at 45°, as they would be in a special relativistic spacetime diagram. In Schwarzschild 
coordinates, light rays that fall radially (dθ = dφ = 0) inward or outward follow null geodesics

$$ ds^2 = - \left( 1 - \frac{r_s}{r} \right) dt^2 + \left( 1 - \frac{r_s}{r} \right)^{-1} dr^2 = 0 . \tag{7.77} $$

Radial null geodesics thus follow

$$ \frac{dr}{dt} = \pm \left( 1 - \frac{r_s}{r} \right) , \tag{7.78} $$

in which the ± sign is + for outgoing, − for infalling rays. Equation (7.78) shows that dr/dt → 0 as
r → r_s, suggesting that null rays, whether infalling or outgoing, never cross the horizon. In the Schwarzschild
spacetime diagram 7.5, null geodesics asymptote to the horizon, but never actually cross it. This feature of
Schwarzschild coordinates was first noticed by Droste (1916), and contributed to the historical misconception
that black holes stopped at their horizons. The failure of geodesics to cross the horizon is an artefact of
Schwarzschild’s choice of coordinates, which are adapted to observers at rest, whereas no locally inertial
frame can remain at rest at the horizon.

Figure 7.5 Spacetime diagram of the Schwarzschild geometry, in Schwarzschild coordinates. The horizontal axis is the
circumferential radius r, while the vertical axis is Schwarzschild time t. The horizon (pink) is at one Schwarzschild
radius, r = r_s. The singularity (cyan) is at zero radius, r = 0. The more or less diagonal lines (black) are outgoing
and infalling null geodesics. The outgoing and infalling null geodesics appear not to cross the horizon, but this is an
artefact of the Schwarzschild coordinate system.


7.15 Gullstrand-Painlevé spacetime diagram 


Figure 7.6 shows a spacetime diagram of the Schwarzschild geometry in Gullstrand-Painlevé coordinates t_ff
and r in place of Schwarzschild coordinates t and r. As the spacetime diagram shows, in Gullstrand-Painlevé
coordinates infalling light rays cross the horizon. Unfortunately, neither Gullstrand nor Painlevé, nor anyone
else at that time, grasped the physical significance of their metric.

Figure 7.6 Gullstrand-Painlevé, or free-fall, spacetime diagram, in units r_s = c = 1. In this spacetime diagram the
time coordinate is the Gullstrand-Painlevé time t_ff, which is the proper time of observers who free-fall radially from
zero velocity at infinity. The radial coordinate r is the circumferential radius, and the horizon and singularity are at
r = r_s and r = 0, as in the Schwarzschild spacetime diagram, Figure 7.5. In contrast to the spacetime diagram in
Schwarzschild coordinates, in Gullstrand-Painlevé coordinates infalling light rays do cross the horizon.


7.16 Eddington-Finkelstein spacetime diagram 


In 1958, David Finkelstein (1958) carried out an elementary transformation of the time coordinate which 
seemed to show that infalling light rays could indeed pass through the horizon. It turned out that Eddington 
had already discovered the transformation in 1924 (Eddington, 1924), though at that time the physical 
implications were not grasped. Again, it is striking that the mathematics was in place long before physical 
understanding emerged. 

In Schwarzschild coordinates, radially outgoing or infalling light rays follow equation (7.78). Equation (7.78) 
integrates to 


$$ t = \pm ( r + r_s \ln | r - r_s | ) , \tag{7.79} $$

which shows that Schwarzschild time t approaches ±∞ logarithmically as null rays approach the horizon.
Finkelstein defined his time coordinate t_F by

$$ t_F \equiv t + r_s \ln | r - r_s | , \tag{7.80} $$

which has the property that infalling null rays follow

$$ t_F + r = \text{constant} . \tag{7.81} $$

Figure 7.7 Finkelstein spacetime diagram, in units r_s = c = 1. Here the time coordinate is taken to be the Finkelstein
time coordinate t_F, equation (7.80). The Finkelstein time coordinate t_F is constructed so that radially infalling light
rays are at 45°.


In other words, on a spacetime diagram in Finkelstein coordinates, Figure 7.7, radially infalling light rays 
move at 45°, the same as in a special relativistic spacetime diagram. 
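A small numerical sketch makes the role of the Finkelstein time coordinate concrete. It assumes units r_s = 1 by default and uses illustrative function names: the first function traces Schwarzschild time along a radial null ray from (7.79), with an arbitrary additive constant, and the second applies the definition (7.80).

```python
import math

def t_null(r, sign, rs=1.0, const=0.0):
    """Schwarzschild time along a radial null ray, (7.79): sign = +1 outgoing, -1 infalling."""
    return sign * (r + rs * math.log(abs(r - rs))) + const

def t_finkelstein(t, r, rs=1.0):
    """Finkelstein time coordinate, equation (7.80)."""
    return t + rs * math.log(abs(r - rs))
```

Along an infalling ray, `t_finkelstein(t_null(r, -1), r) + r` stays constant on both sides of the horizon, which is (7.81), even though `t_null(r, -1)` itself grows without bound as r approaches r_s. Both functions are undefined at r = r_s exactly.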


7.17 Kruskal-Szekeres spacetime diagram 


After Finkelstein had transformed coordinates so that radially infalling light rays moved at 45° in a spacetime
diagram, it was natural to look for coordinates in which outgoing as well as infalling light rays are at 45°.
Kruskal and Szekeres independently provided such a transformation in 1960 (Kruskal, 1960; Szekeres, 1960).
Define the tortoise, or Regge-Wheeler (Regge and Wheeler, 1957), coordinate r* by

$$ r^* \equiv \int \frac{dr}{1 - 2M/r} = r + 2M \ln | r - 2M | . \tag{7.82} $$

Then radially infalling and outgoing null rays follow

$$ \begin{aligned} r^* + t &= \text{constant} \quad \text{infalling} , \\ r^* - t &= \text{constant} \quad \text{outgoing} . \end{aligned} \tag{7.83} $$

In a spacetime diagram in coordinates t and r*, infalling and outgoing light rays are indeed at 45°. Unfortunately
the metric in these coordinates is still singular at the horizon r = 2M:

$$ ds^2 = \left( 1 - \frac{2M}{r} \right) ( - dt^2 + dr^{*2} ) + r^2 do^2 . \tag{7.84} $$

Figure 7.8 Kruskal-Szekeres spacetime diagram, in units r_s = c = 1. Kruskal-Szekeres coordinates are arranged such
that not only infalling, but also outgoing null rays move at 45° on the spacetime diagram. The Kruskal-Szekeres
spacetime diagram reveals the causal structure of the Schwarzschild geometry. The singularity (cyan) at r = 0, at
the upper edge of the spacetime diagram, is revealed to be a spacelike surface. Besides the usual horizon (pink),
there is an antihorizon (red), which was not apparent in Schwarzschild or Finkelstein coordinates. In the Kruskal-Szekeres
spacetime diagram, lines of constant circumferential radius r (blue) are hyperboloids, while lines of constant
Schwarzschild time t (violet) are straight lines passing through the origin, the same as in the spacetime wheel,
Figure 1.14, or as in Rindler space. Contours of constant Schwarzschild time t (violet) are spaced uniformly at
intervals of 1 (in units r_s = c = 1), and similarly infalling and outgoing null rays (black) are spaced uniformly by 1,
while lines of constant circumferential radius r (blue) are drawn spaced uniformly by 1/4.
r 


The singularity at the horizon can be eliminated by the following transformation into Kruskal-Szekeres
coordinates t_K and r_K:

$$ \begin{aligned} r_K + t_K &= 2M \exp \left( \frac{r^* + t}{4M} \right) , \\ r_K - t_K &= \pm 2M \exp \left( \frac{r^* - t}{4M} \right) , \end{aligned} \tag{7.85} $$

where the ± sign in the last equation is + outside the horizon, − inside the horizon. The Kruskal-Szekeres
metric is

$$ ds^2 = \frac{8M}{r}\, e^{-r/2M} ( - dt_K^2 + dr_K^2 ) + r^2 do^2 , \tag{7.86} $$

which is non-singular at the horizon. The Schwarzschild radial coordinate r, which appears in the factors
(8M/r)e^{−r/2M} and r² in the Kruskal metric, is to be understood as an implicit function of the Kruskal
coordinates t_K and r_K.

Figure 7.9 From left to right, the Finkelstein spacetime diagram, Figure 7.7, morphs into the Kruskal-Szekeres spacetime
diagram, Figure 7.8. The morph illustrates how the antihorizon, or past horizon (red), emerges from the depths
of t = −∞. Like the horizon, the antihorizon is a null surface, thus appearing at 45° in the Kruskal-Szekeres spacetime
diagram.
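The transformation (7.85) can be sketched in code as follows. This is an illustration with hypothetical function names, and it makes one explicit assumption: the additive constant in the tortoise coordinate is chosen so that the argument of the logarithm is dimensionless, r* = r + 2M ln|r/2M − 1|, which coincides with (7.82) in units 2M = 1 and is the normalization for which the conformal factor in (7.86) is (8M/r)e^{−r/2M}.

```python
import math

def tortoise(r, M=1.0):
    """Tortoise coordinate r*, with the integration constant chosen (an assumption)
    so that the logarithm's argument is dimensionless."""
    return r + 2.0 * M * math.log(abs(r / (2.0 * M) - 1.0))

def kruskal(t, r, M=1.0):
    """Return (t_K, r_K) from (7.85), for r > 2M (Universe) or 0 < r < 2M (Black Hole)."""
    rstar = tortoise(r, M)
    plus = 2.0 * M * math.exp((rstar + t) / (4.0 * M))   # r_K + t_K
    minus = 2.0 * M * math.exp((rstar - t) / (4.0 * M))  # |r_K - t_K|
    if r < 2.0 * M:
        minus = -minus  # lower sign of (7.85) inside the horizon
    return 0.5 * (plus - minus), 0.5 * (plus + minus)
```

With this normalization, r_K² − t_K² = 4M² e^{r/2M}(r/2M − 1) depends on r alone, so curves of constant r are hyperbolae, and outside the horizon t_K/r_K = tanh(t/4M), so lines of constant t are straight lines through the origin, as described in the caption to Figure 7.8. The function is undefined at r = 2M exactly, where Schwarzschild time diverges.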


7.18 Antihorizon 


The Kruskal-Szekeres spacetime diagram reveals a new feature that was not apparent in Schwarzschild or 
Finkelstein coordinates. Dredged from the depths of t = −∞ appears a null line r_K + t_K = 0, Figure 7.9.
The null line is at radius r = 2M, but it does not correspond to the horizon that a person might fall into. 
The null line is called the antihorizon. 


7.19 Analytically extended Schwarzschild geometry 


The Schwarzschild geometry is analytic, and there is a unique analytic continuation of the geometry through 
the antihorizon. The extended geometry consists of two copies of the Schwarzschild geometry, glued along 
their antihorizons, as illustrated in the embedding diagram in Figure 7.10. The embedding diagram 7.10 
gives the impression of a static wormhole, but this is an artefact of everything being frozen at the horizon 
in Schwarzschild coordinates. 

Figure 7.11 shows the Kruskal spacetime diagram of the analytically extended Schwarzschild geometry.
Whereas the original Schwarzschild geometry showed an asymptotically flat region and a black hole region
separated by a horizon, the complete analytically extended Schwarzschild geometry shows two asymptotically
flat regions, together with a black hole and a white hole. Relativists typically label the regions I, II, III, and
IV, but I like to call them by name: “Universe,” “Black Hole,” “Parallel Universe,” and “White Hole.”

The white hole is a time-reversed version of the black hole. Whereas space falls inward faster than light
inside the black hole, space falls outward faster than light inside the white hole. In the Gullstrand-Painlevé
metric (7.27), the velocity β = ∓(2M/r)^{1/2} is negative for the black hole, positive for the white hole.

Figure 7.10 Embedding diagram of the analytically extended Schwarzschild geometry. The analytically extended
geometry is constructed by gluing together two copies of the Schwarzschild geometry along the antihorizon. The
extended geometry contains a Universe, a Parallel Universe, a Black Hole, and a White Hole.

Figure 7.11 Analytically extended Kruskal-Szekeres spacetime diagram, in units r_s = c = 1. The horizontal axis is
the Kruskal radial coordinate r_K. The analytically extended
horizon and antihorizon (crossing pink/red lines at 45°) divide the spacetime into 4 regions, a Universe region at right,
a Black Hole region bounded by the singularity at top, a Parallel Universe region at left, and a White Hole region
bounded by a singularity at bottom. The White Hole is a time-reversed version of the Black Hole.

The Kruskal diagram shows that the universe and the parallel universe are connected, but only by spacelike 
lines. This spacelike connection is called the Einstein-Rosen bridge, and constitutes a wormhole connecting 
the two universes. Because the connection is spacelike, it is impossible for a traveller to pass through this 
wormhole. The wormhole is said to be non-traversable. 

Figure 7.12 illustrates a sequence of embedding diagrams for spatial slices of the analytically extended 
Schwarzschild geometry. Although two travellers, one from the universe and one from the parallel universe, 
cannot travel to each other’s universe, they can meet, but only inside the black hole. Inside the black hole, 
they can talk to each other, and they can see light from each other’s universe. Sadly, the enlightenment is 
only temporary, because they are doomed soon to hit the central singularity. 

Figure 7.12 Sequence of embedding diagrams of spatial slices of the analytically extended Schwarzschild geometry, 
progressing in time from left to right. Two white holes merge, form an Einstein-Rosen bridge, then fall apart into 
two black holes. The wormhole formed by the Einstein-Rosen bridge is non-traversable. The (yellow) arrows indicate 
the direction in which an object can cross the horizon. At left, travellers in the two universes cannot fall into their 
respective white holes, because objects can cross the white hole horizons (red) only in the outward direction. The 
horizons cross in the middle diagram, without the arrows changing direction. After this point, travellers in the two 
universes can fall through their respective black hole horizons (pink) into the Einstein-Rosen bridge, and temporarily 
meet up with each other. Unfortunately, having fallen through the black hole horizons, they cannot exit, and are 
doomed to hit the singularity. The insets at top show the adopted spatial slicings on the Kruskal spacetime diagram. 
The adopted slicings are engineered to give the embedding diagrams an appealing look, and have no fundamental 
significance. 

Figure 7.13 Penrose spacetime diagram, in units $r_s = c = 1$. The Penrose coordinates $t_P$ and $r_P$ here are defined 
by equations (7.87) and (7.88). Lines of constant Schwarzschild time $t$ (violet), and infalling and outgoing null lines 
(black) are spaced uniformly at intervals of 1 (units $r_s = c = 1$), while lines of constant circumferential radius $r$ (blue) 
are spaced uniformly in the tortoise coordinate $r^*$, equation (7.82), so that the intersections of $t$ and $r$ lines are also 
intersections of infalling and outgoing null lines. 


It should be emphasized that the white hole and the wormhole in the Schwarzschild geometry are a 
mathematical construction with, as far as anyone knows, no relevance to reality. Nevertheless it is intriguing 
that such bizarre objects emerge already in the simplest general relativistic solution for a black hole. 


7.20 Penrose diagrams 


Roger Penrose, as so often, had a novel take on the business of spacetime diagrams (Penrose, 2011). Penrose 
conceived that the primary purpose of a spacetime diagram should be to portray the causal structure of 
the spacetime, and that the specific choice of coordinates was largely irrelevant. After all, general relativity 
allows arbitrary choices of coordinates. 

In addition to requiring that light rays be at 45°, Penrose wanted to bring regions at infinity (in time or 
space) to a finite position on the spacetime diagram, so that the entire spacetime could be seen at once. Such 
diagrams are called Penrose diagrams, or conformal diagrams. 

Figure 7.14 From left to right, the Kruskal-Szekeres spacetime diagram, Figure 7.8, morphs into the Penrose spacetime 
diagram, Figure 7.13. 

Figure 7.15 Penrose spacetime diagram of the analytically extended Schwarzschild geometry. This is the analytically 
extended version of Figure 7.13. 

Penrose diagrams are bona-fide spacetime diagrams. Penrose time and space coordinates $t_P$ and $r_P$ can 
be defined by any conformal transformation of Kruskal-Szekeres coordinates 

$$ r_P \pm t_P = f(r_K \pm t_K) \qquad (7.87) $$

for which $f(z)$ is finite as $z \to \pm\infty$. The transformation (7.87) brings spatial and temporal infinity to finite 
values of the coordinates, while keeping infalling and outgoing light rays at 45° in the spacetime diagram. 
It is common to draw a Penrose diagram with the singularity horizontal, which can be accomplished by 
choosing the function $f(z)$ to be odd, $f(-z) = -f(z)$. Figure 7.13 shows a spacetime diagram in Penrose 
coordinates with $f(z)$ set to 

$$ f(z) = \frac{2}{\pi} \, \mathrm{atan}\, z . \qquad (7.88) $$

Figure 7.16 Penrose diagram of the Schwarzschild geometry, labelled with the Universe and Black Hole regions, and 
their various boundaries. The (blue) line at less than 45° from vertical is a possible trajectory of a person who falls 
through the horizon from the Universe into the Black Hole. Once inside the horizon, the infaller cannot avoid the 
Singularity. 


Figure 7.14 illustrates a morph of the Kruskal-Szekeres spacetime diagram, Figure 7.8, into the Penrose 
spacetime diagram, Figure 7.13. 
Figure 7.15 illustrates the Penrose diagram of the analytically extended Schwarzschild geometry. 
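
The compactification is easy to try numerically. The following Python sketch assumes the compactifying function is $f(z) = (2/\pi)\,\mathrm{atan}\,z$, one reading of equation (7.88); any smooth, odd, bounded, monotonic $f$ would serve equally well. The names `f` and `kruskal_to_penrose` are illustrative, not taken from the text. The sketch maps Kruskal-Szekeres coordinates to Penrose coordinates through $r_P \pm t_P = f(r_K \pm t_K)$, and follows an outgoing null ray, along which $r_K - t_K$ is constant.

```python
import math

def f(z):
    """Compactifying function, assumed here to be (2/pi) atan z."""
    return (2.0 / math.pi) * math.atan(z)

def kruskal_to_penrose(t_k, r_k):
    """Map Kruskal-Szekeres (t_K, r_K) to Penrose (t_P, r_P)
    using r_P + t_P = f(r_K + t_K) and r_P - t_P = f(r_K - t_K)."""
    plus = f(r_k + t_k)    # r_P + t_P
    minus = f(r_k - t_k)   # r_P - t_P
    return 0.5 * (plus - minus), 0.5 * (plus + minus)

# Points along an outgoing null ray r_K - t_K = 0.3 in the Kruskal diagram.
points = [kruskal_to_penrose(t, t + 0.3) for t in (-5.0, 0.0, 5.0, 50.0)]

# If the ray is still at 45 degrees, r_P - t_P is the same at every point.
offsets = [r_p - t_p for t_p, r_p in points]
print(offsets)

# A point in the far future of the Kruskal diagram lands at finite t_P.
print(kruskal_to_penrose(1e9, 0.0))
```

Because both $r_P + t_P$ and $r_P - t_P$ lie strictly between $-1$ and $1$ with this choice of $f$, every point of the Kruskal diagram lands inside the diamond $|t_P| + |r_P| < 1$, which is the sense in which infinity is brought to a finite position.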


7.21 Penrose diagrams as guides to spacetime 


In the literature, Penrose diagrams are usually sketched, not calculated, the aim being to convey a conceptual 
understanding of the spacetime without obsessing over detail. 

Figure 7.16 shows a Penrose diagram of the Schwarzschild geometry, with the Universe and Black Hole 
regions, and the various boundaries of the diagram, marked. The 45° edges of the Penrose diagram at infinite 
radius, $r = \infty$, are called future and past null infinity, often designated in the mathematical literature 
by $\mathscr{I}_+$ and $\mathscr{I}_-$ (commonly pronounced scri-plus and scri-minus, scri being short for script-I). The corners of 
the Penrose diagram in the infinite past or future are called past and future infinity, often designated $i_-$ 
and $i_+$, while the corner at infinite radius is called spatial infinity, often designated $i_0$. 

The Schwarzschild geometry, being asymptotically flat (Minkowski), has no boundary at infinity. Thus 
the boundary at infinity in the Penrose diagram is not part of the spacetime manifold. However, a worldline 
that extends into the indefinite past converges towards past infinity, while a worldline that extends into the 
indefinite future outside the black hole converges towards future infinity. 

A Penrose diagram is an indispensable guide to finding your way around a complicated spacetime such 
as a black hole. However, a Penrose diagram can be deceiving, because the conformal mapping distorts 
the spacetime. Most of the physical spacetime in the Penrose diagram of the Schwarzschild geometry is 
compressed to the corners of the diagram, to past, future, and spatial infinity, and to the top left point at 
the intersection of the antihorizon with the singularity. 

Figure 7.17 Penrose diagram of the analytically extended Schwarzschild geometry. 

Figure 7.17 shows the Penrose diagram of the analytically extended Schwarzschild geometry, with the four 
regions, Universe, Black Hole, Parallel Universe, and White Hole marked. Again, relativists typically call 
these regions I, II, III, and IV, but I like to give them names. I’ve also given names to the various horizons. 
The names are unconventional, but reasonable. 


Concept question 7.13. Penrose diagram of Minkowski space. Draw a Penrose diagram of Minkowski 
space. 


7.22 Future and past horizons 


Hawking and Ellis (1973) define the future horizon of the worldline of an observer to be the boundary of 
the past lightcone of the continuation of the worldline into the indefinite future. Likewise the past horizon 
of the worldline of an observer is the boundary of the future lightcone of the continuation of the worldline 
into the indefinite past. The definition of future and past horizons is observer-dependent. 

The horizon of a Schwarzschild black hole is a future horizon for observers who remain at a finite distance 
outside the black hole for ever. The antihorizon of a Schwarzschild black hole is a past horizon for observers 
who remained a finite distance outside the black hole in the indefinite past. 

The causal diamond of an observer is the part of spacetime bounded by the observer’s past and future 
horizons. The causal diamond is the region of spacetime to which the observer can, at some point on their 
worldline, send a signal, and from which the observer can, at some other point on their worldline, receive a 
signal. For example, the Universe region of the Penrose diagram 7.16 is the causal diamond of an observer 
who starts at past infinity and ends at future infinity, without falling into the black hole. 


7.23 Oppenheimer-Snyder collapse to a black hole 


Realistic collapse of a star to a black hole is not expected to produce a white hole or parallel universe. 

The simplest model of a collapsing star is a spherical ball of uniform density and zero pressure which free 
falls from zero velocity at infinity, a problem first solved by Oppenheimer and Snyder (1939). In this simple 
model, the interior of the star is described by a collapsing Friedmann-Lemaitre-Robertson-Walker metric 
(see Chapter 10), while the exterior is described by the Schwarzschild solution. The assumption that the star 
collapses from zero velocity at infinity implies that the FLRW geometry is spatially flat, the simplest case. 
To continue the geometry between Schwarzschild and FLRW geometries, it is neatest to use the Gullstrand-Painlevé 
metric, with the Gullstrand-Painlevé infall velocity $\beta$ at the edge of the star set equal to minus $r$ 
times the Hubble parameter of the collapsing FLRW metric, $-rH = -r \, d\ln a/dt$. Section 20.15 describes a 
systematic approach to solving the Oppenheimer-Snyder problem. 
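
The matching condition can be checked with a few lines of Python. This is a sketch under stated assumptions, not the systematic treatment of Section 20.15: it takes units $G = c = 1$, treats the infall velocity as a magnitude $(2M/r)^{1/2}$, and uses the standard spatially flat, pressureless FLRW solution run in reverse, $a \propto (t_c - t)^{2/3}$, where $t_c$ (a symbol introduced here) is the time at which the star reaches zero radius. The function names are illustrative.

```python
import math

def surface_radius(mass, time_to_collapse):
    """Radius r of the star's surface a time t_c - t before it reaches r = 0.

    Obtained by integrating dr/dt = -(2M/r)^(1/2), in units G = c = 1.
    """
    return (4.5 * mass) ** (1.0 / 3.0) * time_to_collapse ** (2.0 / 3.0)

def hubble_parameter(time_to_collapse):
    """H = d ln a / dt for a collapsing flat dust FLRW interior, a ~ (t_c - t)^(2/3).

    Negative, because the interior is contracting.
    """
    return -2.0 / (3.0 * time_to_collapse)

def infall_speed(mass, r):
    """Magnitude (2M/r)^(1/2) of the Gullstrand-Painleve velocity at radius r."""
    return math.sqrt(2.0 * mass / r)

mass = 1.0
for time_to_collapse in (0.1, 4.0 / 3.0, 10.0):
    r = surface_radius(mass, time_to_collapse)
    minus_r_hubble = -r * hubble_parameter(time_to_collapse)
    print(f"t_c - t = {time_to_collapse:7.4f}  r = {r:7.4f}  "
          f"-rH = {minus_r_hubble:.6f}  (2M/r)^(1/2) = {infall_speed(mass, r):.6f}")
```

The two velocities agree at every radius because a uniform star of mass $M = \tfrac{4}{3}\pi\rho r^3$ in a flat FLRW interior with $H^2 = \tfrac{8}{3}\pi\rho$ has $(rH)^2 = 2M/r$. In this sketch the surface passes through the Schwarzschild radius $r = 2M$ at $t_c - t = 4M/3$.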

Figure 7.18 shows the star collapse as seen by an outside observer at rest at a radius of 10 Schwarzschild 
radii. The Figure is correctly ray-traced, taking into account the different travel times of light from the 
various parts of the star to the observer. The collapsing star appears to freeze at the horizon, taking on the 
appearance of a Schwarzschild black hole. 

When Oppenheimer & Snyder first did their calculation, the result seemed paradoxical. An outsider saw 
the collapsing star freeze at its horizon and never get further, even to the end of time. Yet an observer who 
collapsed with the star would find themself falling uneventfully through the horizon to the central singularity 
in a finite proper time. How could these two perspectives be reconciled? 

The solution is that the freezing at the horizon is an illusion. As pictured in Figure 7.2, space is falling at 
the speed of light at the horizon. Light emitted outward at the horizon just hangs there, barrelling at the 
speed of light through space that is falling at the speed of light. It takes an infinite time for light to lift off 
the horizon and make it to the outside world. The star really did collapse, but the infinite light travel time 
from the horizon gives the illusion that the star freezes at the horizon. 
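
The divergence of the light travel time can be made quantitative. The sketch below assumes that, in the Gullstrand-Painlevé metric (7.27) with $c = 1$, a radially outgoing light ray moves at $dr/dt_{\rm ff} = 1 - (r_s/r)^{1/2}$, that is, at the speed of light through space falling inward at $(r_s/r)^{1/2}$; the symbol $t_{\rm ff}$ for the free-fall time and the function name `outgoing_ray_time` are labels introduced here. Integrating in closed form gives the time for a ray to climb from just above the horizon out to a fixed radius.

```python
import math

def outgoing_ray_time(r_start, r_end, r_s=1.0):
    """Free-fall time for a radially outgoing light ray to climb from r_start to r_end.

    Integrates dt_ff = dr / (1 - sqrt(r_s / r)) in closed form.
    Requires r_s < r_start <= r_end. Units c = 1.
    """
    def primitive(r):
        # With u = sqrt(r / r_s), the integrand becomes 2 r_s (u + 1 + 1/(u - 1)) du.
        u = math.sqrt(r / r_s)
        return r_s * (u * u + 2.0 * u + 2.0 * math.log(u - 1.0))
    return primitive(r_end) - primitive(r_start)

# Start the ray ever closer to the horizon and time its climb to r = 10 r_s.
offsets = (1e-2, 1e-4, 1e-6, 1e-8)
times = [outgoing_ray_time(1.0 + eps, 10.0) for eps in offsets]
for eps, t in zip(offsets, times):
    print(f"r_start = 1 + {eps:.0e}: t_ff = {t:.3f}")
```

Each hundredfold reduction of the starting height above the horizon adds roughly $2 \ln 100 \approx 9.2$ units of time, so the travel time grows without bound, logarithmically, as the emission point approaches the horizon.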

That radially outgoing light rays at the horizon remain on the horizon is apparent in the Penrose diagram, 
which shows the horizon as a null line, at 45°. 


7.24 Apparent horizon 


Since light can escape from the surface or interior of the collapsing star as long as it is even slightly larger 
than its Schwarzschild radius, it is possible to take the view that the horizon comes instantaneously into 
being at the moment that the star collapses through its Schwarzschild radius. This definition of the horizon is 
called the apparent horizon. More generally, the apparent horizon is a null surface on which the congruence 
of light rays that form the surface is neither diverging nor converging. In spherically symmetric spacetimes, 
an apparent horizon is a place where radially moving null geodesics remain at rest in circumferential radius $r$, 

$$ \frac{dr}{d\lambda} = 0 . \qquad (7.89) $$

Figure 7.18 Three frames in the collapse of a uniform density, pressureless, spherical star from zero velocity at infinity 
(Oppenheimer and Snyder, 1939), as seen by an outside observer at rest at a radius of 10 Schwarzschild radii. The 
frames are spaced by 10 units of Schwarzschild time ($c = r_s = 1$). The star is made transparent, so you can see inside. 
Two layers are shown, one at the surface of the star, the other at half its radius. The centre of the star is shown as 
a dot. The frames are accurately ray-traced, and include the effect of the different light travel times from different 
parts of the star to the observer. As time goes by, from left to right, the collapsing star appears to freeze at the 
horizon, taking on the appearance of a Schwarzschild black hole. The different layers of the star appear to merge into 
one. The radius of the nearest point on the surface at the time of emission is 3.72, 1.50, and 1.01 Schwarzschild radii 
respectively. 


7.25 True horizon 


An alternative definition of the horizon is to take it to be the boundary between outgoing null rays that 
fall into the black hole versus those that go to infinity. In any evolving situation, this definition of the 
horizon, which is called the true horizon, or absolute horizon, depends formally on what happens in the 
indefinite future, but in a slowly evolving system the absolute horizon can be located with some precision 
without knowing the future. The true horizon is part of the future horizon of an observer who remains at a 
finite distance outside the black hole into the indefinite future. 

Figure 7.19 shows Finkelstein, Kruskal, and Penrose spacetime diagrams of the Oppenheimer-Snyder collapse 
of a star to a Schwarzschild black hole. The diagrams show the freely-falling surface of the collapsing 
star, and the formation of the true horizon and of the singularity. The true horizon of the collapsing star 
forms before the star has collapsed, and grows to meet the apparent horizon as the star falls through its 
Schwarzschild radius. The central singularity forms slightly before the star has collapsed to zero radius. The 
formation of the singularity is marked by the fact that light rays emitted at zero radius cease to be able to 
move outward. In other words, the singularity forms when space starts to fall into it faster than light. 


7.26 Penrose diagrams of Oppenheimer-Snyder collapse 


Figure 7.20 shows a sequence of Penrose diagrams of Oppenheimer-Snyder collapse, progressing in time from 
left to right. The diagrams are drawn from the perspective of an observer before collapse on the left, to 
that of an observer after collapse on the right. The diagrams illustrate that, even though a Penrose diagram 
supposedly encompasses all of the spacetime, it crams most of the spacetime into a few boundary points, and 
the appearance of the diagram can vary dramatically depending on what part of the spacetime the diagram 
is centred on. In Figure 7.20, the Penrose diagram looks like Minkowski well before collapse, and like Schwarzschild 
well after collapse. 

Figure 7.19 Finkelstein, Kruskal-Szekeres, and Penrose spacetime diagrams of the Oppenheimer-Snyder collapse of a 
pressureless, spherical star. The thick (red) line is the surface of the collapsing star. The geometry outside the surface of the 
star is Schwarzschild, and the spacetime diagrams there look like those shown previously, Figures 7.7, 7.8, and 7.13. 
The geometry inside the surface of the star is that of a uniform density, pressureless Friedmann-Lemaître-Robertson-Walker 
universe. The lines of constant time (purple) are lines of constant Schwarzschild time outside the star’s surface, 
and lines of constant FLRW time inside the star’s surface. Lines of constant circumferential radius $r$ (blue) are spaced 
uniformly in the tortoise coordinate $r^*$, equation (7.82), so before collapse appear bunched around the radius $r = r_s$ 
that after collapse becomes the horizon radius. The thick (pink) line at 45° in the Kruskal and Penrose diagrams is the 
true, or absolute, horizon, which divides the spacetime into a region where light rays are trapped, eventually falling 
to zero radius, and a region where light rays can escape to infinity. A singularity (cyan) forms when outgoing light 
rays can no longer escape from zero radius, which happens slightly before the surface of the collapsing star reaches 
zero radius. 

The Penrose diagrams in Figure 7.20 are drawn in the Penrose coordinates defined by equation (7.87) 
with the function $f(z)$ given by equation (7.88). Requiring the singularity to be horizontal, as is conventional, 
imposes that f(z) be odd. Since other choices of f(z) could be made, the shapes of the Penrose diagrams 
are not unique. However, other choices of smooth, monotonic, odd f(z) give diagrams quite similar to those 
shown. In particular, as long as the singularity is chosen to be horizontal, it is impossible to arrange that 
the left edge of the diagram, defined by the centre of the collapsing star at r = 0, be vertical. 

In the evolving Penrose diagram of Figure 7.20, spacetime appears to flow out of future infinity, the point 
at the top right of the diagram, down into past infinity, the point at the bottom of the diagram. Inside the 
horizon, as Schwarzschild time t goes by, spacetime appears to flow to the left, to the top left corner of the 
spacetime diagram. An infaller inside the horizon must of course follow a worldline at less than 45° from 
vertical. However, infallers who fall in at different times fall to different places on the spacelike singularity. 
From the perspective of an outside observer, infallers who fell in long ago are crammed to the top left corner 
of the Penrose diagram. 


7.27 Illusory horizon 


The simple Oppenheimer-Snyder model of stellar collapse shows that the antihorizon of the complete Schwarz- 
schild geometry is replaced by the surface of the collapsing star, and that beyond the star’s surface is not 
a parallel universe and a white hole, but merely the interior of the star, and the distant Universe glimpsed 
through the star’s interior. 

As time goes by, the surface of the collapsing star becomes dimmer and more redshifted, taking on the 
appearance of the Schwarzschild antihorizon, Figure 7.18. The name illusory horizon for the exponentially 
dimming and redshifting surface was coined by Hamilton and Polhemus (2010). Figure 7.21 shows a Penrose 
diagram of a spherical collapsed star, with the true and illusory horizons marked. The Penrose diagram is 
just the limit of the sequence of the diagrams in Figure 7.20 from the perspective of an observer for whom the 
star collapsed long ago. The Penrose diagram 7.21 looks identical to the Penrose diagram of a Schwarzschild 
black hole, Figure 7.13, except that the antihorizon is replaced by the illusory horizon. 

Figure 7.20 Sequence of Penrose diagrams illustrating the Oppenheimer-Snyder collapse of a pressureless, spherical 
star to a Schwarzschild black hole, progressing in time of collapse from left to right. On the left, the collapse is to 
the future of an observer at the centre of the diagram; on the right, the collapse is to the past of an observer at the 
centre of the diagram. The diagrams are at times −16, −4, 0, 4, and 16 Schwarzschild time units ($c = r_s = 1$) relative 
to the middle diagram. On the left the Penrose diagram resembles that of Minkowski space, while on the right the 
diagram resembles that of the Schwarzschild geometry. These Penrose diagrams are spacetime diagrams calculated in 
the Penrose coordinates defined by equations (7.87) and (7.88). 

Unlike the antihorizon, the illusory horizon is not a future or past horizon, as defined by Hawking and Ellis 
(1973). As the Penrose diagrams 7.20 show, the illusory horizon is neither the boundary of the past lightcone 
of the future development of the worldline of any observer, nor the boundary of the future lightcone of the 
past development of the worldline of any observer. 

An object similar to the illusory horizon, the stretched horizon, was introduced by Susskind, Thorlacius, 
and Uglum (1993). The stretched horizon was conceived as the place where, from the perspective of an outside 
observer, Hawking radiation comes from, and the place where, from the perspective of an outside observer, 
the interior quantum states of a black hole reside. The stretched horizon was argued to be located on a 
spacelike surface one Planck area above the true horizon. However, the restriction to an outside observer is 
too limiting, and the notion that the stretched horizon lives literally just above the true horizon has been 
a source of confusion in the theoretical physics literature. If you go down to the true horizon, you do not 
encounter the putative stretched horizon. The stretched horizon is an illusion, a mirage. Better call it the 
illusory horizon. 

Figure 7.21 Penrose diagram of a collapsed spherical star at late times. The Penrose diagram looks essentially identical 
to the Penrose diagram 7.16 of the Schwarzschild geometry, except that the antihorizon is replaced by the illusory 
horizon. The wiggly lines show the paths of outgoing light rays from the illusory horizon, and ingoing light rays from 
the true horizon, as seen by an infaller who falls through the true horizon. An infaller looking directly towards the 
black hole sees the illusory horizon ahead of them, whether they are outside or inside the true horizon. The true 
horizon becomes visible to an infaller only after they have fallen through it. Once inside, the infaller sees the true 
horizon behind them, in the direction away from the black hole. 

Figure 7.22 Six frames from a visualization of the view seen by an observer who free-falls into a Schwarzschild black 
hole. The infaller is on a geodesic with energy per unit mass $E = 1$, and angular momentum per unit mass $L = 1.96\, r_s$. 
From left to right and top to bottom, the observer is at radii 3.008, 1.501, 0.987, 0.508, 0.102, and 0.0132 Schwarzschild 
radii. The illusory horizon is painted with a dark red grid, while the true horizon is painted with a grid coloured with 
an appropriately red- or blue-shifted blackbody colour. The schematic map at the lower left of each frame shows 
the trajectory (white line) of the observer through regions of stable circular orbits (green), unstable circular orbits 
(yellow), no circular orbits (orange), the horizon (red line), and inside the horizon (red). The clock at the lower right 
of each frame shows the proper time left to hit the singularity, in seconds, scaled to the mass $4 \times 10^6 \, M_\odot$ of the Milky 
Way’s supermassive black hole (Ghez et al., 2005; Eisenhauer et al., 2005). The background is Axel Mellinger’s Milky 
Way (Mellinger, 2009) (with permission). 

Figure 7.22 shows six frames from a visualization (Hamilton and Polhemus, 2010) of the appearance of a 
Schwarzschild black hole and its true and illusory horizons as perceived by an observer who free-falls through 
the true horizon. The illusory horizon, the exponentially redshifting image of the long-ago collapsed star, is 
painted with a dark red grid, as befits its dimmed, redshifted appearance. The true horizon is painted with 
blackbody colours blueshifted or redshifted according to the shift that the infalling observer would see on 
an emitter free-falling radially through the true horizon from zero velocity at infinity. When an infaller falls 
through the true horizon, they do not catch up with the illusory horizon, the image of the collapsed star, 
which remains ahead of them. The visualization gives the impression that the illusory horizon is a finite 
distance ahead of the infaller, and this impression is correct: the affine distance between the illusory horizon 
and an infaller at the true horizon is finite, not zero. Calculation of what an infaller sees involves working in 
the locally inertial frame (tetrad) of the infaller, so is deferred until after tetrads. 

An infaller does not encounter the illusory horizon at the true horizon, but, as illustrated by the visual- 
ization 7.22, they do have the impression of encountering the illusory horizon at the singularity. The affine 
distance between the infaller and the illusory horizon tends to zero at the singularity. 


7.28 Collapse of a shell of matter on to a black hole 


The antihorizon of a Schwarzschild black hole is located at the horizon radius, one Schwarzschild radius. 
Where is the illusory horizon located? From the perspective of an observer watching a spherical black hole 
that collapsed from a star long ago, the illusory horizon appears to be located at (exponentially close to) the 
antihorizon of the Schwarzschild black hole of the same mass. 

What happens to the illusory horizon if the black hole accretes mass, and grows larger? Figure 7.23 
shows three frames in the collapse of a thin spherical shell of pressureless matter on to a pre-existing black 
hole, Exercise 20.6. The shell collapses from zero velocity at infinity. As usual in this book, the frames are 
accurately ray-traced. The shell of matter here has the same mass as the pre-existing black hole, so the 
black hole doubles in mass as the shell collapses on to it. The visualization shows that the illusory horizon of 
the pre-existing black hole expands to meet the infalling shell of matter. The apparent expansion is caused 
by gravitational lensing of the pre-existing black hole by the shell. As time goes by, the shell appears to 
merge with the horizon of the pre-existing black hole. The merged shell and expanded horizon take on the 
appearance of the antihorizon of a Schwarzschild black hole of twice the original mass. 

Figure 7.24 shows a Finkelstein spacetime diagram of the collapse of the shell of matter on to the black 
hole. The initial black hole has half the mass of the final black hole. The initial apparent horizon at $0.5\, r_s$, 
half the Schwarzschild radius of the final black hole, follows a null geodesic until the infalling shell hits it. 
The shell deflects the null geodesic, which falls to the central singularity. The true horizon follows a null 
geodesic that joins continuously with the apparent horizon of the final black hole. 

Figure 7.23 Three frames in the collapse of a thin spherical shell of matter on to a pre-existing Schwarzschild black 
hole, as seen by an outside observer at rest at a radius of 10 Schwarzschild radii (Schwarzschild radius of the final 
black hole). The frames are spaced by 10 units of Schwarzschild time (c = rs = 1). The shell has the same mass as 
the original black hole, so the black hole doubles in mass from beginning to end. During the collapse, the horizon of 
the pre-existing black hole appears to expand outward, in due course reaching the size of the new black hole. The 
expansion of the image of the pre-existing black hole is caused by gravitational lensing by the shell. 


Concept question 7.14. Penrose diagram of a thin spherical shell collapsing on to a Schwarz- 
schild black hole. Sketch a Penrose diagram of a thin spherical shell collapsing on to a pre-existing Schwarz- 
schild black hole. Where are the apparent and true horizons? Answer. The Penrose diagram looks essentially 
the same as Figure 7.13 (differing in that lines of constant time and radius are different inside the shell). 
The apparent horizon before collapse follows an outgoing null (45°) line that hits the singularity inside the 
true horizon, consistent with the Finkelstein diagram 7.24. 


7.29 The illusory horizon and black hole thermodynamics 


As will be discussed later in this book, the illusory horizon plays a central role in the thermodynamics of 
black holes. The illusory horizon is the source of Hawking radiation, for observers both outside and inside 
the true horizon. If, as proposed by Susskind, Thorlacius, and Uglum (1993), there is a holographic mapping 
between the interior quantum states of a black hole and its horizon, then that holographic mapping must be 
to the illusory horizon, for observers both outside and inside the true horizon. 

Figure 7.24 Finkelstein spacetime diagram of a thin spherical shell of matter collapsing on to a pre-existing Schwarzschild 
black hole, in units $r_s$ of the Schwarzschild radius of the final black hole. The mass of the (red) shell equals that 
of the pre-existing black hole, so the black hole doubles in mass as a result of accreting the shell. Whereas the apparent 
horizon jumps discontinuously from $0.5\, r_s$ to $1\, r_s$ at the shell boundary, the true horizon increases continuously. The 
mathematics governing a thin spherical shell is addressed in Exercise 20.6. 


7.30 Rindler space and Rindler horizons 


Rindler space is Minkowski space expressed in the coordinates of, and as experienced by, a system of uniformly 
accelerating observers, called Rindler observers. A Rindler observer who accelerates uniformly in their own 
frame with proper acceleration $1/l$, passing through position $\{t, x\} = \{0, l\}$, follows a worldline in Minkowski 
space 

$$ \{t, x\} = l \, \{\sinh\alpha, \cosh\alpha\} \qquad (7.90) $$

with fixed $l$ and varying $\alpha$. The Rindler observer’s worldline follows a point on the rim of the rotating 
spacetime wheel, §1.8.2. The Rindler line-element is the Minkowski line-element expressed in Rindler coordinates 
$\{\alpha, l, y, z\}$, Exercise 2.10, 

$$ ds^2 = - l^2 d\alpha^2 + dl^2 + dy^2 + dz^2 . \qquad (7.91) $$


Despite the fact that Rindler spacetime is Minkowski spacetime in disguise, it nevertheless resembles Schwarz- 
schild spacetime in that, from the perspective of Rindler observers, Rindler space contains horizons. Moreover 
Rindler observers are expected to see Hawking radiation, which in this context is called Unruh (1976) radi- 
ation. 
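
The properties claimed for the Rindler worldline can be verified numerically. The Python sketch below takes the right-quadrant map to be $\{t, x\} = l\,\{\sinh\alpha, \cosh\alpha\}$, and checks three things by finite differences: that a step in $\alpha$ at fixed $l$ has interval $-l^2 d\alpha^2$, that a step in $l$ at fixed $\alpha$ has interval $+dl^2$, and that the proper acceleration along fixed $l$ is $1/l$. The function names and the sample values $\alpha = 0.7$, $l = 2$ are arbitrary choices made for illustration.

```python
import math

def rindler_to_minkowski(alpha, l):
    """Right-quadrant map from Rindler (alpha, l) to Minkowski (t, x)."""
    return l * math.sinh(alpha), l * math.cosh(alpha)

def interval(event_a, event_b):
    """Minkowski interval -dt^2 + dx^2 between two events (t, x)."""
    dt = event_b[0] - event_a[0]
    dx = event_b[1] - event_a[1]
    return -dt * dt + dx * dx

def proper_acceleration(alpha, l, step=1e-3):
    """Magnitude of d^2{t, x}/dtau^2 along fixed l, by central differences.

    Proper time along the worldline is tau = l * alpha, so dtau = l * dalpha.
    """
    before = rindler_to_minkowski(alpha - step, l)
    here = rindler_to_minkowski(alpha, l)
    after = rindler_to_minkowski(alpha + step, l)
    dtau = l * step
    a_t = (after[0] - 2.0 * here[0] + before[0]) / dtau ** 2
    a_x = (after[1] - 2.0 * here[1] + before[1]) / dtau ** 2
    return math.sqrt(-a_t * a_t + a_x * a_x)

alpha, l, small = 0.7, 2.0, 1e-4
here = rindler_to_minkowski(alpha, l)
ds2_alpha = interval(here, rindler_to_minkowski(alpha + small, l))  # close to -l^2 dalpha^2
ds2_l = interval(here, rindler_to_minkowski(alpha, l + small))      # close to +dl^2
acceleration = proper_acceleration(alpha, l)                        # close to 1/l
print(ds2_alpha, ds2_l, acceleration)
```

The worldline is the hyperbola $x^2 - t^2 = l^2$, so observers at smaller $l$ hug the null lines through the origin more tightly and must accelerate harder, in proportion to $1/l$.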

Figure 7.25 shows a Rindler diagram, a spacetime diagram of Minkowski space, drawn in standard 
Minkowski coordinates $t$ and $x$, showing lines of constant Rindler coordinates $\alpha$ and $l$. The Rindler spacelike 
coordinate $l$ is positive in the right quadrant, negative in the left quadrant. The Rindler coordinate vanishes, 
$l = 0$, at the boundaries of the right and left quadrants, which form the null lines at 45° passing through 
the origin in the Rindler diagram 7.25. The Rindler metric (7.91) has a coordinate singularity at $l = 0$. In 
the upper and lower quadrants, the Rindler coordinate $l$ switches from being spacelike to timelike ($dl^2 < 0$). 
Rindler coordinates in the upper and lower quadrants are defined by 

$$ \{t, x\} = l \, \{\cosh\alpha, \sinh\alpha\} , \qquad (7.92) $$

where the timelike coordinate $l$ is positive in the upper quadrant, negative in the lower quadrant. 

Figure 7.25 Rindler diagram, which is a Minkowski spacetime diagram showing lines of constant Rindler coordinates 
$\alpha$ and $l$, equations (7.90) and (7.92). The Rindler lines are uniformly spaced by 0.2 in $\alpha$ and $\ln l$. The spacetime 
diagram resembles that of the analytically extended Schwarzschild geometry in Kruskal coordinates, Figure 7.11. The 
null lines passing through the origin constitute future (line from lower left to upper right) and past (line from lower 
right to upper left) horizons for Rindler observers in the right quadrant. 

The null (45°) lines passing through the origin in Figure 7.25 are future and past horizons for Rindler 
observers in the right quadrant of the Rindler diagram. A Rindler observer following a worldline (7.90) in 
the right quadrant never gets to see the part of spacetime to the future of the null surface x = t, which 
therefore constitutes a future horizon for the Rindler observer. The same Rindler observer can never send a 
signal into the part of spacetime to the past of the null surface x = −t, which therefore constitutes a past
horizon, an antihorizon, for the Rindler observer. 
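
The role of the null surface $x = t$ as a future horizon can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions, not part of the original text: the function name `reception_alpha` is hypothetical, and only light rays moving in the positive $x$-direction are considered. Along such a ray $x - t$ is constant, while on the worldline (7.90) $x - t = l e^{-\alpha}$, so the ray through an event $\{t_e, x_e\}$ meets the observer at $\alpha = -\ln[(x_e - t_e)/l]$, which exists only if $x_e > t_e$.

```python
import math

def reception_alpha(t_e, x_e, l=1.0):
    """Rindler time alpha at which the observer at fixed l receives a light ray
    emitted in the +x direction from the Minkowski event (t_e, x_e).
    Returns None if the ray is never received."""
    u = x_e - t_e                      # constant along a rightward-moving null ray
    if u <= 0.0:
        return None                    # event lies on or beyond the future horizon x = t
    alpha = -math.log(u / l)           # solve l * exp(-alpha) = u
    if l * math.sinh(alpha) < t_e:
        return None                    # event is to the right of the observer: the ray recedes
    return alpha

inside = reception_alpha(0.0, 0.5)          # emitted between the horizon and the observer
near_horizon = reception_alpha(0.0, 1e-3)   # emitted close to the horizon
beyond = reception_alpha(1.0, 0.5)          # emitted to the future of x = t
print(inside, near_horizon, beyond)
```

Events closer to the null surface are received at ever later Rindler time, and events beyond it are never received.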

The Rindler diagram 7.25 resembles the Kruskal diagram 7.11 of the analytically extended Schwarzschild
geometry, albeit without singularities. The Minkowski coordinates $t$ and $x$ are analogues of the Kruskal
coordinates $t_K$ and $r_K$, while the Rindler coordinates $\alpha$ and $l$ are analogues of the Schwarzschild coordinates
$t$ and $r$. The Schwarzschild and Rindler time coordinates $t$ and $\alpha$ are both Killing coordinates, §7.32. Lines
of constant Schwarzschild and Rindler time $t$ and $\alpha$ follow straight lines in the corresponding Kruskal and
Rindler diagrams, Figures 7.11 and 7.25. The Schwarzschild and Rindler spatial coordinates $r$ and $l$ are
spacelike in the right and left quadrants, timelike in the upper and lower quadrants.


7.30.1 Penrose diagram of Rindler space 


Figure 7.26 is a Penrose diagram of Rindler space. This is just a Penrose diagram of Minkowski space showing
lines of constant Rindler coordinates $\alpha$ and $l$. Penrose time and space coordinates $t_P$ and $x_P$ can be defined
by any conformal transformation


$$ t_P \pm x_P = f(t \pm x) \tag{7.93} $$


for which $f(z)$ is finite at $z \rightarrow \pm\infty$. The Rindler lines acquire a symmetrical appearance on the Penrose
diagram provided that the conformal function $f(z)$ is chosen to satisfy $f(z) + f(-z) = \text{constant}$. For the
Penrose diagram in Figure 7.26, the conformal function $f(z)$ is

Figure 7.26 Penrose diagram of Rindler space. This is the Penrose diagram of Minkowski space corresponding to
the Rindler diagram 7.25. The Penrose coordinates $t_P$ and $x_P$ are related to Minkowski coordinates $t$ and $x$ by
equations (7.93). The Rindler lines are uniformly spaced by 0.4 in $\alpha$ and $\ln l$. The Penrose diagram resembles that of
the analytically extended Schwarzschild geometry, Figure 7.15, but without singularities.

The choice (7.94) is inspired by the form (10.181) of the coordinates that gives the Penrose diagram of de
Sitter space a symmetrical appearance. The Penrose diagram 7.26 resembles that of the analytically extended 
Schwarzschild geometry, Figure 7.15, but without singularities. 
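
The conformal map (7.93) is easy to experiment with. The sketch below is an illustration only: the function $f(z) = (2/\pi) \arctan z$ is an assumed example that is finite at $z \rightarrow \pm\infty$ and satisfies $f(z) + f(-z) = 0$, and it is not claimed to be the choice (7.94). The names `f` and `penrose` are likewise hypothetical. The code checks that Rindler events at $+\alpha$ and $-\alpha$ on the same worldline map to mirror-image points in the Penrose diagram, and that distant events land inside a finite diamond.

```python
import math

def f(z):
    """Assumed conformal function: finite as z -> +-infinity, with f(z) + f(-z) = 0."""
    return (2.0 / math.pi) * math.atan(z)

def penrose(t, x):
    """Penrose coordinates (t_P, x_P) from t_P +- x_P = f(t +- x), equation (7.93)."""
    plus, minus = f(t + x), f(t - x)
    return 0.5 * (plus + minus), 0.5 * (plus - minus)

def rindler_to_minkowski(alpha, l):
    """Right-quadrant Rindler worldline, equation (7.90)."""
    return l * math.sinh(alpha), l * math.cosh(alpha)

l = 1.5                                               # sample Rindler observer
tp_future, xp_future = penrose(*rindler_to_minkowski(+0.8, l))
tp_past, xp_past = penrose(*rindler_to_minkowski(-0.8, l))
tp_far, xp_far = penrose(1.0e9, 2.0e9)                # a very distant event

print(tp_future, xp_future)
print(tp_past, xp_past)      # same x_P, opposite t_P
print(tp_far, xp_far)        # remains within the unit diamond
```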


Concept question 7.15. Spherical Rindler space. The Rindler line-element (7.91) is plane-parallel,
with all the Rindler observers accelerating in the $x$-direction. Would not a better analogue of a spherical
black hole be the spherically symmetric Rindler line-element


$$ ds^2 = - r_R^2 \, d\alpha^2 + dr_R^2 + r^2 do^2 , \tag{7.95} $$


where all Rindler observers accelerate in the radial direction with $\{t, r\} = r_R \{\sinh\alpha, \cosh\alpha\}$? Answer. The
spherical Rindler line-element (7.95) is indeed a viable line-element. However, it does not provide a better
analogue of a spherical black hole because the past and future horizons of a Rindler observer accelerating
in, say, the $x$-direction are flat surfaces at $x \pm t = 0$, not spherical surfaces at $r \pm t = 0$.


Figure 7.27 Minkowski spacetime diagram, showing worldlines of observers who start at rest, then begin accelerating
uniformly, as Rindler observers, at $t = 0$. At $t < 0$, the lines are lines of constant Minkowski time and space $t$ and $x$,
while at $t > 0$, the lines are lines of constant Rindler time and space $\alpha$ and $l$, equations (7.90) and (7.92). The Rindler
lines are uniformly spaced by 0.2 in $\alpha$ and $\ln l$. The null line starting at the origin $\{t, x\} = \{0, 0\}$ extending upward
at 45° from vertical is a future horizon for the Rindler observers.


7.31 Rindler observers who start at rest, then accelerate


Rindler space provides an analogue of the analytically extended Schwarzschild geometry. But a spherical 
black hole formed from the collapse of a star is not described by the analytically extended geometry. Rather, 
the analytic extension through the antihorizon is replaced by the interior of the collapsed star. 


A Rindler analogue of a black hole that forms from the collapse of a star is obtained by considering a 
system of Rindler observers who are initially at rest, and begin accelerating only at some time t = 0. The 
situation is illustrated in the spacetime diagram shown in Figure 7.27. This diagram is similar to the Rindler 
diagram 7.25, except that the Rindler observers start accelerating at t = 0 instead of having been accelerating 
into the indefinite past. Just as a black hole formed from the collapse of a star has a future horizon but no 
past horizon, so also the Rindler space of Rindler observers who start at rest contains a future horizon but 
no past horizon. 


Despite having no past horizon, a Rindler observer who starts from rest sees an illusory horizon form, 
Figure 7.28, in much the same way that an observer watching a star collapse to a black hole sees an illu- 
sory horizon form, Figure 7.18. The illusory horizon is the exponentially dimming and redshifting image of 
Minkowski space around the Rindler observer. Figure 7.28 shows three frames in the appearance of a portion 
of Minkowski space as seen by a Rindler observer watching rearward. As time goes by, Minkowski space
appears to compress and freeze toward a surface, the illusory horizon. The Rindler observer sees the illusory
horizon dim and redshift exponentially. Exercise 7.16 quantifies the appearance of the Rindler illusory
horizon, which forms a parabola around the Rindler observer, with the Rindler observer at its focus.


Figure 7.28 Three frames in the appearance of Minkowski space as seen by a uniformly accelerating observer, a Rindler 
observer. Minkowski space is represented by a unit box at rest, centred at the origin. The box is drawn as a 5 × 5 × 5
lattice. The Rindler observer starts at rest at unit distance from the origin, and watches rearward while accelerating 
at unit acceleration away from the box. The field of view is 120° across the horizontal. The frames increase in time 
from left to right, and are at 0, 2, and 4 units of proper time after the Rindler observer begins accelerating. As time 
goes by, the lattice appears to freeze towards a two-dimensional surface, the illusory Rindler horizon.


7.31.1 Penrose diagram of Rindler observers who start at rest, then accelerate


Figure 7.29 shows a sequence of Penrose diagrams drawn from the perspective of Rindler observers who start 
at rest and begin to accelerate at time t = 0, as in the spacetime diagram 7.27. These Penrose diagrams are 
calculated, not sketched, with Penrose coordinates given by equations (7.93). The left edge of each diagram 
is the surface at x = 0. This sequence resembles the sequence of Penrose diagrams of Oppenheimer-Snyder 
collapse of a star to a black hole, Figure 7.20, except that there is no singularity. 


At left, before the observers start to accelerate, the Penrose diagram looks like that of Minkowski space. 
The Rindler portion of the spacetime (the part above the green line) is crammed along the top right edge of 
the Penrose diagram. At right, after the Rindler observers have started to accelerate, the Penrose diagram 
is tilted by the Lorentz boost of the Minkowski space. The Minkowski portion of the spacetime (the part 
below the green line) crams towards the bottom right edge of the diagram. 


Aren’t the Penrose diagrams in Figure 7.29 misleading because they omit the spacetime to the left of
the diagrams, at $x < 0$? Since Rindler observers are confined to the right quadrant of Rindler space, they
never get to see the region beyond their future horizon. Therefore there is no loss of generality to draw
the Minkowski spacetime diagram 7.27 with reflection symmetry about $x = 0$. Applied to the Penrose
diagrams 7.29, reflection symmetry means that light that passes from $x < 0$ to $x > 0$ can be
considered to “bounce” at 45° off the left edge of the diagram at $x = 0$. Whatever the case, as seen in
Exercise 7.16, light emitted from $x < 0$ appears to a Rindler observer asymptotically to dim, redshift, and
freeze at the observer’s illusory horizon.


Figure 7.29 Sequence of Penrose diagrams of the Minkowski space shown in Figure 7.27, progressing in time from left
to right. The left edge of each diagram is the surface at $x = 0$. The diagrams at left are in the frames of observers
who are at rest relative to each other. In the middle diagram, the observers start to accelerate as Rindler observers.
The diagrams at right are in the frames of the Rindler observers, which become progressively more Lorentz boosted
compared to the rest frame. The diagrams are at times −8, −2, 0, 2, and 8 units of proper time of the observer who is
initially at rest at unit distance ($l = 1$ in Figure 7.27) from the origin. The Rindler lines are uniformly spaced by 0.4
in $\alpha$ and $\ln l$. This sequence of Penrose diagrams resembles that of the Oppenheimer-Snyder collapse of a star shown
in Figure 7.20.


Exercise 7.16. Rindler illusory horizon. The purpose of this problem is to figure out the appearance
to a Rindler observer of their illusory horizon. For simplicity, choose time units such that the Rindler
observer accelerates with unit acceleration. The coordinates $\{x, y, z\}$ are spatial coordinates in Minkowski
space. Starting from rest on the $x$-axis at position $x = 1$, the Rindler observer accelerates in the positive
$x$-direction, reaching position $x = x_0$ in the rest frame. After a sufficiently long Rindler proper time $\alpha$, the
position $x_0 = \cosh\alpha$ is large.
1. Shape. Show that points $\{x, y, z\}$ that are close to the origin, in the sense of satisfying $|x| \ll x_0$ and
$\sqrt{y^2 + z^2} \ll x_0$, appear to a Rindler observer to freeze towards a time-independent surface $\{l, y, z\}$, the
illusory horizon, satisfying


$$ l = \tfrac{1}{2} ( y^2 + z^2 - 1 ) . \tag{7.96} $$


The Rindler observer sees their illusory horizon as a parabola with themself at the focus, the origin.

2. Redshift. Show further that the Rindler observer sees points on the illusory horizon redshifting
exponentially, at rate $e^\alpha$.
Solution. 

1. Shape. In the Minkowski rest frame, a spatial point $\{x, y, z\}$ relative to an observer at $\{x_0, 0, 0\}$ is at
position $\{x - x_0, y, z\}$. If the observer is moving at velocity $v$ in the $x$-direction, then according to the
rules of 4-dimensional perspective, §1.13.2, the point appears in the observer’s frame to lie at position
$\{l, y, z\}$ with transverse coordinates $y$, $z$ unchanged, and $l$ given by


$$ l = \gamma (x - x_0) + \gamma v \sqrt{(x - x_0)^2 + y^2 + z^2} , \tag{7.97} $$


where $\gamma = 1/\sqrt{1 - v^2}$ is the Lorentz gamma factor. Points near the origin, with $|x| \ll x_0$, are behind the
observer, satisfying $x - x_0 < 0$. Thus equation (7.97) factors to


$$ l = \gamma (x - x_0) \left[ 1 - v \sqrt{1 + (y^2 + z^2)/(x - x_0)^2} \right] , \tag{7.98} $$


which rearranges to


$$ l = \frac{\gamma (x - x_0) \left[ 1 - v^2 - v^2 (y^2 + z^2)/(x - x_0)^2 \right]}{1 + v \sqrt{1 + (y^2 + z^2)/(x - x_0)^2}}
     = \frac{1}{1 + v \sqrt{1 + (y^2 + z^2)/(x - x_0)^2}} \left[ \frac{x - x_0}{\gamma} - \frac{\gamma v^2 (y^2 + z^2)}{x - x_0} \right] . \tag{7.99} $$


For a Rindler observer, the position $x_0$ is just equal to the Lorentz gamma factor, $x_0 = \cosh\alpha = \gamma$.
Under the conditions $x_0 = \gamma \gg 1$, along with $x_0 \gg |x|$ and $x_0 \gg \sqrt{y^2 + z^2}$, equation (7.99) reduces to


$$ l \approx \tfrac{1}{2} ( - 1 + y^2 + z^2 ) , \tag{7.100} $$


yielding equation (7.96) as claimed.
2. Redshift. According to the rules of 4-dimensional perspective, §1.13.2, the redshift factor, the ratio
$E_{\rm em}/E_{\rm obs}$ of emitted to observed photon energies from a point, equals the ratio of the emitted to
observed distances to the point,


$$ \frac{E_{\rm em}}{E_{\rm obs}} = \frac{\sqrt{(x - x_0)^2 + y^2 + z^2}}{\sqrt{l^2 + y^2 + z^2}} . \tag{7.101} $$


A point $\{l, y, z\}$ on the Rindler observer’s illusory horizon appears fixed to the observer, $l$ satisfying
equation (7.100). The only quantity on the right hand side of equation (7.101) that varies with the
Rindler observer’s time $\alpha$ is $x_0$. Under the conditions $x_0 = \cosh\alpha \gg 1$, along with $x_0 \gg |x|$ and
$x_0 \gg \sqrt{y^2 + z^2}$, the redshift factor satisfies


$$ \frac{E_{\rm em}}{E_{\rm obs}} \propto x_0 \propto e^\alpha . \tag{7.102} $$


The redshift factor of a point on the Rindler observer’s illusory horizon thus increases exponentially
with Rindler time $\alpha$.
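
The asymptotic results of Exercise 7.16 can be checked numerically. The following is a minimal sketch with hypothetical function names and an arbitrarily chosen sample point. It evaluates the apparent position (7.97) and the redshift factor (7.101) exactly for a large Rindler time, and compares them with the limiting surface (7.96) and with exponential growth in $\alpha$. The prefactor $1/(1 + y^2 + z^2)$ used in the comparison is not quoted from the text: it follows from substituting (7.96) into the denominator of (7.101) and using $\cosh\alpha \approx e^\alpha/2$.

```python
import math

def apparent_position_and_redshift(x, y, z, alpha):
    """Apparent longitudinal position l, equation (7.97), and redshift factor
    E_em/E_obs, equation (7.101), of the rest-frame point (x, y, z) as seen by a
    unit-acceleration Rindler observer at Rindler time alpha."""
    x0 = math.cosh(alpha)        # observer position in the rest frame
    gamma = math.cosh(alpha)     # Lorentz gamma factor, equal to x0
    v = math.tanh(alpha)         # observer velocity
    emitted_distance = math.sqrt((x - x0) ** 2 + y ** 2 + z ** 2)
    l = gamma * (x - x0) + gamma * v * emitted_distance
    observed_distance = math.sqrt(l ** 2 + y ** 2 + z ** 2)
    return l, emitted_distance / observed_distance

x, y, z = 0.3, 0.5, 0.2                       # sample point near the origin (an assumption)
l_late, redshift_late = apparent_position_and_redshift(x, y, z, 10.0)
_, redshift_later = apparent_position_and_redshift(x, y, z, 11.0)

l_limit = 0.5 * (y ** 2 + z ** 2 - 1.0)       # illusory horizon, equation (7.96)
redshift_limit = math.exp(10.0) / (1.0 + y ** 2 + z ** 2)

print(l_late, l_limit)                        # apparent position has frozen onto the parabola
print(redshift_late / redshift_limit)         # close to 1
print(redshift_later / redshift_late)         # close to e: one e-fold per unit of alpha
```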


Exercise 7.17. Area of the Rindler horizon. What is the area of a Rindler observer’s horizon?
Solution. The area of the Rindler horizon is the area of the spatial $y$-$z$ plane orthogonal to the Rindler
observer’s boost plane $t$-$x$. For a Rindler observer who starts accelerating at a finite time, the illusory horizon
after $\alpha$ acceleration times is well-formed only over a region of size $\sqrt{y^2 + z^2} \lesssim e^\alpha$ about the origin. Thus the
area of the illusory Rindler horizon is of order $\sim e^{2\alpha}$.


7.32 Killing vectors 


The Schwarzschild metric presents an opportunity to introduce the concept of Killing vectors (after Wil- 
helm Killing, not because the vectors kill things, though the latter is true), which are associated with 
symmetries of the spacetime. The flow through spacetime of the Killing vectors associated with a symmetry
is called the Killing vector field. A coordinate that is constant along the flow lines of a Killing vector field 
is called a Killing coordinate. 


7.32.1 Time translation symmetry 


The time translation invariance of the Schwarzschild geometry is evident from the fact that the metric is
independent of the Schwarzschild time coordinate $t$. Equivalently, the partial time derivative $\partial/\partial t$ of the
Schwarzschild metric is zero. The associated Killing vector $\xi^\mu$ at each point of the spacetime is then defined
by


$$ \xi^\mu \frac{\partial}{\partial x^\mu} = \frac{\partial}{\partial t} , \tag{7.103} $$


so that in Schwarzschild coordinates $\{t, r, \theta, \phi\}$

$$ \xi^\mu = \{1, 0, 0, 0\} . \tag{7.104} $$

In coordinate-independent notation, the Killing vector is

$$ \boldsymbol{\xi} = \boldsymbol{e}_\mu \xi^\mu = \boldsymbol{e}_t . \tag{7.105} $$


The Schwarzschild time coordinate $t$ is a Killing coordinate.

This may seem like overkill — couldn’t one just say that the metric is independent of time t and be done 
with it? The answer is that symmetries are not always evident from the metric, as will be seen in the next 
section 7.32.2. 

Because the Killing vector $\boldsymbol{e}_t$ is the unique timelike Killing vector of the Schwarzschild geometry, it has
a definite meaning independent of the coordinate system. It follows that its scalar product with itself is a
coordinate-independent scalar

$$ \xi_\mu \xi^\mu = \boldsymbol{e}_t \cdot \boldsymbol{e}_t = g_{tt} = - \left( 1 - \frac{2M}{r} \right) . \tag{7.106} $$


In curved spacetimes, it is important to be able to identify scalars, which have a physical meaning independent 
of the choice of coordinates. 


7.32.2 Spherical symmetry 


The azimuthal rotational symmetry of the Schwarzschild metric is evident from the fact that the metric is
independent of the azimuthal coordinate $\phi$, implying that $\phi$ is a Killing coordinate. The associated Killing
vector at each point of the spacetime is

$$ \boldsymbol{e}_\phi \tag{7.107} $$

with components $\{0, 0, 0, 1\}$ in Schwarzschild coordinates $\{t, r, \theta, \phi\}$. Figure 7.30 illustrates the Killing vector
field corresponding to the azimuthal rotational symmetry.

Figure 7.30 The Killing vector field associated with rotation of a 2-sphere about an axis.

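The Schwarzschild metric is fully spherically symmetric, not just azimuthally symmetric. Since the 3D
rotation group O(3) is 3-dimensional, it is to be expected that there are three Killing vectors. You may
recognize from quantum mechanics that $\partial/\partial\phi$ is (modulo factors of $i$ and $\hbar$) the $z$-component of the angular
momentum operator $\boldsymbol{L} = \{L_x, L_y, L_z\}$ in a coordinate system where the azimuthal axis is the $z$-axis. The 3
components of the angular momentum operator are given by:


$$ i L_x = y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} = - \sin\phi \, \frac{\partial}{\partial\theta} - \cot\theta \cos\phi \, \frac{\partial}{\partial\phi} , \tag{7.108a} $$

$$ i L_y = z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} = \cos\phi \, \frac{\partial}{\partial\theta} - \cot\theta \sin\phi \, \frac{\partial}{\partial\phi} , \tag{7.108b} $$

$$ i L_z = x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} = \frac{\partial}{\partial\phi} . \tag{7.108c} $$

The 3 rotational Killing vectors are correspondingly:

$$ \text{rotation about $x$-axis:} \quad - \sin\phi \, \boldsymbol{e}_\theta - \cot\theta \cos\phi \, \boldsymbol{e}_\phi , \tag{7.109a} $$

$$ \text{rotation about $y$-axis:} \quad \cos\phi \, \boldsymbol{e}_\theta - \cot\theta \sin\phi \, \boldsymbol{e}_\phi , \tag{7.109b} $$

$$ \text{rotation about $z$-axis:} \quad \boldsymbol{e}_\phi . \tag{7.109c} $$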


The 3 Killing vectors span the 2-dimensional surface of the unit sphere, and are therefore not linearly
independent. Specifically, they satisfy


$$ x L_x + y L_y + z L_z = 0 . \tag{7.110} $$


Note that although a linear combination of Killing vectors with constant coefficients is a Killing vector, a 
linear combination with non-constant coefficients is not necessarily a Killing vector. 

You can check that the action of the $x$ and $y$ rotational Killing vectors on the metric does not kill the
metric. For example, $i L_y \, g_{\phi\phi} = 2 r^2 \cos\phi \sin\theta \cos\theta$ does not vanish. This example shows that a more powerful
and general condition, described in the next section 7.32.3, is needed to establish whether a quantity is or is
not a Killing vector.

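Because spherical symmetry does not define a unique azimuthal axis $\boldsymbol{e}_\phi$, its scalar product with itself
$\boldsymbol{e}_\phi \cdot \boldsymbol{e}_\phi = g_{\phi\phi} = r^2 \sin^2\theta$ is not a coordinate-invariant scalar. However, the sum of the scalar products of
the 3 rotational Killing vectors is rotationally invariant, and is therefore a coordinate-invariant scalar


$$ ( - \sin\phi \, \boldsymbol{e}_\theta - \cot\theta \cos\phi \, \boldsymbol{e}_\phi )^2 + ( \cos\phi \, \boldsymbol{e}_\theta - \cot\theta \sin\phi \, \boldsymbol{e}_\phi )^2 + \boldsymbol{e}_\phi^2 = g_{\theta\theta} + ( \cot^2\theta + 1 ) g_{\phi\phi} = 2 r^2 . \tag{7.111} $$


This shows that the circumferential radius $r$ is a scalar, as you would expect.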
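
The algebra behind equations (7.110) and (7.111) can be verified numerically at any point of the sphere. The sketch below is illustrative, with hypothetical function names and an arbitrary sample point. It assumes the angular metric components $g_{\theta\theta} = r^2$ and $g_{\phi\phi} = r^2 \sin^2\theta$ of the $-+++$ signature, and takes the components of the rotational Killing vectors from equations (7.109).

```python
import math

def rotational_killing_vectors(theta, phi):
    """Components (xi^theta, xi^phi) of the three rotational Killing vectors (7.109)."""
    cot = math.cos(theta) / math.sin(theta)
    about_x = (-math.sin(phi), -cot * math.cos(phi))
    about_y = (math.cos(phi), -cot * math.sin(phi))
    about_z = (0.0, 1.0)
    return about_x, about_y, about_z

def norm_squared(xi, r, theta):
    """Scalar product of an angular vector with itself, using g_thth = r^2 and
    g_phph = r^2 sin^2(theta)."""
    return r ** 2 * xi[0] ** 2 + (r * math.sin(theta)) ** 2 * xi[1] ** 2

r, theta, phi = 3.0, 1.1, 0.4                       # arbitrary sample point
kx, ky, kz = rotational_killing_vectors(theta, phi)

# Sum of the three scalar products, to be compared with 2 r^2, equation (7.111).
total = sum(norm_squared(k, r, theta) for k in (kx, ky, kz))

# Linear dependence with position-dependent coefficients, equation (7.110).
x = r * math.sin(theta) * math.cos(phi)
y = r * math.sin(theta) * math.sin(phi)
z = r * math.cos(theta)
dependence = tuple(x * kx[i] + y * ky[i] + z * kz[i] for i in range(2))

print(total, 2.0 * r ** 2)
print(dependence)            # both components vanish to rounding error
```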


7.32.3 Killing equation


As seen in the previous section, a Killing vector does not always kill the metric in a given coordinate system. 
This is not really surprising given the arbitrariness of coordinates in general relativity. What is true is that 
a quantity is a Killing vector if and only if there exists a coordinate system (possibly in patches) such that 
the Killing vector kills the metric in that system. 

Suppose that in some coordinate system the metric is independent of the coordinate $\phi$. Then the covariant
momentum $p_\phi$ of a particle along a geodesic is a constant of motion, equation (4.50),


$$ p_\phi = \text{constant} . \tag{7.112} $$

Equivalently

$$ \xi^\nu p_\nu = \text{constant} , \tag{7.113} $$

where $\xi^\nu$ is the associated Killing vector, whose only non-zero component is $\xi^\phi = 1$ in this particular
coordinate system. The converse is also true: if $\xi^\nu p_\nu = \text{constant}$ along all geodesics, then $\xi^\nu$ is a Killing
vector. The constancy of $\xi^\nu p_\nu$ along all geodesics is equivalent to the condition that its affine derivative
vanish along all geodesics


$$ \frac{d \xi^\nu p_\nu}{d\lambda} = 0 . \tag{7.114} $$

But this is equivalent to

$$ 0 = p^\mu \mathring{D}_\mu ( \xi^\nu p_\nu ) = p^\mu p^\nu \mathring{D}_\mu \xi_\nu = \tfrac{1}{2} p^\mu p^\nu ( \mathring{D}_\mu \xi_\nu + \mathring{D}_\nu \xi_\mu ) , \tag{7.115} $$


the ° atop $\mathring{D}_\mu$ serving as a reminder that this is the torsion-free covariant derivative, §2.12. The second
equality of equations (7.115) follows from the geodesic equation, $p^\mu \mathring{D}_\mu p_\nu = 0$, and the last equality is true
because of the symmetry of $p^\mu p^\nu$ in $\mu \leftrightarrow \nu$. A necessary and sufficient condition for equation (7.115) to be
true for all geodesics is that


$$ \mathring{D}_{(\mu} \xi_{\nu)} = 0 , \tag{7.116} $$


which is Killing’s equation. This equation is the desired necessary and sufficient condition for $\xi^\nu$ to be
a Killing vector. It is a generally covariant equation, valid in any coordinate system. Equation (7.116) can
also be written as the statement that the Lie derivative of the metric, equation (7.154), along the Killing
direction $\xi^\nu$ vanishes,


$$ \mathcal{L}_\xi g_{\mu\nu} = 0 . \tag{7.117} $$


7.32.4 Conformal Killing vector 


Sometimes a spacetime has a weaker conformal symmetry in which, instead of the metric being independent
of a coordinate (in some system of coordinates), the metric depends on a coordinate $\phi$ only through an
overall scaling, $g_{\mu\nu} \propto e^{2\phi}$, equation (4.53). In that case the covariant momentum $p_\phi$ is constant only along
null geodesics, equation (4.56),


$$ p_\phi = \text{constant along null geodesics} . \tag{7.118} $$


The associated conformal Killing vector $\xi^\nu$, satisfying equation (7.113), is the vector whose only non-zero
component is $\xi^\phi = 1$ in a coordinate system where $\phi$ is one of the coordinates. Equation (7.115) is modified
to


$$ 0 = p^\mu p^\nu \left( \mathring{D}_\mu \xi_\nu - \tfrac{1}{4} g_{\mu\nu} \mathring{D}_\kappa \xi^\kappa \right) , \tag{7.119} $$


which holds because $p^\mu p^\nu g_{\mu\nu} = 0$ for null geodesics. A necessary and sufficient condition for equation (7.119) to hold
for all null geodesics is the conformal Killing equation


$$ \mathring{D}_{(\mu} \xi_{\nu)} - \tfrac{1}{4} g_{\mu\nu} \mathring{D}_\kappa \xi^\kappa = 0 , \tag{7.120} $$


the left hand side of which is the trace-free part of $\mathring{D}_{(\mu} \xi_{\nu)}$. The factor of $\tfrac{1}{4}$ in equations (7.119) and (7.120)
is for 4 spacetime dimensions (where $g^{\mu\nu} g_{\mu\nu} = 4$); the factor should be replaced by $1/N$ in $N$ spacetime
dimensions.


7.33 Killing tensors 


Some symmetries are expressed by Killing tensors $\xi^{\mu\nu}$ rather than Killing vectors. Whereas for a Killing
vector, $\xi^\nu p_\nu$ is a constant of motion along geodesics, equation (7.113), for a Killing tensor


$$ \xi^{\mu\nu} p_\mu p_\nu = \text{constant} . \tag{7.121} $$


A Killing tensor $\xi^{\mu\nu}$ is symmetric without loss of generality. The metric $g_{\mu\nu}$ is itself a Killing tensor in any
spacetime, since


$$ g^{\mu\nu} p_\mu p_\nu = - m^2 = \text{constant} . \tag{7.122} $$

The condition of the constancy of $\xi^{\mu\nu} p_\mu p_\nu$ along geodesics is equivalent to the condition that its affine
derivative vanishes along all geodesics, analogously to equation (7.114). A necessary and sufficient condition
for this to be true is Killing’s equation


$$ \mathring{D}_{(\kappa} \xi_{\mu\nu)} = 0 , \tag{7.123} $$


where the parentheses denote symmetrization over all indices.
A conformal Killing tensor is one that satisfies equation (7.121) only along null geodesics. The corresponding
Killing equation is the trace-free part of equation (7.123).


7.34 Lie derivative


It was remarked above that Killing’s equation (7.116) can be recast as the statement that the Lie derivative 
of the metric along the Killing vector vanishes, equation (7.117). This section presents an exposition of the 
Lie derivative. 

The Lie derivative of a coordinate tensor, whose mathematical form is derived in §§7.34.2—7.34.6, is 
physically minus the rate of change of the coordinate tensor with respect to a prescribed change in the 
coordinates, equation (7.124). The change in coordinates should be understood as leaving the spacetime 
itself and physical quantities within it unchanged. 

Let the coordinates $x^\mu$ be changed by an infinitesimal amount $\epsilon$ with a prescribed shape $\xi^\mu(x)$ as a function
of spacetime,


$$ x^\mu \rightarrow x'^\mu = x^\mu + \epsilon \, \xi^\mu . \tag{7.124} $$


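The Lie derivative of a coordinate tensor $A^{\kappa\lambda\cdots}_{\mu\nu\cdots}$ is defined such that the change in the coordinate tensor
under the coordinate transformation (7.124) is given by $\epsilon$ times minus its Lie derivative, denoted $\mathcal{L}_\xi A^{\kappa\lambda\cdots}_{\mu\nu\cdots}$,


$$ A^{\kappa\lambda\cdots}_{\mu\nu\cdots}(x) \rightarrow A'^{\kappa\lambda\cdots}_{\mu\nu\cdots}(x) = A^{\kappa\lambda\cdots}_{\mu\nu\cdots}(x) - \epsilon \, \mathcal{L}_\xi A^{\kappa\lambda\cdots}_{\mu\nu\cdots} . \tag{7.125} $$

Equivalently,

$$ \mathcal{L}_\xi A^{\kappa\lambda\cdots}_{\mu\nu\cdots} = \lim_{\epsilon \rightarrow 0} \frac{A^{\kappa\lambda\cdots}_{\mu\nu\cdots}(x) - A'^{\kappa\lambda\cdots}_{\mu\nu\cdots}(x)}{\epsilon} . \tag{7.126} $$


The reason for the minus sign in the definition (7.125) of the Lie derivative is that, as will be seen below,
equation (7.151), the principal term in the expansion of the Lie derivative of a tensor $A^{\kappa\lambda\cdots}_{\mu\nu\cdots}$ in terms of ordinary
derivatives is just its directed derivative along the direction $\xi^\mu$,


$$ \mathcal{L}_\xi A^{\kappa\lambda\cdots}_{\mu\nu\cdots} = \xi^\pi \frac{\partial A^{\kappa\lambda\cdots}_{\mu\nu\cdots}}{\partial x^\pi} + \cdots . \tag{7.127} $$


As its name suggests, the Lie derivative acts like a derivative: it is linear, and it satisfies the Leibniz rule.
The Lie derivative is also a covariant derivative: the Lie derivative of a coordinate tensor is a coordinate
tensor. Whereas the usual covariant derivative of a tensor is a tensor of rank one higher, the Lie derivative
of a tensor is a tensor of the same rank. The Lie derivative can be expressed entirely in terms of coordinate
derivatives without any connection coefficients, or equivalently in terms of torsion-free covariant derivatives.
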

Concept question 7.18. What use is a Lie derivative? Answer. The general rule to remember is that 
the change in any object under an infinitesimal coordinate transformation is, by construction, (minus) its Lie 
derivative. A prominent application of the Lie derivative is in general relativistic perturbation theory, Chap- 
ter 26, where it is essential to distinguish between genuine physical perturbations of the spacetime geometry 
and perturbations associated with transformations of the coordinates. Another important application of the 
Lie derivative is to derive the general relativistic law of conservation of energy-momentum, §16.11.2. The 
conservation law is a consequence of symmetry of the general relativistic action under coordinate transformations.
Finally (the nominal motivation for introducing Lie derivatives here), if a spacetime possesses some
special symmetry under a coordinate transformation, then that symmetry may be expressed as the vanishing
of the Lie derivative of the metric with respect to the symmetry, equation (7.117). 


7.34.1 The difference between the covariant derivative and the Lie derivative 


The usual covariant derivative of a tensor $A$ (dropping indices for brevity) follows from the difference between
the tensor $A(x')$ evaluated at a shifted position $x'$, and the tensor $A(x)$ evaluated at the original position $x$
parallel-transported to the shifted position $x'$,


$$ D A \propto A(x') - A(x)_{\text{parallel-transported}} . \tag{7.128} $$


Now if the shift between $x'$ and $x$ is the result of an infinitesimal coordinate transformation, $x' = x + \epsilon \xi$,
then there is another object $A'(x')$ available, which is the tensor $A(x)$ transformed into the new (primed)
coordinate frame. The Lie derivative is the difference between the tensor $A(x')$ evaluated at a shifted position
$x'$, and the tensor $A(x)$ evaluated at the original position $x$, transformed into the new frame, and
parallel-transported to the shifted position $x'$,


$$ \mathcal{L} A \propto A(x') - A'(x')_{\text{parallel-transported}} . \tag{7.129} $$


Concept question 7.18 discusses the physical justification for this mathematical artifice. 


7.34.2 Lie derivative of a coordinate scalar


Under a coordinate transformation (7.124), a coordinate-frame scalar $\Phi(x)$ remains unchanged

$$ \Phi(x) \rightarrow \Phi'(x') = \Phi(x) . \tag{7.130} $$


Here the scalar $\Phi'(x')$ is evaluated at position $x'$, which is the same as the original physical position $x$ since
all that has changed is the coordinates, not the physical position. However, the Lie derivative gives the
change in a tensor evaluated at fixed coordinate position $x$, not at fixed physical position. The value of $\Phi'$
at $x$ is related to that at $x'$ by


$$ \Phi'(x) = \Phi'(x' - \epsilon \xi) = \Phi'(x') - \epsilon \, \xi^\kappa \frac{\partial \Phi'}{\partial x^\kappa} . \tag{7.131} $$


Since $\epsilon$ is a small quantity, and $\Phi'$ differs from $\Phi$ by a small quantity, the last term $\epsilon \, \xi^\kappa \partial\Phi'/\partial x^\kappa$ in
equation (7.131) can be replaced by $\epsilon \, \xi^\kappa \partial\Phi/\partial x^\kappa$ to linear order in $\epsilon$. Putting equations (7.130) and (7.131) together
shows that the coordinate scalar $\Phi$ changes under a coordinate transformation (7.124) as


$$ \Phi(x) \rightarrow \Phi'(x) = \Phi(x) - \epsilon \, \mathcal{L}_\xi \Phi , \tag{7.132} $$


where $\mathcal{L}_\xi \Phi$ is the Lie derivative of the scalar $\Phi$,


$$ \mathcal{L}_\xi \Phi = \xi^\kappa \frac{\partial \Phi}{\partial x^\kappa} \quad \text{a coordinate scalar} . \tag{7.133} $$
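
The construction of the scalar Lie derivative can be mimicked numerically in one dimension. The sketch below is an illustration under stated assumptions: the scalar $\Phi = \sin x$, the shape $\xi = x^2$, and all function names are arbitrary choices, not taken from the text. The transformed scalar is built by inverting the coordinate change (7.124), and the finite-$\epsilon$ difference quotient is compared with $\xi \, \partial\Phi/\partial x$.

```python
import math

def transformed_scalar(scalar, xi, eps, x_new, iterations=50):
    """Phi'(x_new), defined by Phi'(x') = Phi(x) with x' = x + eps * xi(x).
    The old coordinate x is recovered from x' by fixed-point iteration."""
    x_old = x_new
    for _ in range(iterations):
        x_old = x_new - eps * xi(x_old)
    return scalar(x_old)

def lie_derivative_numeric(scalar, xi, x, eps=1e-6):
    """Finite-eps version of (Phi(x) - Phi'(x)) / eps at fixed coordinate value x."""
    return (scalar(x) - transformed_scalar(scalar, xi, eps, x)) / eps

def shape(x):
    """Assumed shape xi(x) of the coordinate change."""
    return x * x

x = 0.8
numeric = lie_derivative_numeric(math.sin, shape, x)
exact = shape(x) * math.cos(x)        # xi dPhi/dx for Phi = sin x

print(numeric, exact)                 # agree up to terms of order eps
```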


7.34.3 Lie derivative of a contravariant coordinate vector


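A similar argument applies to coordinate vectors. Under an infinitesimal coordinate transformation (7.124),
a contravariant coordinate 4-vector $A^\mu(x)$ transforms in the usual way as


$$ A^\mu(x) \rightarrow A'^\mu(x') = A^\kappa(x) \frac{\partial x'^\mu}{\partial x^\kappa} = A^\mu(x) + \epsilon \, A^\kappa(x) \frac{\partial \xi^\mu}{\partial x^\kappa} . \tag{7.134} $$


As in the scalar case, the vector $A'^\mu(x')$ is evaluated at position $x'$, which is the same as the original physical
position since all that has changed is the coordinates, not the physical position. Again, the Lie derivative
gives the change in the vector evaluated at coordinate position $x$, not $x'$. The value of $A'^\mu$ at $x$ is related to
that at $x'$ by


$$ A'^\mu(x) = A'^\mu(x' - \epsilon \xi) = A'^\mu(x') - \epsilon \, \xi^\kappa \frac{\partial A'^\mu}{\partial x^\kappa} . \tag{7.135} $$


The last term $\epsilon \, \xi^\kappa \partial A'^\mu/\partial x^\kappa$ in equation (7.135) can be replaced by $\epsilon \, \xi^\kappa \partial A^\mu/\partial x^\kappa$ to linear order in the
infinitesimal parameter $\epsilon$. Putting equations (7.134) and (7.135) together shows that the coordinate 4-vector
$A^\mu$ changes under a coordinate transformation (7.124) as


$$ A^\mu(x) \rightarrow A'^\mu(x) = A^\mu(x) - \epsilon \, \mathcal{L}_\xi A^\mu , \tag{7.136} $$


where $\mathcal{L}_\xi A^\mu$ is the Lie derivative of the contravariant vector $A^\mu$,


$$ \mathcal{L}_\xi A^\mu = \xi^\kappa \frac{\partial A^\mu}{\partial x^\kappa} - A^\kappa \frac{\partial \xi^\mu}{\partial x^\kappa} \quad \text{a coordinate vector} . \tag{7.137} $$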


The ordinary partial derivatives in equation (7.137) can be replaced by torsion-free covariant derivatives
(the ° atop $\mathring{D}_\kappa$ is a reminder that it is the torsion-free covariant derivative),


$$ \mathcal{L}_\xi A^\mu = \xi^\kappa \mathring{D}_\kappa A^\mu - A^\kappa \mathring{D}_\kappa \xi^\mu \quad \text{a coordinate vector} . \tag{7.138} $$


The replacement by a torsion-free covariant derivative holds because the contribution $\mathring{\Gamma}^\mu_{\kappa\lambda} ( \xi^\kappa A^\lambda - A^\kappa \xi^\lambda )$
from the torsion-free coordinate connection vanishes, because the torsion-free connection is symmetric in its
last two indices, equation (2.56). Equation (7.138) holds, and the Lie derivative is a tensor, regardless of
whether torsion is present. An equivalent expression for the Lie derivative of a coordinate vector $A^\mu$ in terms
of torsion-full covariant derivatives $D_\kappa$ is


$$ \mathcal{L}_\xi A^\mu = \xi^\kappa D_\kappa A^\mu - A^\kappa D_\kappa \xi^\mu + A^\kappa \xi^\lambda S^\mu_{\kappa\lambda} \quad \text{a coordinate vector} , \tag{7.139} $$


where $S^\mu_{\kappa\lambda}$ is the torsion. The torsion term in equation (7.139) is just such as to cancel the torsion part of
the torsion-full covariant derivatives.


Exercise 7.19. Equivalence of expressions for the Lie derivative. Confirm that equations (7.137),
(7.138), and (7.139) are all equivalent.


7.34.4 Lie bracket


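If $A^\mu$ and $B^\mu$ are two contravariant coordinate vectors, then the Lie derivative with respect to $A^\mu$ of $B^\mu$ is
minus the Lie derivative with respect to $B^\mu$ of $A^\mu$,


$$ \mathcal{L}_A B^\mu = A^\kappa \frac{\partial B^\mu}{\partial x^\kappa} - B^\kappa \frac{\partial A^\mu}{\partial x^\kappa} = - \mathcal{L}_B A^\mu \quad \text{a coordinate vector} . \tag{7.140} $$


This antisymmetric property motivates defining the antisymmetric Lie bracket of two vectors $\boldsymbol{A} = \boldsymbol{e}_\mu A^\mu$
and $\boldsymbol{B} = \boldsymbol{e}_\mu B^\mu$ to be


$$ [\boldsymbol{A}, \boldsymbol{B}] \equiv \mathcal{L}_A \boldsymbol{B} = \boldsymbol{e}_\mu \mathcal{L}_A B^\mu = - [\boldsymbol{B}, \boldsymbol{A}] . \tag{7.141} $$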


The Lie bracket elevates the space of vectors on the manifold to a Lie algebra. 
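
As a concrete instance of the Lie bracket (7.140), consider the rotational Killing vectors (7.109) on the 2-sphere. The following sketch is illustrative, with hypothetical function names. It evaluates the bracket of the $x$ and $y$ rotations by central finite differences in the coordinates $\{\theta, \phi\}$ and finds minus the $z$ rotation, in line with the angular momentum commutation relation $[L_x, L_y] = i L_z$ (in units $\hbar = 1$) applied to the operators $i L_x$, $i L_y$, $i L_z$ of equations (7.108).

```python
import math

def about_x(p):
    """Killing vector of rotation about the x-axis, equation (7.109a): (xi^theta, xi^phi)."""
    theta, phi = p
    return (-math.sin(phi), -math.cos(theta) / math.sin(theta) * math.cos(phi))

def about_y(p):
    """Killing vector of rotation about the y-axis, equation (7.109b)."""
    theta, phi = p
    return (math.cos(phi), -math.cos(theta) / math.sin(theta) * math.sin(phi))

def lie_bracket(a, b, point, h=1e-5):
    """[a, b]^mu = a^k d_k b^mu - b^k d_k a^mu, equation (7.140), with the partial
    derivatives approximated by central differences of step h."""
    n = len(point)
    a0, b0 = a(point), b(point)
    bracket = []
    for mu in range(n):
        total = 0.0
        for k in range(n):
            up = list(point)
            dn = list(point)
            up[k] += h
            dn[k] -= h
            d_b = (b(up)[mu] - b(dn)[mu]) / (2.0 * h)
            d_a = (a(up)[mu] - a(dn)[mu]) / (2.0 * h)
            total += a0[k] * d_b - b0[k] * d_a
        bracket.append(total)
    return bracket

point = [1.1, 0.4]                                # sample (theta, phi)
xy = lie_bracket(about_x, about_y, point)
yx = lie_bracket(about_y, about_x, point)

print(xy)    # approximately [0, -1], that is, minus the z rotation e_phi
print(yx)    # antisymmetry: approximately [0, +1]
```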


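Exercise 7.20. Commutator of Lie derivatives.
1. Show that if $\boldsymbol{A}$, $\boldsymbol{B}$, and $\boldsymbol{C}$ are vectors, then the commutator of Lie derivatives of $\boldsymbol{C}$ is


$$ [\mathcal{L}_A, \mathcal{L}_B] \boldsymbol{C} = [[\boldsymbol{A}, \boldsymbol{B}], \boldsymbol{C}] . \tag{7.142} $$

2. Show that the commutator of Lie derivatives is the Lie derivative of the commutator,

$$ [\mathcal{L}_A, \mathcal{L}_B] = \mathcal{L}_{[A,B]} . \tag{7.143} $$


Solution.
1. This is an application of the Jacobi identity


$$ [\boldsymbol{A}, [\boldsymbol{B}, \boldsymbol{C}]] + [\boldsymbol{B}, [\boldsymbol{C}, \boldsymbol{A}]] + [\boldsymbol{C}, [\boldsymbol{A}, \boldsymbol{B}]] = 0 . \tag{7.144} $$

The commutator of Lie derivatives of $\boldsymbol{C}$ is


$$ [\mathcal{L}_A, \mathcal{L}_B] \boldsymbol{C} = \mathcal{L}_A ( \mathcal{L}_B \boldsymbol{C} ) - \mathcal{L}_B ( \mathcal{L}_A \boldsymbol{C} ) = [\boldsymbol{A}, [\boldsymbol{B}, \boldsymbol{C}]] - [\boldsymbol{B}, [\boldsymbol{A}, \boldsymbol{C}]] = [[\boldsymbol{A}, \boldsymbol{B}], \boldsymbol{C}] . \tag{7.145} $$


2. It is straightforward to check that equation (7.143) holds when acting on scalars. Equation (7.143) also
holds when acting on vectors, since the rightmost side of equation (7.145) is, from equation (7.141),
$\mathcal{L}_{[A,B]} \boldsymbol{C}$, so that for vectors $\boldsymbol{C}$,


$$ [\mathcal{L}_A, \mathcal{L}_B] \boldsymbol{C} = \mathcal{L}_{[A,B]} \boldsymbol{C} . \tag{7.146} $$

Since $\mathcal{L}_A$ and $\mathcal{L}_B$ satisfy the Leibniz rule, so also does their commutator $[\mathcal{L}_A, \mathcal{L}_B]$. It then follows that
equation (7.143) holds when acting on arbitrary products. Thus equation (7.143) holds acting on a
general tensor.


7.34.5 Lie derivative of a covariant coordinate vector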


Under a coordinate transformation (7.124), a covariant coordinate 4-vector $A_\mu(x)$ transforms in the usual
way as

\[ A_\mu(x) \rightarrow A'_\mu(x') = A_\kappa(x) \frac{\partial x^\kappa}{\partial x'^\mu} = A_\mu(x) - \epsilon A_\kappa(x) \frac{\partial \xi^\kappa}{\partial x^\mu} . \tag{7.147} \]

Again, the vector $A'_\mu(x')$ is evaluated at position $x'$, which is the same as the original physical position $x$
since all that has changed is the coordinates, not the physical position. And again, the Lie derivative gives
the change in the vector evaluated at coordinate position $x$, not the physical position $x'$. The value of $A'_\mu$ at
$x$ is related to that at $x'$ by

\[ A'_\mu(x) = A'_\mu(x' - \epsilon \xi) = A'_\mu(x') - \epsilon \xi^\kappa \frac{\partial A'_\mu}{\partial x^\kappa} , \tag{7.148} \]

and again, the last term $\epsilon \xi^\kappa \partial A'_\mu / \partial x^\kappa$ in equation (7.148) can be replaced by
$\epsilon \xi^\kappa \partial A_\mu / \partial x^\kappa$ to linear order in $\epsilon$. Putting equations (7.147) and (7.148) together shows
that the covariant coordinate 4-vector $A_\mu$ changes under a coordinate transformation (7.124) as

\[ A_\mu(x) \rightarrow A'_\mu(x) = A_\mu(x) - \epsilon \mathcal{L}_\xi A_\mu , \tag{7.149} \]

where $\mathcal{L}_\xi A_\mu$ is the Lie derivative of the covariant vector $A_\mu$,

\[ \mathcal{L}_\xi A_\mu = \xi^\kappa \frac{\partial A_\mu}{\partial x^\kappa} + A_\kappa \frac{\partial \xi^\kappa}{\partial x^\mu} \quad \text{a coordinate vector} . \tag{7.150} \]

As in the Lie derivative of a contravariant vector, equation (7.138), the coordinate derivatives in the Lie
derivative (7.150) of a covariant vector can be replaced by torsion-free covariant derivatives.
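A quick consistency check on the sign of the second term in equation (7.150) is sketched below. It assumes that the Lie derivative of a contravariant vector has the standard form $\mathcal{L}_\xi B^\mu = \xi^\kappa \partial B^\mu / \partial x^\kappa - B^\kappa \partial \xi^\mu / \partial x^\kappa$ (a single $-\partial\xi$ term for the contravariant index), and that the Lie derivative obeys the Leibniz rule. The vector $B^\mu$ is introduced here purely for illustration. With those assumptions, the Lie derivative of the scalar $A_\mu B^\mu$ should reduce to its directional derivative along $\xi$, and it does, because the two $\partial\xi$ terms cancel after relabelling dummy indices:

```latex
\begin{aligned}
\mathcal{L}_\xi ( A_\mu B^\mu )
&= ( \mathcal{L}_\xi A_\mu ) B^\mu + A_\mu ( \mathcal{L}_\xi B^\mu ) \\
&= \left( \xi^\kappa \frac{\partial A_\mu}{\partial x^\kappa}
        + A_\kappa \frac{\partial \xi^\kappa}{\partial x^\mu} \right) B^\mu
 + A_\mu \left( \xi^\kappa \frac{\partial B^\mu}{\partial x^\kappa}
        - B^\kappa \frac{\partial \xi^\mu}{\partial x^\kappa} \right) \\
&= \xi^\kappa \frac{\partial ( A_\mu B^\mu )}{\partial x^\kappa} .
\end{aligned}
```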


7.34.6 Lie derivative of a coordinate tensor 


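In general, the Lie derivative of a coordinate tensor $A^{\kappa\lambda...}_{\mu\nu...}$ is defined by

\[
\mathcal{L}_\xi A^{\kappa\lambda...}_{\mu\nu...}
= \xi^\pi \frac{\partial A^{\kappa\lambda...}_{\mu\nu...}}{\partial x^\pi}
+ A^{\kappa\lambda...}_{\pi\nu...} \frac{\partial \xi^\pi}{\partial x^\mu}
+ A^{\kappa\lambda...}_{\mu\pi...} \frac{\partial \xi^\pi}{\partial x^\nu}
+ ...
- A^{\pi\lambda...}_{\mu\nu...} \frac{\partial \xi^\kappa}{\partial x^\pi}
- A^{\kappa\pi...}_{\mu\nu...} \frac{\partial \xi^\lambda}{\partial x^\pi}
- ...
\quad \text{a coordinate tensor} ,
\tag{7.151}
\]

with an overall $\partial A$ term, and a $+\partial\xi$ term for each covariant index and a $-\partial\xi$ term for each contravariant
index. As in the Lie derivative of a vector, equation (7.138), the coordinate derivatives in the Lie derivative (7.151)
of a tensor can be replaced by torsion-free covariant derivatives,

\[
\mathcal{L}_\xi A^{\kappa\lambda...}_{\mu\nu...}
= \xi^\pi \mathring{D}_\pi A^{\kappa\lambda...}_{\mu\nu...}
+ A^{\kappa\lambda...}_{\pi\nu...} \mathring{D}_\mu \xi^\pi
+ A^{\kappa\lambda...}_{\mu\pi...} \mathring{D}_\nu \xi^\pi
+ ...
- A^{\pi\lambda...}_{\mu\nu...} \mathring{D}_\pi \xi^\kappa
- A^{\kappa\pi...}_{\mu\nu...} \mathring{D}_\pi \xi^\lambda
- ...
\quad \text{a coordinate tensor} .
\tag{7.152}
\]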


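Equivalently, in terms of torsion-full covariant derivatives,

\[
\begin{aligned}
\mathcal{L}_\xi A^{\kappa\lambda...}_{\mu\nu...}
= {} & \xi^\pi D_\pi A^{\kappa\lambda...}_{\mu\nu...}
+ A^{\kappa\lambda...}_{\pi\nu...} D_\mu \xi^\pi
+ A^{\kappa\lambda...}_{\mu\pi...} D_\nu \xi^\pi
+ ...
- A^{\pi\lambda...}_{\mu\nu...} D_\pi \xi^\kappa
- A^{\kappa\pi...}_{\mu\nu...} D_\pi \xi^\lambda
- ... \\
& + \left( A^{\kappa\lambda...}_{\pi\nu...} S^\pi_{\mu\rho}
+ A^{\kappa\lambda...}_{\mu\pi...} S^\pi_{\nu\rho}
+ ...
- A^{\pi\lambda...}_{\mu\nu...} S^\kappa_{\pi\rho}
- A^{\kappa\pi...}_{\mu\nu...} S^\lambda_{\pi\rho}
- ... \right) \xi^\rho
\quad \text{a coordinate tensor} .
\end{aligned}
\tag{7.153}
\]

Exercise 7.21. Lie derivative of the metric. What is the Lie derivative of the metric tensor $g_{\mu\nu}$ along
the direction $\xi^\kappa$?
Solution. The Lie derivative of the metric $g_{\mu\nu}$ along $\xi^\kappa$ is

\[
\begin{aligned}
\mathcal{L}_\xi g_{\mu\nu}
&= \xi^\kappa \frac{\partial g_{\mu\nu}}{\partial x^\kappa}
 + g_{\kappa\nu} \frac{\partial \xi^\kappa}{\partial x^\mu}
 + g_{\mu\kappa} \frac{\partial \xi^\kappa}{\partial x^\nu} \\
&= \frac{\partial \xi_\nu}{\partial x^\mu} + \frac{\partial \xi_\mu}{\partial x^\nu} - 2 \mathring{\Gamma}_{\kappa\mu\nu} \xi^\kappa \\
&= \mathring{D}_\mu \xi_\nu + \mathring{D}_\nu \xi_\mu ,
\end{aligned}
\tag{7.154}
\]

where $\mathring{\Gamma}_{\kappa\mu\nu}$ is the torsion-free coordinate-frame connection, equation (2.63), and $\mathring{D}_\mu$ is the torsion-free
covariant derivative.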
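The first line of equation (7.154) is easy to evaluate numerically. The sketch below is an illustration only: the helper names, the choice of the flat 2D metric in polar coordinates, and the three sample vector fields are all assumptions introduced here, and the derivatives are taken by central finite differences rather than analytically. It shows that the Lie derivative of this metric vanishes along a rotation and along a translation (both symmetries of flat space), but not along a radial shift.

```python
import math

def polar_metric(x):
    """Flat 2D metric in polar coordinates x = (r, phi): g = diag(1, r^2)."""
    r, phi = x
    return [[1.0, 0.0], [0.0, r * r]]

def rotation(x):
    """xi = d/dphi, components (xi^r, xi^phi)."""
    return [0.0, 1.0]

def radial_shift(x):
    """xi = d/dr; not a symmetry of the metric."""
    return [1.0, 0.0]

def translation_x(x):
    """Translation along Cartesian x, written in polar components."""
    r, phi = x
    return [math.cos(phi), -math.sin(phi) / r]

def shifted(x, k, h):
    """Copy of x with coordinate k displaced by h."""
    y = list(x)
    y[k] += h
    return y

def lie_derivative_metric(g, xi, x, h=1e-5):
    """First line of (7.154): xi^k d_k g_mv + g_kv d_m xi^k + g_mk d_v xi^k."""
    n = len(x)
    gx, xix = g(x), xi(x)
    dg, dxi = [], []  # dg[k][m][v] = d g_mv / d x^k ; dxi[k][m] = d xi^m / d x^k
    for k in range(n):
        gp, gm = g(shifted(x, k, h)), g(shifted(x, k, -h))
        xp, xm = xi(shifted(x, k, h)), xi(shifted(x, k, -h))
        dg.append([[(gp[m][v] - gm[m][v]) / (2 * h) for v in range(n)] for m in range(n)])
        dxi.append([(xp[m] - xm[m]) / (2 * h) for m in range(n)])
    result = [[0.0] * n for _ in range(n)]
    for m in range(n):
        for v in range(n):
            total = 0.0
            for k in range(n):
                total += xix[k] * dg[k][m][v]   # xi^k d_k g_mv
                total += gx[k][v] * dxi[m][k]   # g_kv d_m xi^k
                total += gx[m][k] * dxi[v][k]   # g_mk d_v xi^k
            result[m][v] = total
    return result

point = [2.0, 0.7]
for name, field in [("rotation", rotation), ("translation", translation_x), ("radial", radial_shift)]:
    print(name, lie_derivative_metric(polar_metric, field, point))
```

Finite differences keep the sketch free of any symbolic-algebra dependency, at the cost of small truncation and rounding errors in the printed components.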


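Exercise 7.22. Lie derivative of the inverse metric. Show that the Lie derivative of a Kronecker delta
is zero,

\[ \mathcal{L}_\xi \delta^\mu_\nu = 0 . \tag{7.155} \]

Show that the Lie derivative of the inverse metric tensor $g^{\mu\nu}$ is

\[ \mathcal{L}_\xi g^{\mu\nu} = - g^{\mu\kappa} g^{\nu\lambda} \mathcal{L}_\xi g_{\kappa\lambda} = - ( \mathring{D}^\mu \xi^\nu + \mathring{D}^\nu \xi^\mu ) . \tag{7.156} \]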

Exercise 7.23. Lie derivative of the metric determinant. Show that the Lie derivative of the metric
determinant is

\[ \mathcal{L}_\xi \ln |g| = g^{\mu\nu} \mathcal{L}_\xi g_{\mu\nu} = \xi^\kappa \frac{\partial \ln |g|}{\partial x^\kappa} + 2 \frac{\partial \xi^\kappa}{\partial x^\kappa} . \tag{7.157} \]

Solution. The first equality of equation (7.157) follows because a Lie derivative is a variation (with respect to
a coordinate transformation), and the variation of the determinant of any matrix is given by equation (2.77).
The second equality of equation (7.157) follows from the first line of equations (7.154).


Reissner-Nordström Black Hole


The Reissner-Nordström geometry, discovered independently by Hans Reissner (1916), Hermann Weyl (1917),
and Gunnar Nordström (1918), describes the unique spherically symmetric static solution for a black hole
with mass and electric charge in asymptotically flat spacetime.

As with the Schwarzschild geometry, the mathematics of the Reissner-Nordström geometry was understood
long before conceptual understanding emerged. The meaning of the Reissner-Nordström geometry was
eventually clarified by Graves and Brill (1960).


8.1 Reissner-Nordström metric


The Reissner-Nordström metric for a black hole of mass $M$ and electric charge $Q$ is, in geometric units
$c = G = 1$,

\[ ds^2 = - \Delta \, dt^2 + \Delta^{-1} dr^2 + r^2 do^2 , \tag{8.1} \]

where $\Delta(r)$ is the horizon function,

\[ \Delta = 1 - \frac{2M}{r} + \frac{Q^2}{r^2} . \tag{8.2} \]

The Reissner-Nordström metric (8.1) looks like the Schwarzschild metric (7.1) with the replacement

\[ M \rightarrow M(r) = M - \frac{Q^2}{2r} . \tag{8.3} \]

The quantity $M(r)$ in equation (8.3) has a coordinate-independent interpretation as the mass $M(r)$ interior
to radius $r$, which here is the mass $M$ at infinity, minus the mass in the electric field $E = Q/r^2$ outside $r$,

\[ \int_r^\infty \frac{E^2}{8\pi} \, 4\pi r^2 dr = \int_r^\infty \frac{Q^2}{8\pi r^4} \, 4\pi r^2 dr = \frac{Q^2}{2r} . \tag{8.4} \]


The units of $Q$ here are gaussian; in Heaviside units the electric field is $E = Q/(4\pi r^2)$, the energy density
is $E^2/2$, and the charge term in the horizon function would be $Q^2/(4\pi r^2)$. Equation (8.4) looks like a
Newtonian calculation of the energy in the electric field, but it turns out to be valid also in general relativity,
essentially because the radial electric field $E$ is unchanged by a Lorentz boost along the radial direction.
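The integral in equation (8.4) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names, the midpoint rule, and the sample values of $M$, $Q$ and $r$ are assumptions chosen for illustration. The substitution $u = 1/r'$ maps the infinite radial range onto a finite interval.

```python
import math

def field_energy_outside(r, Q, steps=10000):
    """Midpoint-rule estimate of the electric field energy outside radius r:
    the integral of (E^2 / 8 pi) 4 pi r'^2 dr' from r to infinity, with
    E = Q / r'^2 (gaussian units, c = G = 1)."""
    # Substitute u = 1/r' so that r < r' < infinity becomes 0 < u < 1/r.
    u_max = 1.0 / r
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        r_prime = 1.0 / u
        E = Q / r_prime**2
        energy_density = E**2 / (8.0 * math.pi)
        # |dr'| = du / u^2
        total += energy_density * 4.0 * math.pi * r_prime**2 * du / u**2
    return total

def interior_mass(r, M, Q):
    """Interior mass M(r) = M - Q^2 / (2 r), equation (8.3)."""
    return M - Q**2 / (2.0 * r)

M, Q, r = 1.0, 0.8, 2.0
print("field energy outside r:", field_energy_outside(r, Q), "vs Q^2/(2r):", Q**2 / (2.0 * r))
print("interior mass:", interior_mass(r, M, Q), "vs M - field energy:", M - field_energy_outside(r, Q))
```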

Real astronomical black holes probably have very little electric charge, because the Universe as a whole 
appears almost electrically neutral (and Maxwell’s equations in fact demand that the Universe in its entirety 
should be exactly electrically neutral), and a charged black hole would quickly neutralize itself. It would 
probably not neutralize itself completely, but have some small residual positive charge, because protons 
(positive charge) are more massive than electrons (negative charge), so it is slightly easier for protons than 
electrons to overcome a Coulomb barrier. 

Nevertheless, the Reissner-Nordström solution is of more than passing interest because its internal geometry
resembles that of the Kerr solution for a rotating black hole.


Concept question 8.1. Units of charge of a charged black hole. What is the charge Q in standard 
(either gaussian or SI) units? 


8.2 Energy-momentum tensor 


The Einstein tensor of the Reissner-Nordström metric (8.1) is diagonal, with elements given by

\[
G^\mu{}_\nu =
\begin{pmatrix}
G^t{}_t & 0 & 0 & 0 \\
0 & G^r{}_r & 0 & 0 \\
0 & 0 & G^\theta{}_\theta & 0 \\
0 & 0 & 0 & G^\phi{}_\phi
\end{pmatrix}
= 8\pi
\begin{pmatrix}
-\rho & 0 & 0 & 0 \\
0 & p_r & 0 & 0 \\
0 & 0 & p_\perp & 0 \\
0 & 0 & 0 & p_\perp
\end{pmatrix}
= \frac{Q^2}{r^4}
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{8.5}
\]

The trick of writing one index up and the other down on the Einstein tensor $G^\mu{}_\nu$ partially cancels the
distorting effect of the metric, yielding the proper energy density $\rho$, the proper radial pressure $p_r$, and
transverse pressure $p_\perp$, up to factors of $\pm 1$. A more systematic way to extract proper quantities is to work
in the tetrad formalism, Chapter 11.
The energy-momentum tensor is that of a radial electric field

\[ \rho = - p_r = p_\perp = \frac{Q^2}{8\pi r^4} . \tag{8.6} \]

Notice that the radial pressure $p_r$ is negative, while the transverse pressure $p_\perp$ is positive. It is no coincidence
that the sum of the energy density and pressures is twice the energy density, $\rho + p_r + 2 p_\perp = 2\rho$.

The negative pressure, or tension, of the radial electric field produces a gravitational repulsion that dominates
at small radii, and that is responsible for much of the strange phenomenology of the Reissner-Nordström
geometry. The gravitational repulsion mimics the centrifugal repulsion inside a rotating black hole, for which
reason the Reissner-Nordström geometry is often used as a surrogate for the rotating Kerr-Newman geometry.

At this point, the statements that the energy-momentum tensor is that of a radial electric field, and that
the radial tension produces a gravitational repulsion that dominates at small radii, are true but unjustified
assertions.


8.3 Weyl tensor


As with the Schwarzschild geometry (indeed, any spherically symmetric geometry), only 1 of the 10 independent
spin components of the Weyl tensor is non-vanishing, the real spin-0 component, the Weyl scalar
$C$. The Weyl scalar for the Reissner-Nordström geometry is

\[ C = - \frac{M}{r^3} + \frac{Q^2}{r^4} . \tag{8.7} \]

The Weyl scalar goes to infinity at zero radius,

\[ C \rightarrow \infty \quad \text{as } r \rightarrow 0 , \tag{8.8} \]

signalling the presence of a genuine singularity at zero radius, where the curvature, the tidal force, diverges.


8.4 Horizons 


The Reissner-Nordström geometry has not one but two horizons. The horizons occur where an object at rest
in the geometry, $dr = d\theta = d\phi = 0$, follows a null geodesic, $ds^2 = 0$, which occurs where the horizon function
$\Delta$, equation (8.2), vanishes,

\[ \Delta = 0 . \tag{8.9} \]

This is a quadratic equation in $r$, and it has two solutions, an outer horizon $r_+$ and an inner horizon $r_-$,

\[ r_\pm = M \pm \sqrt{M^2 - Q^2} . \tag{8.10} \]

It is straightforward to check that the Reissner-Nordström time coordinate $t$ is timelike outside the outer
horizon, $r > r_+$, spacelike between the horizons, $r_- < r < r_+$, and again timelike inside the inner horizon,
$r < r_-$. Conversely, the radial coordinate $r$ is spacelike outside the outer horizon, $r > r_+$, timelike between
the horizons, $r_- < r < r_+$, and spacelike inside the inner horizon, $r < r_-$.
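As a small numerical illustration of equations (8.2) and (8.10), the sketch below computes the two horizon radii and checks the sign of the horizon function in the three regions. The function names and the sample charge $Q = 0.8M$ are assumptions made for the example. The sign of $\Delta$ tracks the character of the coordinates: $t$ is timelike where $\Delta > 0$ and spacelike where $\Delta < 0$.

```python
import math

def horizon_function(r, M, Q):
    """Horizon function Delta(r) = 1 - 2M/r + Q^2/r^2, equation (8.2)."""
    return 1.0 - 2.0 * M / r + Q**2 / r**2

def horizons(M, Q):
    """Return (r_minus, r_plus) from equation (8.10), or None if M^2 < Q^2."""
    disc = M**2 - Q**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return M - root, M + root

M, Q = 1.0, 0.8
r_minus, r_plus = horizons(M, Q)
print("r_minus =", r_minus, " r_plus =", r_plus)

# Sample one radius in each region and report the sign of Delta.
for label, r in [("outside outer horizon", 2.0 * r_plus),
                 ("between horizons", 0.5 * (r_minus + r_plus)),
                 ("inside inner horizon", 0.5 * r_minus)]:
    d = horizon_function(r, M, Q)
    print(label, "Delta =", d, "-> t is", "timelike" if d > 0 else "spacelike")
```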

The physical meaning of this strange behaviour is akin to that of the Schwarzschild geometry. As in the 
Schwarzschild geometry, outside the outer horizon space is falling at less than the speed of light; at the outer 
horizon space hits the speed of light; and inside the outer horizon space is falling faster than light. But a new 
ingredient appears. The gravitational repulsion caused by the negative pressure of the electric field slows 
down the flow of space, so that it slows back down to the speed of light at the inner horizon. Inside the inner 
horizon space is falling at less than the speed of light. 


8.5 Gullstrand-Painlevé metric 


Deeper insight into the Reissner-Nordström geometry comes from examining its Gullstrand-Painlevé metric.

Figure 8.1 Depiction of the Gullstrand-Painlevé metric for the Reissner-Nordström geometry, for a black hole of charge
Q = 0.96M. The Gullstrand-Painlevé line-element defines locally inertial frames attached to observers who free-fall
radially from zero velocity at infinity. Frames fall at less than the speed of light outside the outer horizon, hit the
speed of light at the outer horizon, and fall faster than light in the black hole region inside the outer horizon. The
gravitational attraction from the mass of the black hole is counteracted by a gravitational repulsion produced by the
tension (negative radial pressure) of the electric field. The repulsion grows stronger at smaller radii, slowing the inflow.
The inflow slows back down to the speed of light at the inner horizon, comes to a halt at the turnaround radius, turns
around, and accelerates outward. Now moving outward, the flow hits the speed of light at the inner horizon, and passes
outward through the inner horizon into a new region of spacetime, a white hole, where frames are moving outward
faster than light. The repulsion from the tension of the electric field weakens at larger radii, slowing the outflow. The
outflow drops back down to the speed of light at the outer horizon of the white hole, and exits the outer horizon into
a new piece of spacetime.

The Gullstrand-Painlevé metric for the Reissner-Nordström geometry has the same form as that for the
Schwarzschild geometry,

\[ ds^2 = - dt_{\rm ff}^2 + ( dr - \beta \, dt_{\rm ff} )^2 + r^2 do^2 . \tag{8.11} \]

The velocity $\beta$ is again the escape velocity, but this is now

\[ \beta = \mp \sqrt{\frac{2 M(r)}{r}} , \tag{8.12} \]

where $M(r) = M - Q^2/2r$ is the interior mass already given as equation (8.3). Horizons occur where the
magnitude of the velocity $\beta$ equals the speed of light


\[ |\beta| = 1 , \tag{8.13} \]

which happens at the outer and inner horizons $r = r_+$ and $r = r_-$, equation (8.10).

The Gullstrand-Painlevé metric once again paints the picture of space falling into the black hole. Outside
the outer horizon $r_+$ space falls at less than the speed of light, at the horizon space falls at the speed of
light, and inside the horizon space falls faster than light. But the gravitational repulsion produced by the
tension of the radial electric field starts to slow down the inflow of space, so that the infall velocity reaches
a maximum at $r = Q^2/M$. The infall slows back down to the speed of light at the inner horizon $r_-$. Inside
the inner horizon, the flow of space slows all the way to zero velocity, $\beta = 0$, at the turnaround radius

\[ r_0 = \frac{Q^2}{2M} . \tag{8.14} \]

Space then turns around, the velocity $\beta$ becoming positive, and accelerates back up to the speed of light.
Space is now accelerating outward, to larger radii $r$. The outfall velocity reaches the speed of light at the
inner horizon $r_-$, but now the motion is outward, not inward. Passing back out through the inner horizon,
space is falling outward faster than light. This is not the black hole, but an altogether new piece of spacetime,
a white hole. The white hole looks like a time-reversed black hole. As space falls outward, the gravitational
repulsion produced by the tension of the radial electric field declines, and the outflow slows. The outflow
slows back to the speed of light at the outer horizon $r_+$ of the white hole. Outside the outer horizon of the
white hole is a new universe, where once again space is flowing at less than the speed of light.

What happens inward of the turnaround radius $r_0$, equation (8.14)? Inside this radius the interior mass
$M(r)$, equation (8.3), is negative, and the velocity $\beta$ is imaginary. The interior mass $M(r)$ diverges to negative
infinity towards the central singularity at $r \rightarrow 0$. The singularity is timelike, and infinitely gravitationally
repulsive, unlike the central singularity of the Schwarzschild geometry. Is it physically realistic to have a
singularity that has infinite negative mass and is infinitely gravitationally repulsive? Undoubtedly not.
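The landmarks of the flow described above follow directly from $\beta^2 = 2M(r)/r$. The sketch below tabulates them for the charge $Q = 0.96M$ used in Figure 8.1. The function names and the variable names `r_peak` and `r_turn` are assumptions introduced for illustration.

```python
import math

def interior_mass(r, M, Q):
    """Interior mass M(r) = M - Q^2 / (2 r), equation (8.3)."""
    return M - Q**2 / (2.0 * r)

def beta_squared(r, M, Q):
    """Square of the velocity, beta^2 = 2 M(r) / r, from equation (8.12).
    A negative value means that beta is imaginary."""
    return 2.0 * interior_mass(r, M, Q) / r

M, Q = 1.0, 0.96
root = math.sqrt(M**2 - Q**2)
r_plus, r_minus = M + root, M - root   # horizons, equation (8.10)
r_peak = Q**2 / M                      # radius of maximum infall speed
r_turn = Q**2 / (2.0 * M)              # turnaround radius, equation (8.14)

for label, r in [("outer horizon", r_plus),
                 ("maximum infall speed", r_peak),
                 ("inner horizon", r_minus),
                 ("turnaround", r_turn),
                 ("inside turnaround", 0.5 * r_turn)]:
    print(f"{label:22s} r = {r:.4f}  beta^2 = {beta_squared(r, M, Q):+.4f}")
```

Working with $\beta^2$ rather than $\beta$ avoids taking the square root of a negative number inside the turnaround radius.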


8.6 Radial null geodesics 
In Reissner-Nordström coordinates, light rays that fall radially ($d\theta = d\phi = 0$) follow

\[ \frac{dr}{dt} = \pm \Delta . \tag{8.15} \]

Equation (8.15) shows that $dr/dt \rightarrow 0$ as $r \rightarrow r_\pm$, suggesting that null rays can never cross a horizon. As in
the Schwarzschild geometry, this is an artefact of the choice of coordinate system. As in the Schwarzschild
geometry, the Reissner-Nordström metric (8.1) appears singular at the horizons, where $\Delta = 0$, but this is a
coordinate singularity, not a true singularity, as is evident from the fact that the Riemann curvature tensor
remains finite at the horizons.
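The apparent freezing can be made quantitative by integrating equation (8.15) for an infalling ray. The sketch below is an illustration under stated assumptions: the function name, the starting radius, the charge $Q = 0.8M$, and the midpoint quadrature are all choices made here. It computes the coordinate time $t$ a light ray takes to fall from a starting radius to just outside the outer horizon, and shows that $t$ grows without bound, logarithmically, as the end point approaches $r_+$.

```python
import math

def coordinate_time_to_reach(r_start, eps, M=1.0, Q=0.8, steps=20000):
    """Coordinate time t for a radially infalling light ray, dr/dt = -Delta,
    to travel from r_start down to r_plus * (1 + eps)."""
    root = math.sqrt(M**2 - Q**2)
    r_plus, r_minus = M + root, M - root
    r_end = r_plus * (1.0 + eps)
    # Substitute x = ln(r - r_plus), so the integrand stays smooth near the horizon:
    # dt = dr / Delta = r^2 / (r - r_minus) dx, since Delta = (r - r_plus)(r - r_minus) / r^2.
    x_lo = math.log(r_end - r_plus)
    x_hi = math.log(r_start - r_plus)
    dx = (x_hi - x_lo) / steps
    total = 0.0
    for i in range(steps):
        x = x_lo + (i + 0.5) * dx
        r = r_plus + math.exp(x)
        total += r**2 / (r - r_minus) * dx
    return total

for eps in (1e-2, 1e-4, 1e-6):
    print("end point r_plus * (1 +", eps, "): t =", coordinate_time_to_reach(3.0, eps))
```

Each factor of 100 reduction in the distance to the horizon adds roughly the same increment to $t$, which is the signature of a logarithmic divergence.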

Figure 8.2 shows a spacetime diagram of the Reissner-Nordström geometry in Reissner-Nordström coordinates.
The spacetime diagram illustrates the apparent freezing of infalling and outgoing null geodesics at
both outer and inner horizons.

Figure 8.2 Spacetime diagram of the Reissner-Nordström geometry, in Reissner-Nordström coordinates, for a black
hole of charge Q = 0.8M, plotted in units of the outer horizon radius $r_+$ of the black hole. The geometry has two
horizons (pink), an outer horizon, and an inner horizon at $r_- = 0.25 r_+$. The more or less diagonal lines (black) are
outgoing and infalling null geodesics. The outgoing and infalling null geodesics appear not to cross the horizon, but
this is an artefact of the Reissner-Nordström coordinate system.


8.7 Finkelstein coordinates 


Finkelstein and Kruskal-Szekeres coordinates can be constructed for the Reissner-Nordström geometry just
as in the Schwarzschild geometry.
Introduce the tortoise coordinate $r^*$ defined by

\[ r^* \equiv \int \frac{dr}{\Delta} = r + \frac{1}{2\kappa_+} \ln \left| 1 - \frac{r}{r_+} \right| + \frac{1}{2\kappa_-} \ln \left| 1 - \frac{r}{r_-} \right| , \tag{8.16} \]

where $\kappa_\pm$ are the surface gravities at the two horizons

\[ \kappa_\pm = \pm \frac{r_+ - r_-}{2 r_\pm^2} . \tag{8.17} \]

Radially infalling and outgoing null geodesics follow

\[
\begin{aligned}
r^* + t &= \text{constant} \quad \text{infalling} , \\
r^* - t &= \text{constant} \quad \text{outgoing} .
\end{aligned}
\tag{8.18}
\]

Finkelstein time $t_F$ is defined by

\[ t_F + r = t + r^* , \tag{8.19} \]

which is constructed so that infalling null rays follow $t_F + r = \text{constant}$. Figure 8.3 shows the Finkelstein spacetime
diagram of the Reissner-Nordström geometry.

Figure 8.3 Finkelstein spacetime diagram of the Reissner-Nordström geometry, for a black hole of charge Q = 0.8M,
plotted in units of the outer horizon radius $r_+$ of the black hole. The Finkelstein time coordinate $t_F$ is constructed so
that radially infalling light rays are at 45°.
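The closed form of the tortoise coordinate can be checked against its defining property $dr^*/dr = 1/\Delta$. The sketch below assumes the forms $r^* = r + \ln|1 - r/r_+|/(2\kappa_+) + \ln|1 - r/r_-|/(2\kappa_-)$ and $\kappa_\pm = \pm (r_+ - r_-)/(2 r_\pm^2)$ for equations (8.16) and (8.17). The function names and the finite-difference check are illustrative choices, not part of the text.

```python
import math

def horizons_and_gravities(M, Q):
    """Return (r_plus, r_minus, kappa_plus, kappa_minus), equations (8.10) and (8.17)."""
    root = math.sqrt(M**2 - Q**2)
    r_plus, r_minus = M + root, M - root
    kappa_plus = (r_plus - r_minus) / (2.0 * r_plus**2)
    kappa_minus = -(r_plus - r_minus) / (2.0 * r_minus**2)
    return r_plus, r_minus, kappa_plus, kappa_minus

def horizon_function(r, M, Q):
    """Delta(r) = 1 - 2M/r + Q^2/r^2, equation (8.2)."""
    return 1.0 - 2.0 * M / r + Q**2 / r**2

def tortoise(r, M, Q):
    """Tortoise coordinate r*, equation (8.16)."""
    r_plus, r_minus, kappa_plus, kappa_minus = horizons_and_gravities(M, Q)
    return (r
            + math.log(abs(1.0 - r / r_plus)) / (2.0 * kappa_plus)
            + math.log(abs(1.0 - r / r_minus)) / (2.0 * kappa_minus))

def tortoise_slope(r, M, Q, h=1e-6):
    """Central finite-difference estimate of dr*/dr."""
    return (tortoise(r + h, M, Q) - tortoise(r - h, M, Q)) / (2.0 * h)

M, Q = 1.0, 0.8
# One sample radius in each of the three regions separated by the horizons.
for r in (3.0, 1.0, 0.2):
    print("r =", r, " dr*/dr =", tortoise_slope(r, M, Q), " 1/Delta =", 1.0 / horizon_function(r, M, Q))
```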


8.8 Kruskal-Szekeres coordinates 


With respect to the coordinates $t$ and $r^*$, the Reissner-Nordström line-element is

\[ ds^2 = \Delta \left( - dt^2 + dr^{*2} \right) + r^2 do^2 . \tag{8.20} \]

This metric is still ill-behaved at the horizons, where $\Delta = 0$ and where the tortoise coordinate $r^*$ diverges
logarithmically, with $r^* \rightarrow -\infty$ as $r \rightarrow r_+$ and $r^* \rightarrow +\infty$ as $r \rightarrow r_-$. The misbehaviour at the two horizons
can be removed by transforming to Kruskal coordinates $r_K$ and $t_K$ defined by

\[ r_K + t_K = f(r^* + t) , \tag{8.21a} \]
\[ r_K - t_K = s f(r^* - t) + 2 n k , \tag{8.21b} \]

where the function $f(z)$ is

\[
f(z) \equiv
\begin{cases}
\dfrac{e^{\kappa_+ z}}{\kappa_+} & z \le 0 , \\[2ex]
\dfrac{e^{\kappa_- z}}{\kappa_-} + k & z \ge 0 ,
\end{cases}
\tag{8.22}
\]

which varies from $f(z) \rightarrow 0$ as $z \rightarrow -\infty$, to $f(z) \rightarrow k$ as $z \rightarrow +\infty$, and is continuous and differentiable at
the junction $z = 0$. The constant $k$ is

\[ k \equiv \frac{1}{\kappa_+} - \frac{1}{\kappa_-} = 2 \, \frac{r_+^2 + r_-^2}{r_+ - r_-} . \tag{8.23} \]

Figure 8.4 Kruskal spacetime diagram of the Reissner-Nordström geometry, plotted in units of $k$, equation (8.23),
for a black hole of charge Q = 0.96M. The Kruskal coordinates $t_K$ and $r_K$ are defined by equations (8.21), and are
constructed so that radially infalling and outgoing light rays are at 45°. Lines of constant Reissner-Nordström time $t$
(violet), and infalling and outgoing null lines (black) are spaced uniformly at intervals of 1 (units $r_+ = 1$), while lines
of constant circumferential radius $r$ (blue) are spaced uniformly in the tortoise coordinate $r^*$, equation (8.16), so that
the intersections of $t$ and $r$ lines are also intersections of infalling and outgoing null lines.

Figure 8.5 Kruskal spacetime diagram of the analytically extended Reissner-Nordström geometry, plotted in units of
$k$, equation (8.23), for a black hole of charge Q = 0.96M.

The constants $s$ and $n$ in equation (8.21b) are a sign and an integer that fix the sign and offset of the Kruskal
coordinates in each quadrant of the Kruskal diagram. Figure 8.4 shows the resulting Kruskal spacetime
diagram, containing three quadrants, a region outside the outer horizon, a region between the two horizons,
and a region inside the inner horizon. The integers $\{s, n\}$ in the three quadrants are $\{1, 0\}$ in the region
outside the outer horizon, $\{-1, 0\}$ in the region between the two horizons, and $\{1, -1\}$ in the region inside
the inner horizon.
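The stated properties of the function $f(z)$ can be verified numerically. The sketch below assumes the piecewise form $f(z) = e^{\kappa_+ z}/\kappa_+$ for $z \le 0$ and $f(z) = e^{\kappa_- z}/\kappa_- + k$ for $z \ge 0$, with $k = 1/\kappa_+ - 1/\kappa_-$. That is the form for which $f$ is continuous with unit slope on both sides of $z = 0$, and tends to $0$ and $k$ at $z \rightarrow \mp\infty$. The factory function `kruskal_f` is a hypothetical name used only for this example.

```python
import math

def kruskal_f(M, Q):
    """Build the function f(z) of equation (8.22) for given M and Q (|Q| < M).
    Returns (f, k, r_plus, r_minus)."""
    root = math.sqrt(M**2 - Q**2)
    r_plus, r_minus = M + root, M - root
    kappa_plus = (r_plus - r_minus) / (2.0 * r_plus**2)      # positive
    kappa_minus = -(r_plus - r_minus) / (2.0 * r_minus**2)   # negative
    k = 1.0 / kappa_plus - 1.0 / kappa_minus                 # equation (8.23)

    def f(z):
        if z <= 0.0:
            return math.exp(kappa_plus * z) / kappa_plus
        return math.exp(kappa_minus * z) / kappa_minus + k

    return f, k, r_plus, r_minus

f, k, r_plus, r_minus = kruskal_f(1.0, 0.96)
h = 1e-6
print("k =", k, " closed form =", 2.0 * (r_plus**2 + r_minus**2) / (r_plus - r_minus))
print("jump at z = 0:", f(h) - f(-h))
print("left slope:", (f(0.0) - f(-h)) / h, " right slope:", (f(h) - f(0.0)) / h)
print("f(-500) =", f(-500.0), " f(+500) - k =", f(500.0) - k)
```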


The transformation (8.21) to Kruskal coordinates brings infinite time $t$ and radius $r$ to finite values, as in
a Penrose diagram. This is associated with the fact that the tortoise coordinate $r^*$ is $+\infty$ at both $r = \infty$ and
$r = r_-$, so any transformation of $r^* \pm t$ that maps the inner horizon $r_-$ to a finite coordinate also maps infinite
radius to a finite coordinate. It would be possible to allow $r_K$ to be infinite at infinite $r$, as in Schwarzschild,
by choosing different Kruskal coordinate transformations for the regions near the inner horizon and near
infinity, but it is advantageous to enforce the same transformation, since the Kruskal coordinate system can
then be extended analytically across both inner and outer horizons.

The Kruskal diagram 8.4 shows that the singularity of the Reissner-Nordström geometry is timelike, not
spacelike. This is associated with the fact that the singularity is gravitationally repulsive, not attractive.

The Penrose diagram of the Reissner-Nordström geometry is commonly drawn with the singularity vertical.
The singularity in the Kruskal diagram 8.4 is not vertical. It is possible to construct Kruskal-like coordinates
such that the singularity is vertical in the resulting spacetime diagram, for example by setting $\kappa_- = -\kappa_+$
in the Kruskal transformation formulae (8.22) and (8.23). However, the metric coefficients in $t_K$ and $r_K$ are
then zero, not finite, at the inner horizon. If the metric coefficients are required to be finite at both outer
and inner horizons, then it is impossible to construct a Kruskal coordinate transformation that makes the
singularity vertical.


8.9 Analytically extended Reissner-Nordström geometry


Like the Schwarzschild geometry, the Reissner-Nordström geometry can be analytically extended. Figure 8.5
shows the Kruskal spacetime diagram of the analytically extended geometry. The extension is considerably
more complicated than that for Schwarzschild, as discussed in the next section.


8.10 Penrose diagram 


Figure 8.6 shows a Penrose diagram of the analytic continuation of the Reissner-Nordström geometry. This
is essentially a schematic version of the Kruskal diagram 8.5, with the various parts of the geometry labelled.
The analytic continuation consists of an infinite ladder of universes and parallel universes connected to each
other by black hole → wormhole → white hole tunnels. I call the various pieces of spacetime “Universe,”
“Parallel Universe,” “Black Hole,” “Wormhole,” “Parallel Wormhole,” and “White Hole.” These pieces repeat
in an infinite ladder. The various horizons in the Penrose diagram are labelled with descriptive names.
Relativists tend to use more abstract terminology.

The Wormhole and Parallel Wormhole contain separate central singularities, the “Singularity” and the 
“Parallel Singularity,” which are oppositely charged. If the black hole is positively charged as measured by 
observers in the Universe, then it is negatively charged as measured by observers in the Parallel Universe, 
and the Wormhole contains a positive charge singularity while the Parallel Wormhole contains a negative 
charge singularity. 

Where does the electric charge of the Reissner-Nordström geometry “actually” reside? This comes down
to the question of how observers detect the presence of charge. Observers detect charge by the electric field
that it produces. Equip all (radially moving) observers with a gyroscope that they orient consistently in
the same radial direction, which can be taken to be towards the black hole as measured by observers in
the Universe. Observers in the Parallel Universe find that their gyroscope is pointed away from the black
hole. Inside the black hole, observers from either Universe agree that the gyroscope is pointed towards the
Wormhole, and away from the Parallel Wormhole. All observers agree that the electric field is pointed in the
same radial direction. Observers who end up inside the Wormhole measure an electric field that appears to
emanate from the Singularity, and which they therefore attribute to charge in the Singularity. Observers
who end up inside the Parallel Wormhole measure an electric field that appears to emanate in the opposite
direction from the Parallel Singularity, and which they therefore attribute to charge of opposite sign in the
Parallel Singularity. Strange, but all consistent.

Figure 8.6 Penrose diagram of the analytically extended Reissner-Nordström geometry.


8.11 Antiverse: Reissner-Nordström geometry with negative mass


It is also possible to consider the Reissner-Nordström geometry for negative values of the radius $r$. I call the
extension to negative $r$ the “Antiverse.” There is also a “Parallel Antiverse.”

Changing the sign of $r$ in the Reissner-Nordström metric (8.1) is equivalent to changing the sign of the
mass $M$. Thus the Reissner-Nordström metric with negative $r$ describes a charged black hole of negative
mass


M <0. (8.24) 


The negative mass black hole is gravitationally repulsive at all radii, and it has no horizons. 


8.12 Outgoing, ingoing 


The black hole in the Reissner-Nordström geometry has not one but two inner horizons. The inner horizon
plays a central role in the inflationary instability described in §8.13 below.

The inner horizons can be called outgoing and ingoing. Persons freely falling in the Black Hole region are
all moving inward in coordinate radius $r$, but they may be moving either forward or backward in
Reissner-Nordström coordinate time $t$. In the Black Hole region, the conserved energy along a geodesic is positive if
the time coordinate $t$ is decreasing, negative if the time coordinate $t$ is increasing.¹ Persons with positive
energy are ingoing, while persons with negative energy are outgoing. Both outgoing and ingoing persons
fall inward, to smaller radii, but outgoing persons think that the inward direction is towards the Parallel
Wormhole, while ingoing persons think that the inward direction is in the opposite direction, towards the
Wormhole. Outgoing persons fall through the outgoing inner horizon, while ingoing persons fall through the
ingoing inner horizon.

Coordinate time $t$ moves forwards in the Universe and Wormhole regions, and geodesics have positive
energy in these regions. Conversely, coordinate time $t$ moves backwards in the Parallel Universe and Parallel
Wormhole regions, and geodesics have negative energy in these regions. Of course, all observers, wherever
they may be, always perceive their own proper time to be moving forward in the usual fashion, at the rate
of one second per second.

¹ The fact that positive energy geodesics go backwards in Reissner-Nordström coordinate time $t$ in the Black Hole region is
counter-intuitive, but it does make sense. An outgoing infaller who fell through the horizon earlier can meet an ingoing
infaller who falls in later. Thus outgoers, who have negative energy, progress forward in time $t$, while ingoers, who have
positive energy, progress backward in time $t$.


8.13 The inflationary instability 


Roger Penrose (1968) first pointed out that a person passing through the outgoing inner horizon (also
called the Cauchy horizon) of the Reissner-Nordström geometry would see the outside Universe infinitely
blueshifted, and he suggested that this would destabilize the geometry. Perturbation theory calculations,
starting with Simpson & Penrose (1973) and culminating with Chandrasekhar and Hartle (1982), confirmed
that waves become infinitely blueshifted as they approach the outgoing inner horizon, and that their energy
density diverges. The perturbation theory calculations were widely construed as indicating that the
Reissner-Nordström geometry was “unstable,” although the precise nature of this instability remained obscure.

Figure 8.7 Penrose diagram illustrating why the Reissner-Nordström geometry is subject to the inflationary instability.
Outgoing and ingoing streams just outside the inner horizon must pass through separate outgoing and ingoing inner
horizons into causally separated pieces of spacetime where the timelike time coordinate $t$ goes in opposite directions. To
accomplish this, the outgoing and ingoing streams must exceed the speed of light through each other, which physically
they cannot do. The inflationary instability is driven by the pressure of the relativistic counter-streaming between
outgoing and ingoing streams. The inset shows the direction of coordinate time $t$ in the various regions. Proper time
of course always increases upward in a Penrose diagram.

It was not until a seminal paper by Poisson & Israel (1990) that the nonlinear nature of the instability at 
the inner horizon was clarified. Poisson & Israel showed that the Reissner-Nordström geometry is subject to
an exponentially growing instability which they dubbed mass inflation. The term refers to the fact that the 
interior mass M(r) grows exponentially during mass inflation. The interior mass M(r) has the property of 
being a gauge-invariant, scalar quantity, so it has a physical meaning independent of the coordinate system. 

What causes mass inflation? Actually it has nothing to do with mass: the inflating mass is just a symptom 
of the underlying cause. What causes mass inflation is relativistic counter-streaming between outgoing and 
ingoing streams. Since the name mass inflation can be misleading, I prefer to call it the inflationary 
instability. As the Penrose diagram of the Reissner-Nordström geometry shows, outgoing and ingoing
streams must drop through separate outgoing and ingoing inner horizons into separate pieces of spacetime, 
the Wormhole and the Parallel Wormhole. The regions of spacetime must be separate because coordinate time 
t is timelike in both regions, but going in opposite directions in the two regions, forward in the Wormhole, 
backward in the Parallel Wormhole, as illustrated in Figure 8.7. In other words, outgoing and ingoing streams 
cannot co-exist in the same subluminal region of spacetime because they would have to be moving in opposite 
directions in time, which cannot be. 

In the Reissner-Nordström geometry, outgoing and ingoing streams resolve their differences by exceeding
the speed of light relative to each other, and passing into causally separated regions. As the outgoing and 
ingoing streams drop through their respective inner horizons, they each see the other stream infinitely 
blueshifted. 

In reality however, this cannot occur: outgoing and ingoing streams cannot exceed the speed of light relative 
to each other. Instead, as the outgoing and ingoing streams move ever faster through each other in their 
effort to drop through the inner horizon, their counter-streaming generates a radial pressure. The pressure, 
which is positive, exerts an inward gravitational force. As the counter-streaming approaches the speed of 
light, the gravitational force produced by the counter-streaming pressure eventually exceeds the gravitational 
force produced by the background Reissner-Nordström geometry. At this point, the inflationary instability
begins. 

The gravitational force produced by the counter-streaming is inwards, but, in the strange way that general
relativity operates, the inward direction is in opposite directions for the outgoing and ingoing streams, towards
the black hole for the ingoing stream, and away from the black hole for the outgoing stream. Consequently the
counter-streaming pressure simply accelerates the outgoing and ingoing streams ever faster through each other. The
result is an exponential feedback instability. The increasing pressure accelerates the streams faster through
each other, which increases the pressure, which increases the acceleration.

The interior mass is not the only thing that increases exponentially during mass inflation. The proper
density and pressure, and the Weyl scalar (all gauge-invariant scalars) exponentiate together.

Exercise 8.2. Blueshift of a photon crossing the inner horizon of a Reissner-Nordström black
hole. Show that, in the Reissner-Nordström geometry, the blueshift of a photon with energy $v_t = \pm 1$ and
angular momentum per unit energy $v_\phi = J$ observed by an observer on a geodesic with energy per unit mass
$u_t = -E$ and angular momentum per unit mass $u_\phi = L$ is (the minus sign in $-u_\mu v^\mu$ makes the blueshift
positive)

\[ - u_\mu v^\mu = \text{something} . \tag{8.25} \]

Argue that the blueshift diverges at the horizon for outgoing observers observing ingoing photons, and for
ingoing observers observing outgoing photons.

Solution. The solution for geodesics is similar to that in the Schwarzschild geometry, Exercise 7.6. The
radial velocities $u^r$ and $v^r$ are both necessarily negative just above the inner horizon. The blueshift of a
photon is

\[
\begin{aligned}
- u_\mu v^\mu &= - \left( g^{tt} u_t v_t + g_{rr} u^r v^r + g^{\phi\phi} u_\phi v_\phi \right) \\
&= \frac{\pm E + \sqrt{E^2 - (1 + L^2/r^2) \Delta} \, \sqrt{1 - J^2 \Delta / r^2}}{- \Delta} - \frac{L J}{r^2} .
\end{aligned}
\tag{8.26}
\]

Note that $\Delta$ is negative between the outer and inner horizons. The sign of $\pm E$ is negative if $u_t$ and $v_t$
have the same sign, positive if $u_t$ and $v_t$ have opposite signs. The latter case holds for outgoing observers
observing ingoing photons, or for ingoing observers observing outgoing photons, in which case the blueshift
near the inner horizon, where $\Delta \rightarrow -0$, diverges as

\[ - u_\mu v^\mu \rightarrow \frac{2 u_t v_t}{\Delta} \quad \text{as } \Delta \rightarrow -0 \text{ if } u_t v_t < 0 . \tag{8.27} \]
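The divergence can be seen numerically. The sketch below evaluates the blueshift in the form $-u_\mu v^\mu = [-u_t v_t + \sqrt{E^2 - (1 + L^2/r^2)\Delta}\,\sqrt{1 - J^2\Delta/r^2}]/(-\Delta) - LJ/r^2$, which follows from the normalization of the observer and photon 4-velocities when both radial velocities are negative and the motion is in the equatorial plane. The function name, the parameter values, and the charge $Q = 0.8M$ are assumptions made for illustration. Just above the inner horizon the blueshift is large when $u_t v_t < 0$ and stays finite when $u_t v_t > 0$.

```python
import math

def blueshift(r, E, L, J, v_t, M=1.0, Q=0.8):
    """-u_mu v^mu for an observer with u_t = -E, u_phi = L and a photon with
    v_t = +1 or -1, v_phi = J, both with negative radial velocity, evaluated
    between the horizons where Delta < 0."""
    delta = 1.0 - 2.0 * M / r + Q**2 / r**2
    u_t = -E
    # Product |u^r| |v^r| from the normalizations u.u = -1 and v.v = 0.
    radial = (math.sqrt(E**2 - (1.0 + L**2 / r**2) * delta)
              * math.sqrt(1.0 - J**2 * delta / r**2))
    return (-u_t * v_t + radial) / (-delta) - L * J / r**2

M, Q = 1.0, 0.8
r_minus = M - math.sqrt(M**2 - Q**2)
E, L, J = 1.0, 0.5, 0.5

for eps in (1e-2, 1e-4, 1e-6):
    r = r_minus * (1.0 + eps)   # just above the inner horizon
    print("eps =", eps,
          " u_t v_t < 0:", blueshift(r, E, L, J, +1),
          " u_t v_t > 0:", blueshift(r, E, L, J, -1))
```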


8.14 The X point 


The point in the Reissner-Nordström geometry where the outgoing and ingoing inner horizons intersect,
the X point, is a special one. This is the point through which geodesics of zero energy, E = 0, must pass. 
Persons with zero energy who reach the X point see both outgoing and ingoing streams, coming from opposite 
directions, infinitely blueshifted. 


8.15 Extremal Reissner-Nordstr6m geometry 


So far the discussion of the Reissner-Nordström geometry has centred on the case $Q < M$ (or more generally,
$|Q| < |M|$) where there are separate outer and inner horizons. In the special case that the charge and mass
are equal,

$$Q = M , \tag{8.28}$$

the inner and outer horizons merge into one, $r_+ = r_-$, equation (8.10). This special case describes the
extremal Reissner-Nordström geometry.

Figure 8.8 Depiction of the Gullstrand-Painlevé metric for the extremal Reissner-Nordström geometry, with $Q = M$.
In the extremal geometry, the inner and outer horizons are at the same radius, so there is only one horizon.

The extremal Reissner-Nordström geometry is of particular interest in quantum gravity because its Hawking
temperature is zero, and in string theory because extremal black holes have a higher degree of symmetry,
making them more tractable for theoretical investigation.

Figure 8.8 shows the Gullstrand-Painlevé model of an extremal Reissner-Nordström black hole. It looks
like that of a non-extremal Reissner-Nordström black hole except that the two horizons merge into one. The
infall velocity $\beta$ into an extremal black hole reaches its maximum, the speed of light, at the horizon.

The Penrose diagram of the extremal Reissner-Nordström geometry, Figure 8.9, differs from that of the
standard Reissner-Nordström geometry in having no Black Hole, White Hole, or Parallel regions. The fact
that an extremal black hole differs topologically from a non-extremal black hole suggests that it would be
physically impossible by any causal mechanism to change a black hole from non-extremal to extremal.


Figure 8.9 Penrose diagram of the extremal Reissner-Nordström geometry.


8.16 Super-extremal Reissner-Nordström geometry


The Reissner-Nordström geometry with charge greater than mass,

$$Q > M , \tag{8.29}$$

has no horizons. The geometry is called super-extremal. The change in geometry from an extremal black
hole, with horizon at finite radius $r_+ = r_- = M$, to a super-extremal black hole without horizons is
discontinuous. This suggests that there is no way to pack a black hole with more charge than its mass.
Indeed, if you try to force additional charge into an extremal black hole, then the work needed to do so
increases its mass so that the charge $Q$ does not exceed its mass $M$.
Real fundamental particles nevertheless have charge far exceeding their mass. For example, the charge-to-mass
ratio of a proton is

$$\frac{e}{m_p} \approx 10^{18} , \tag{8.30}$$

where $e$ is the square root of the fine-structure constant $\alpha \equiv e^2/\hbar c \approx 1/137$, and $m_p \approx 10^{-19}$ is the mass
of the proton in Planck units. However, the Schwarzschild radius of such a fundamental particle is far tinier
than its Compton wavelength $\sim \hbar/m$ (or its classical radius $e^2/m = \alpha\hbar/m$), so quantum mechanics, not
general relativity, governs the structure of these fundamental particles.
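The orders of magnitude quoted for the proton can be reproduced with a few lines of arithmetic. This is a rough sketch, assuming Gaussian-style units in which $e^2 = \alpha\hbar c$ and using rounded CODATA-style values for the constants; the variable names are illustrative only.

```python
import math

# Physical constants in SI units (rounded reference values, assumed here)
HBAR = 1.054571817e-34      # reduced Planck constant, J s
C = 299792458.0             # speed of light, m/s
G = 6.67430e-11             # Newton's constant, m^3 kg^-1 s^-2
M_PROTON = 1.67262192369e-27  # proton mass, kg
ALPHA = 7.2973525693e-3     # fine-structure constant, about 1/137

planck_mass = math.sqrt(HBAR * C / G)   # kg
m_p = M_PROTON / planck_mass            # proton mass in Planck units
e = math.sqrt(ALPHA)                    # charge in Planck units, e^2 = alpha

charge_to_mass = e / m_p
schwarzschild_radius = 2.0 * m_p        # in Planck lengths
compton_wavelength = 1.0 / m_p          # in Planck lengths

print(f"m_p = {m_p:.3e}, e/m_p = {charge_to_mass:.3e}")
print(f"r_s / Compton = {schwarzschild_radius / compton_wavelength:.3e}")
```

The ratio of Schwarzschild radius to Compton wavelength comes out of order $10^{-38}$, which is the quantitative content of the statement that quantum mechanics, not general relativity, governs such particles.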


8.17 Reissner-Nordström geometry with imaginary charge


It is possible formally to consider the Reissner-Nordström geometry with imaginary charge $Q$,

$$Q^2 < 0 . \tag{8.31}$$


This is completely unphysical. If charge were imaginary, then electromagnetic energy would be negative.
However, the Reissner-Nordström metric with $Q^2 < 0$ is well-defined, and it is possible to calculate
geodesics in that geometry. What makes the geometry interesting is that the singularity, instead of being
gravitationally repulsive, becomes gravitationally attractive. Thus particles, instead of bouncing off the
singularity, are attracted to it, and it turns out to be possible to continue geodesics through the singularity.
Mathematically, the geometry can be considered as the Kerr-Newman geometry in the limit of zero spin. In
the Kerr-Newman geometry, geodesics can pass from positive to negative radius $r$, and the passage through
the singularity of the Reissner-Nordström geometry can be regarded as this process in the limit of zero spin.

Suffice to say that it is intriguing to see what it looks like to pass through the singularity of a charged
black hole of imaginary charge, however unrealistic. The Penrose diagram is even more eventful than that
for the usual Reissner-Nordström geometry.

Figure 8.10 Depiction of the Gullstrand-Painlevé metric for a super-extremal Reissner-Nordström geometry, with
$Q = 1.04M$. The super-extremal geometry has no horizons.

Figure 8.11 Penrose diagram of the Reissner-Nordström geometry with imaginary charge $Q$. If charge were imaginary,
then electromagnetic energy would be negative, which is completely unphysical. But the metric is well-defined, and
the spacetime is fun.


9 


Kerr-Newman Black Hole 


The geometry of a stationary, rotating, uncharged black hole in asymptotically flat empty space was discovered
unexpectedly by Roy Kerr in 1963 (Kerr, 1963). Kerr’s own account of the history of the discovery is
at Kerr (2009). You can read in that paper that the discovery was not mere chance: Kerr used sophisticated
mathematical methods to make it. The extension to a rotating electrically charged black hole was made
shortly thereafter by Ted Newman (Newman et al., 1965). Newman told me (private communication 2009)
that, after seeing Kerr’s work, he quickly realised that the extension to a charged black hole was straightforward.
He set the problem to the graduate students in his relativity class, who became coauthors of Newman
et al. (1965).

The importance of the Kerr-Newman geometry stems in part from the no-hair theorem, which states 
that this geometry is the unique end state of spacetime outside the horizon of an undisturbed black hole in 
asymptotically flat space. 


9.1 Boyer-Lindquist metric 


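The Boyer-Lindquist metric of the Kerr-Newman geometry is

$$ds^2 = -\frac{R^2\Delta}{\rho^2}\left(dt - a\sin^2\theta\,d\phi\right)^2 + \frac{\rho^2}{R^2\Delta}\,dr^2 + \rho^2 d\theta^2 + \frac{R^4\sin^2\theta}{\rho^2}\left(d\phi - \frac{a}{R^2}\,dt\right)^2 , \tag{9.1}$$

where $R$ and $\rho$ are defined by

$$R = \sqrt{r^2 + a^2} , \quad \rho = \sqrt{r^2 + a^2\cos^2\theta} , \tag{9.2}$$

and $\Delta$ is the horizon function defined by

$$\Delta = 1 - \frac{2Mr}{R^2} + \frac{Q^2}{R^2} . \tag{9.3}$$

If $M = Q = 0$, so that $\Delta = 1$, the Boyer-Lindquist metric (9.1) goes over to the metric of Minkowski space
expressed in ellipsoidal coordinates.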
At large radius $r$, the Boyer-Lindquist metric is

$$ds^2 \to -\left(1 - \frac{2M}{r}\right)dt^2 - \frac{4aM\sin^2\theta}{r}\,dt\,d\phi + \left(1 + \frac{2M}{r}\right)dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) . \tag{9.4}$$


By comparison, the weak-field metric in Newtonian gauge, equation (27.62), around an object of mass $M$
and angular momentum $L$ takes the form

$$ds^2 = -(1 + 2\Psi)\,dt^2 - 2Wr\sin\theta\,dt\,d\phi + (1 - 2\Phi)\left(dr^2 + r^2 do^2\right) , \tag{9.5}$$

where, from equations (27.80) and (27.87), the scalar $\Psi$, $\Phi$ and vector $W$ potentials are

$$\Psi = \Phi = -\frac{M}{r} , \quad W = \frac{2L\sin\theta}{r^2} . \tag{9.6}$$


The asymptotic Boyer-Lindquist metric (9.4) is not quite in the Newtonian form (9.5), but a transformation
of the radial coordinate brings it to Newtonian form, Exercise 7.1. Comparison of the two metrics establishes
that $M$ is the mass of the black hole and $a = L/M$ is its angular momentum per unit mass. For positive $a$,
the black hole rotates right-handedly about its polar axis $\theta = 0$.

The Boyer-Lindquist line-element (9.1) defines not only a metric but also a tetrad. The Boyer-Lindquist
coordinates and tetrad are carefully chosen to exhibit the symmetries of the geometry. In the locally inertial
frame defined by the Boyer-Lindquist tetrad, the energy-momentum tensor (which is non-vanishing for
charged Kerr-Newman) and the Weyl tensor are both diagonal. These assertions become apparent only
in the tetrad frame, §19.3, and are obscure in the coordinate frame.


9.2 Oblate spheroidal coordinates 


Boyer-Lindquist coordinates $r$, $\theta$, $\phi$ are oblate spheroidal coordinates (not polar coordinates). Corresponding
Cartesian coordinates are

$$x = R\sin\theta\cos\phi , \quad y = R\sin\theta\sin\phi , \quad z = r\cos\theta . \tag{9.7}$$

Surfaces of constant $r$ are confocal oblate spheroids, satisfying

$$\frac{x^2 + y^2}{R^2} + \frac{z^2}{r^2} = 1 . \tag{9.8}$$

Equation (9.8) implies that the spheroidal coordinate $r$ is given in terms of $x$, $y$, $z$ by the quadratic equation
(quadratic in $r^2$)

$$r^4 - r^2\left(x^2 + y^2 + z^2 - a^2\right) - a^2 z^2 = 0 . \tag{9.9}$$
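The coordinate relations (9.7) to (9.9) translate directly into a round-trip conversion. The following is a minimal sketch: the function names are illustrative, and the inverse returns only the non-negative root $r \geq 0$, leaving the choice of sign (the Antiverse has $r < 0$) to the caller.

```python
import math

def bl_to_cartesian(r, theta, phi, a):
    """Cartesian (x, y, z) from oblate spheroidal (r, theta, phi), with R = sqrt(r^2 + a^2)."""
    R = math.sqrt(r * r + a * a)
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def cartesian_to_r(x, y, z, a):
    """Non-negative root r of r^4 - r^2 (x^2 + y^2 + z^2 - a^2) - a^2 z^2 = 0."""
    b = x * x + y * y + z * z - a * a
    # Solve the quadratic in r^2 and take the non-negative branch
    r_squared = 0.5 * (b + math.sqrt(b * b + 4.0 * a * a * z * z))
    return math.sqrt(max(0.0, r_squared))

x, y, z = bl_to_cartesian(1.5, 0.7, 0.3, a=0.96)
print(cartesian_to_r(x, y, z, a=0.96))  # recovers r close to 1.5
```

Points in the equatorial plane with $x^2 + y^2 < a^2$ return $r = 0$, which is the disk bounded by the ring singularity.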


Figure 9.1 illustrates the spatial geometry of a Kerr black hole, and of a Kerr-Newman black hole, in 
Boyer-Lindquist coordinates. 

Figure 9.1 Spatial geometry of (upper) a Kerr black hole with spin parameter $a = 0.96M$, and (lower) a Kerr-Newman
black hole with charge $Q = 0.8M$ and spin parameter $a = 0.56M$. The upper half of each diagram shows $r > 0$, while
the lower half shows $r < 0$, the Antiverse. The outer and inner horizons are confocal oblate spheroids whose focus is
the ring singularity. For the Kerr geometry, the turnaround radius is at $r = 0$. The Sisytube is a torus enclosing the
ring singularity, that contains closed timelike curves.

9.3 Time and rotation symmetries


The Boyer-Lindquist metric coefficients are independent of the time coordinate $t$ and of the azimuthal angle
$\phi$. This shows that the Kerr-Newman geometry has time translation symmetry, and rotational symmetry
about its azimuthal axis. The time and rotation symmetries mean that the tangent vectors $e_t$ and $e_\phi$ in
Boyer-Lindquist coordinates are Killing vectors. It follows that their scalar products

$$\begin{aligned}
e_t \cdot e_t &= g_{tt} = -\frac{1}{\rho^2}\left(R^2\Delta - a^2\sin^2\theta\right) , \\
e_t \cdot e_\phi &= g_{t\phi} = -\frac{aR^2\sin^2\theta}{\rho^2}\left(1 - \Delta\right) , \\
e_\phi \cdot e_\phi &= g_{\phi\phi} = \frac{R^2\sin^2\theta}{\rho^2}\left(R^2 - a^2\sin^2\theta\,\Delta\right) ,
\end{aligned} \tag{9.10}$$

are all gauge-invariant scalar quantities. As will be seen below, $g_{tt} = 0$ defines the boundary of ergospheres,
$g_{t\phi} = 0$ defines the turnaround radius, and $g_{\phi\phi} = 0$ defines the boundary of the sisytube, the toroidal region
containing closed timelike curves.

The Boyer-Lindquist time $t$ and azimuthal angle $\phi$ are arranged further to satisfy the condition that $e_t$
and $e_\phi$ are each orthogonal to both $e_r$ and $e_\theta$.


9.4 Ring singularity 


The Kerr-Newman geometry contains a ring singularity where the Weyl tensor (9.26) diverges, $\rho = 0$, or
equivalently at

$$r = 0 \quad \text{and} \quad \theta = \pi/2 . \tag{9.11}$$

The ring singularity is at the focus of the confocal ellipsoids of the Boyer-Lindquist metric. Physically, the
singularity is kept open by the centrifugal force.
Figure 9.2 illustrates contours of constant $\rho$ in a Kerr black hole.


9.5 Horizons 


The horizon of a Kerr-Newman black hole rotates, as observed by a distant observer, so it is incorrect to try
to solve for the location of the horizon by assuming that the horizon is at rest. The worldline of a photon
that sits on the horizon, battling against the inflow of space, remains at fixed radius $r$ and polar angle $\theta$, but
it moves in time $t$ and azimuthal angle $\phi$. The photon’s 4-velocity is $v^\mu = \{v^t, 0, 0, v^\phi\}$, and the condition
that it is on a null geodesic is

$$0 = v_\mu v^\mu = g_{\mu\nu} v^\mu v^\nu = g_{tt}(v^t)^2 + 2\,g_{t\phi}\,v^t v^\phi + g_{\phi\phi}(v^\phi)^2 . \tag{9.12}$$

Figure 9.2 Not a mouse’s eye view of a snake coming down its mousehole, uhoh. Contours of constant $\rho$ and their
covariant normals $\partial\rho/\partial x^\mu$ in a spatial cross-section of a Kerr black hole of spin parameter $a = 0.96M$, in
Boyer-Lindquist coordinates. The thicker contours are the outer and inner horizons, which are confocal spheroids with the
ring singularity at their focus. The ring singularity is at $\rho = 0$, the snake’s eyes.


This equation has solutions provided that the determinant of the $2 \times 2$ matrix of metric coefficients in $t$ and
$\phi$ is less than or equal to zero (why?). The determinant is

$$g_{tt}\,g_{\phi\phi} - g_{t\phi}^2 = -R^2\sin^2\theta\,\Delta , \tag{9.13}$$

where $\Delta$ is the horizon function defined above, equation (9.3). Thus if $\Delta \geq 0$, then there exist null geodesics
such that a photon can be instantaneously at rest in $r$ and $\theta$, whereas if $\Delta < 0$, then no such geodesics exist.
The boundary

$$\Delta = 0 \tag{9.14}$$

defines the location of horizons. With $\Delta$ given by equation (9.3), equation (9.14) gives outer and inner
horizons at

$$r_\pm = M \pm \sqrt{M^2 - Q^2 - a^2} . \tag{9.15}$$

Between the horizons $\Delta$ is negative, and photons cannot be at rest. This is consistent with the picture that
space is falling faster than light between the horizons.
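The horizon condition and the determinant identity can be spot-checked numerically. The sketch below transcribes the horizon function (9.3) and the metric coefficients (9.10) as understood here; the function names and sample parameters are illustrative assumptions, not part of the text.

```python
import math

def horizon_function(r, M, Q, a):
    """Delta = 1 - 2 M r / R^2 + Q^2 / R^2 with R^2 = r^2 + a^2."""
    return 1.0 - (2.0 * M * r - Q * Q) / (r * r + a * a)

def horizons(M, Q, a):
    """Outer and inner horizon radii (r_plus, r_minus), or None if there are no horizons."""
    disc = M * M - Q * Q - a * a
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return M + root, M - root

def metric_t_phi(r, theta, M, Q, a):
    """Coordinate-frame components (g_tt, g_tphi, g_phiphi) in Boyer-Lindquist coordinates."""
    R2 = r * r + a * a
    rho2 = r * r + (a * math.cos(theta)) ** 2
    s2 = math.sin(theta) ** 2
    delta = horizon_function(r, M, Q, a)
    g_tt = -(R2 * delta - a * a * s2) / rho2
    g_tphi = -a * R2 * s2 * (1.0 - delta) / rho2
    g_phiphi = R2 * s2 * (R2 - a * a * s2 * delta) / rho2
    return g_tt, g_tphi, g_phiphi

M, Q, a = 1.0, 0.8, 0.56
r_plus, r_minus = horizons(M, Q, a)
print(r_plus, r_minus, horizon_function(r_plus, M, Q, a))

g_tt, g_tphi, g_phiphi = metric_t_phi(2.0, 1.0, M, Q, a)
determinant = g_tt * g_phiphi - g_tphi ** 2
expected = -(2.0 ** 2 + a * a) * math.sin(1.0) ** 2 * horizon_function(2.0, M, Q, a)
print(determinant, expected)  # the two values should agree
```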
9.6 Angular velocity of the horizon


The angular velocity of the horizon as observed by observers at rest at infinity can be read off directly from
the Boyer-Lindquist metric (9.1). The horizon is at $dr = d\theta = 0$ and $\Delta = 0$, and then the null condition
$ds^2 = 0$ implies that the angular velocity is

$$\frac{d\phi}{dt} = \frac{a}{R^2} . \tag{9.16}$$

The derivative is with respect to the proper time $t$ of observers at rest at infinity, so this is the angular
velocity observed by such observers.
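As a worked number, the angular velocity $a/R^2$ can be evaluated at the outer horizon. This is a small sketch assuming the outer horizon radius $r_+ = M + \sqrt{M^2 - Q^2 - a^2}$; the function name is hypothetical.

```python
import math

def horizon_angular_velocity(M, Q, a):
    """Angular velocity a / R^2 evaluated at the outer horizon r_plus."""
    disc = M * M - Q * Q - a * a
    if disc < 0.0:
        raise ValueError("super-extremal geometry has no horizon")
    r_plus = M + math.sqrt(disc)
    return a / (r_plus * r_plus + a * a)

# Kerr black hole with a = 0.96 M: r_plus = 1.28 M, R^2 = 2.56 M^2
print(horizon_angular_velocity(1.0, 0.0, 0.96))  # approximately 0.375 / M
```

For an extremal Kerr black hole, $a = M$, the same expression gives $1/(2M)$.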


9.7 Ergospheres 


There are finite regions, just outside the outer horizon and just inside the inner horizon, within which the 
worldline of an object at rest, $dr = d\theta = d\phi = 0$, is spacelike. These regions, called ergospheres, are places
where nothing can remain at rest (the place where little children come from). Objects can escape from within 
the outer ergosphere (whereas they cannot escape from within the outer horizon), but they cannot remain 
at rest there. A distant observer will see any object within the outer ergosphere being dragged around by 
the rotation of the black hole. The direction of dragging is the same as the rotation direction of the black 
hole in both outer and inner ergospheres. 
The boundary of the ergosphere is at

$$g_{tt} = 0 , \tag{9.17}$$

which occurs where

$$R^2\Delta = a^2\sin^2\theta . \tag{9.18}$$

Equation (9.18) has two solutions, the outer and inner ergospheres. The outer and inner ergospheres touch
respectively the outer and inner horizons at the poles, $\theta = 0$ and $\pi$.
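Equation (9.18) can be solved in closed form. Using $R^2\Delta = r^2 + a^2 - 2Mr + Q^2$, it becomes $r^2 - 2Mr + Q^2 + a^2\cos^2\theta = 0$, whose two roots are the outer and inner ergosphere boundaries. The sketch below evaluates them; the function name is an assumption for illustration.

```python
import math

def ergosphere_radii(theta, M, Q, a):
    """Roots of r^2 - 2 M r + Q^2 + a^2 cos^2(theta) = 0: (outer, inner), or None if no real root."""
    disc = M * M - Q * Q - (a * math.cos(theta)) ** 2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return M + root, M - root

M, Q, a = 1.0, 0.0, 0.96
print(ergosphere_radii(0.0, M, Q, a))           # at the pole: coincides with the horizons
print(ergosphere_radii(math.pi / 2, M, Q, a))   # at the equator of Kerr: close to (2M, 0)
```

At the poles the $a^2\cos^2\theta$ term equals $a^2$, so the roots reduce to the horizon radii, which is the touching described above.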


9.8 Turnaround radius 


The turnaround radius is the radius inside the inner horizon at which infallers who fall from zero velocity 
and zero angular momentum at infinity turn around. The radius is at 


$$g_{t\phi} = 0 , \tag{9.19}$$

which occurs where $\Delta = 1$, or equivalently at

$$r = \frac{Q^2}{2M} . \tag{9.20}$$


In the uncharged Kerr geometry, the turnaround radius is at zero radius, r = 0, but in the Kerr-Newman 
geometry the turnaround radius is at positive radius. 
9.9 Antiverse


The surface at zero radius, $r = 0$, forms a disk bounded by the ring singularity. Objects can pass through
this disk into the region at negative radius, $r < 0$, the Antiverse.

The Boyer-Lindquist metric (9.1) is unchanged by a symmetry transformation that simultaneously flips
the sign both of the radius and mass, $r \to -r$ and $M \to -M$. Thus the Boyer-Lindquist geometry at
negative $r$ with positive mass is equivalent to the geometry at positive $r$ with negative mass. In effect, the
Boyer-Lindquist metric with negative $r$ describes a rotating black hole of negative mass

$$M < 0 . \tag{9.21}$$


9.10 Sisytube 


Inside the inner horizon there is a toroidal region around the ring singularity, which I call the sisytube,
within which the light cone in $t$-$\phi$ coordinates opens up to the point that $\phi$ as well as $t$ is a timelike
coordinate. In the Wormhole, the direction of increasing proper time along $t$ is $t$ increasing, and along $\phi$ is
$\phi$ decreasing, which is retrograde. In the Parallel Wormhole, the direction of increasing proper time along
$t$ is $t$ decreasing, and along $\phi$ is $\phi$ increasing, which is again retrograde. Within the toroidal region, there
exist timelike trajectories that go either forwards or backwards in coordinate time $t$ as they wind retrograde
around the toroidal tunnel. Because the $\phi$ coordinate is periodic, these timelike curves connect not only the
past to the future (the usual case), but also the future to the past, which violates causality. In particular, as
first pointed out by Carter (1968), there exist closed timelike curves (CTCs), trajectories that connect to
themselves, connecting their own future to their own past, and repeating interminably, like Sisyphus pushing
his rock up the mountain.
The boundary of the sisytube torus is at

$$g_{\phi\phi} = 0 , \tag{9.22}$$

which occurs where

$$R^2 = a^2\sin^2\theta\,\Delta . \tag{9.23}$$

In the uncharged Kerr geometry the sisytube is entirely at negative radius, $r < 0$, but in the Kerr-Newman
geometry the sisytube extends to positive radius, Figure 9.1.
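Whether a given point lies inside the sisytube can be tested from the sign of $g_{\phi\phi}$. The sketch below assumes the condition $R^2 < a^2\sin^2\theta\,\Delta$ for $g_{\phi\phi} < 0$ and multiplies through by $R^2$, using $R^2\Delta = R^2 - 2Mr + Q^2$, so that no division is needed. The function name and sample points are illustrative.

```python
import math

def phi_is_timelike(r, theta, M, Q, a):
    """True where g_phiphi < 0, i.e. R^4 < a^2 sin^2(theta) (R^2 - 2 M r + Q^2)."""
    R2 = r * r + a * a
    s2 = math.sin(theta) ** 2
    return R2 * R2 < a * a * s2 * (R2 - 2.0 * M * r + Q * Q)

equator = math.pi / 2
# Uncharged Kerr, a = 0.96 M: inside only at negative radius
print(phi_is_timelike(-0.1, equator, 1.0, 0.0, 0.96))   # True
print(phi_is_timelike(+0.1, equator, 1.0, 0.0, 0.96))   # False
# Kerr-Newman, Q = 0.8 M, a = 0.56 M: the region reaches positive radius
print(phi_is_timelike(+0.05, equator, 1.0, 0.8, 0.56))  # True
```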


9.11 Extremal Kerr-Newman geometry 


The Kerr-Newman geometry is called extremal when the outer and inner horizons coincide, $r_+ = r_-$, which
occurs where

$$M^2 = Q^2 + a^2 . \tag{9.24}$$

Figure 9.3 illustrates the structure of an extremal Kerr (uncharged) black hole, and an extremal Kerr-Newman
(charged) black hole.

Figure 9.3 Spatial geometry of (upper) an extremal ($a = M$) Kerr black hole, and (lower) an extremal Kerr-Newman
black hole with charge $Q = 0.8M$ and spin parameter $a = 0.6M$.

Figure 9.4 Spatial geometry of a super-extremal Kerr black hole with spin parameter $a = 1.04M$. A super-extremal
black hole has no horizons.


9.12 Super-extremal Kerr-Newman geometry 


If $M^2 < Q^2 + a^2$, then there are no horizons. The geometry is called super-extremal. Figure 9.4 illustrates
the structure of a super-extremal Kerr black hole. A super-extremal black hole has a naked ring singularity,
and CTCs in a sisytube unhidden by a horizon.
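The three cases can be told apart by comparing $M^2$ with $Q^2 + a^2$. The following sketch classifies a parameter set; the function name, the labels, and the floating-point tolerance are assumptions made for illustration.

```python
import math

def classify_kerr_newman(M, Q, a, rel_tol=1e-12):
    """Label a Kerr-Newman geometry by comparing M^2 with Q^2 + a^2."""
    lhs = M * M
    rhs = Q * Q + a * a
    if math.isclose(lhs, rhs, rel_tol=rel_tol):
        return "extremal"        # horizons coincide
    if lhs > rhs:
        return "sub-extremal"    # separate outer and inner horizons
    return "super-extremal"      # no horizons

print(classify_kerr_newman(1.0, 0.8, 0.6))    # parameters of the lower panel of Figure 9.3
print(classify_kerr_newman(1.0, 0.0, 1.04))   # parameters of Figure 9.4
print(classify_kerr_newman(1.0, 0.8, 0.56))   # parameters of the lower panel of Figure 9.1
```

The tolerance matters because values such as $0.8^2 + 0.6^2$ are not exactly 1 in binary floating point.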


9.13 Energy-momentum tensor 


The coordinate-frame Einstein tensor of the Kerr-Newman geometry in Boyer-Lindquist coordinates is a bit
of a mess. The trick of raising one index, which for the Reissner-Nordström metric brought the Einstein
tensor to diagonal form, equation (8.5), fails for Boyer-Lindquist (because the Boyer-Lindquist metric is not
diagonal). The problem is endemic to the coordinate approach to general relativity. After tetrads it will
emerge that, in the Boyer-Lindquist tetrad, the Einstein tensor is diagonal, and that the proper density $\rho$,
the proper radial pressure $p_r$, and the proper transverse pressure $p_\perp$ in that frame are (do not confuse the
notation $\rho$ for proper density with the radial parameter $\rho$, equation (9.2), of the Boyer-Lindquist metric)

$$\rho = -p_r = p_\perp = \frac{Q^2}{8\pi\rho^4} . \tag{9.25}$$

This looks like the energy-momentum tensor (8.5) of the Reissner-Nordström geometry with the replacement
$r \to \rho$. The energy-momentum is that of an electric field produced by a charge $Q$ seemingly located at the
ring singularity.
9.14 Weyl tensor


The Weyl tensor of the Kerr-Newman geometry in Boyer-Lindquist coordinates is likewise a mess. After
tetrads, it will emerge that the 10 components of the Weyl tensor can be decomposed into 5 complex
components of spin 0, $\pm 1$, and $\pm 2$. In the Boyer-Lindquist tetrad, the only non-vanishing component is
the spin-0 component, the Weyl scalar $C$, but in contrast to the Schwarzschild and Reissner-Nordström
geometries the spin-0 component is complex, not real:

$$C = -\frac{1}{(r - ia\cos\theta)^3}\left(M - \frac{Q^2}{r + ia\cos\theta}\right) . \tag{9.26}$$


9.15 Electromagnetic field 


The expression for the electromagnetic field in Boyer-Lindquist coordinates is again a mess. After tetrads,
it will emerge that, in the Boyer-Lindquist tetrad, the electromagnetic field is purely radial, and the
electromagnetic potential has only a time component. For reference, the covariant electromagnetic potential $A_\mu$ in
the Boyer-Lindquist coordinate (not tetrad) frame is

$$A_\mu = \frac{1}{\rho}\left\{ -\frac{Qr}{\rho} , 0 , 0 , \frac{Qr\,a\sin^2\theta}{\rho} \right\} . \tag{9.27}$$


9.16 Principal null congruences 


The Kerr-Newman geometry admits a special set of space-filling, non-overlapping null geodesics called the 
principal outgoing and ingoing null congruences. These are the directions with respect to which the Weyl 
tensor and the electric field vector align. Photons that hold steady on the outer horizon are on the principal 
outgoing null congruence. The construction and special character of the principal null congruences will be 
demonstrated after tetrads, in §23.6. 

Geodesics along the principal null congruences satisfy 


$$d\theta = d\phi - \omega\,dt = 0 , \tag{9.28}$$


where $\omega = a/R^2$ is the azimuthal angular velocity of the geodesics through the coordinates. The
Boyer-Lindquist line-element (9.1) is specifically constructed so that it aligns with the principal null congruences.
9.17 Finkelstein coordinates


Along the principal outgoing and ingoing null congruences, where equations (9.28) hold, the Boyer-Lindquist
metric (9.1) reduces to

$$ds^2 = \frac{\rho^2\Delta}{R^2}\left(-dt^2 + \frac{dr^2}{\Delta^2}\right) . \tag{9.29}$$

A tortoise coordinate $r^*$ in the Kerr-Newman geometry may be defined analogously to that (8.16) in the
Reissner-Nordström geometry,

$$r^* \equiv \int \frac{dr}{\Delta} , \tag{9.30}$$

which integrates to the same expressions (8.16) and (8.17) in terms of horizon radii $r_\pm$ and surface gravities
$\kappa_\pm$ as in the Reissner-Nordström geometry. Principal outgoing and ingoing null geodesics follow

$$\begin{aligned}
r^* - t &= \text{constant} \quad \text{outgoing} , \\
r^* + t &= \text{constant} \quad \text{ingoing} .
\end{aligned} \tag{9.31}$$

A Finkelstein time coordinate $t_F$ can be defined as in the Reissner-Nordström geometry, equation (8.19).
Likewise, Kruskal-Szekeres coordinates can be defined as in the Reissner-Nordström geometry, equations (8.21)
and (8.22). The Finkelstein and Kruskal spacetime diagrams for the Kerr-Newman geometry look identical
to those of the Reissner-Nordström geometry (if the horizon radii $r_\pm$ are the same), Figures 8.3 and 8.4. The
discussion in §§8.7-8.9 carries through essentially unchanged for the Kerr-Newman geometry.

The behaviour of geodesics in the angular direction is more complicated in the Kerr-Newman than
Reissner-Nordström geometry, but this complexity is hidden in the Finkelstein and Kruskal diagrams.


9.18 Doran coordinates 


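For the Kerr-Newman geometry, the analogue of the Gullstrand-Painlevé metric is the Doran (2000) metric

$$ds^2 = -dt_{\rm ff}^2 + \left[\frac{\rho}{R}\,dr - \beta\frac{R}{\rho}\left(dt_{\rm ff} - a\sin^2\theta\,d\phi_{\rm ff}\right)\right]^2 + \rho^2 d\theta^2 + R^2\sin^2\theta\,d\phi_{\rm ff}^2 , \tag{9.32}$$

where the free-fall time $t_{\rm ff}$ and azimuthal angle $\phi_{\rm ff}$ are related to the Boyer-Lindquist time $t$ and azimuthal
angle $\phi$ by

$$dt_{\rm ff} = dt - \frac{\beta}{1 - \beta^2}\,dr , \quad d\phi_{\rm ff} = d\phi - \frac{a\beta}{R^2(1 - \beta^2)}\,dr . \tag{9.33}$$

The free-fall time $t_{\rm ff}$ is the proper time experienced by persons who free-fall from rest at infinity, with zero
angular momentum. They follow trajectories of fixed $\theta$ and $\phi_{\rm ff}$, with radial velocity $dr/dt_{\rm ff} = \beta R^2/\rho^2$. The
4-velocity $u^\mu \equiv dx^\mu/d\tau$ of such free-falling observers is

$$u^t = 1 , \quad u^r = \frac{\beta R^2}{\rho^2} , \quad u^\theta = 0 , \quad u^\phi = 0 . \tag{9.34}$$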
Figure 9.5 Spatial geometry of a Kerr black hole with spin parameter $a = 0.96M$. The arrows show the velocity $\beta$
in the Doran metric. The flow follows lines of constant $\theta$, which form nested hyperboloids orthogonal to and confocal
with the nested spheroids of constant $r$.


For the Kerr-Newman geometry, the velocity $\beta$ is

$$\beta = \mp\frac{\sqrt{2Mr - Q^2}}{R} , \tag{9.35}$$

where the $\mp$ sign is $-$ (infalling) for black hole solutions, and $+$ (outfalling) for white hole solutions.
Horizons occur where the magnitude of the velocity $\beta$ equals the speed of light

$$\beta = \mp 1 . \tag{9.36}$$

The boundaries of ergospheres occur where the velocity is

$$\beta = \mp\frac{\rho}{R} . \tag{9.37}$$

The turnaround radius is where the velocity is zero

$$\beta = 0 . \tag{9.38}$$

The sisytube is bounded by the imaginary velocity

$$\beta = i\,\frac{\rho}{a\sin\theta} . \tag{9.39}$$
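These special values of the velocity can be verified numerically. The sketch below assumes the infall form $\beta = -\sqrt{2Mr - Q^2}/R$ and uses complex arithmetic so that $\beta$ becomes imaginary where $2Mr < Q^2$; the function name and the sample parameters are illustrative only.

```python
import cmath
import math

def beta_infall(r, M, Q, a):
    """Infall velocity beta = -sqrt(2 M r - Q^2) / R, complex-valued where 2 M r < Q^2."""
    return -cmath.sqrt(2.0 * M * r - Q * Q) / math.sqrt(r * r + a * a)

M, Q, a = 1.0, 0.8, 0.56
theta = math.pi / 3

# Horizon: |beta| = 1 at r_plus = M + sqrt(M^2 - Q^2 - a^2)
r_plus = M + math.sqrt(M * M - Q * Q - a * a)
print(abs(beta_infall(r_plus, M, Q, a)))

# Outer ergosphere boundary: beta^2 = rho^2 / R^2
r_ergo = M + math.sqrt(M * M - Q * Q - (a * math.cos(theta)) ** 2)
rho2 = r_ergo ** 2 + (a * math.cos(theta)) ** 2
R2 = r_ergo ** 2 + a * a
print(beta_infall(r_ergo, M, Q, a) ** 2, rho2 / R2)

# Turnaround radius r = Q^2 / (2 M): beta = 0
print(abs(beta_infall(Q * Q / (2.0 * M), M, Q, a)))

# Inside the turnaround radius beta is purely imaginary
print(beta_infall(0.1, M, Q, a))
```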

9.19 Penrose diagram


The Penrose diagram of the Kerr-Newman geometry, Figure 9.6, resembles that of the Reissner-Nordström
geometry, Figure 8.6, except that in the Kerr-Newman geometry an infaller can reach the Antiverse by
passing through the disk at $r = 0$ bounded by the ring singularity. In the Reissner-Nordström geometry,
the ring singularity shrinks to a point, and passing into the Antiverse would require passing through the
singularity itself.

Figure 9.6 Penrose diagram of the Kerr-Newman geometry. The diagram is similar to that of the Reissner-Nordström
geometry, except that it is possible to pass through the disk at $r = 0$ from the Wormhole region into the Antiverse
region. This Penrose diagram, which represents a slice at fixed $\theta$ and $\phi$, does not capture the full richness of the
geometry, which contains closed timelike curves in a torus around the ring singularity, the sisytube.
Concept Questions


1. What does it mean that the Universe is expanding?
2. Does the expansion affect the solar system or the Milky Way?
3. How far out do you have to go before the expansion is evident?
4. What is the Universe expanding into?
5. In what sense is the Hubble constant constant?
6. Does our Universe have a centre, and if so where is it?
7. What evidence suggests that the Universe at large is homogeneous and isotropic?
8. How can the Cosmic Microwave Background (CMB) be construed as evidence for homogeneity and
   isotropy given that it provides information only over a 2D surface on the sky?
9. What is thermodynamic equilibrium? What evidence suggests that the early Universe was in
   thermodynamic equilibrium?
10. What are cosmological parameters?
11. What cosmological parameters can or cannot be measured from the power spectrum of fluctuations of
    the CMB?
12. Friedmann-Lemaître-Robertson-Walker (FLRW) universes are characterized as closed, flat, or open.
    Does flat here mean the same as flat Minkowski space?
13. What is it that astronomers call dark matter?
14. What is the primary evidence for the existence of non-baryonic cold dark matter?
15. How can astronomers detect dark matter in galaxies or clusters of galaxies?
16. How can cosmologists claim that the Universe is dominated by not one but two distinct kinds of
    mysterious mass-energy, dark matter and dark energy, neither of which has been observed in the laboratory?
17. What key property or properties distinguish dark energy from dark matter?
18. A FLRW universe conserves entropy. Is that true? If so, can the entropy of the Universe increase?
19. Does the annihilation of electron-positron pairs into photons generate entropy in the early Universe, as
    its temperature cools through 1 MeV?
20. How does the wavelength of light change with the expansion of the Universe?
21. How does the temperature of the CMB change with the expansion of the Universe?
22. How does a blackbody (Planck) distribution change with the expansion of the Universe? What about a
    non-relativistic distribution? What about a semi-relativistic distribution?
23. What is the horizon of our Universe? What is the Hubble distance?
24. What happens beyond the horizon of our Universe?
25. What caused the Big Bang?
26. What happened before the Big Bang?
27. What will be the fate of the Universe?


What’s important? 


1. The Cosmic Microwave Background (CMB) indicates that the early ($\sim$ 400,000 year old) Universe was
   (a) uniform to a few $\times 10^{-5}$, and (b) in thermodynamic equilibrium. This indicates that

   the Universe was once very simple.

   It is this simplicity that makes it possible to model the early Universe with some degree of confidence.
2. The power spectrum of fluctuations of the CMB has enabled precise measurements of cosmological
   parameters.
3. There is a remarkable concordance of evidence from a broad range of astronomical observations:
   supernovae, big bang nucleosynthesis, the clustering of galaxies, the abundances of clusters of galaxies,
   measurements of the Hubble constant from Cepheid variables and supernovae, and the ages of the oldest
   stars.
4. Observational evidence is consistent with the predictions of the theory of inflation in its simplest form:
   the expansion of the Universe, the spatial flatness of the Universe, the near uniformity of temperature
   fluctuations of the CMB (the horizon problem), the presence of acoustic peaks and troughs in the power
   spectrum of fluctuations of the CMB, the near power law shape of the power spectrum at large scales,
   its spectral index (tilt), the gaussian distribution of fluctuations at large scales.
5. What is non-baryonic dark matter?
6. What is dark energy? What is its equation of state $w \equiv p/\rho$, and how does $w$ evolve with time?
10


Homogeneous, Isotropic Cosmology


10.1 Observational basis


Since 1998, observations have converged on a Standard “$\Lambda$CDM” Model of Cosmology, a spatially flat
Universe dominated by gravitationally repulsive dark energy whose equation of state is consistent with that
of a cosmological constant ($\Lambda$), and by gravitationally attractive non-baryonic cold dark matter (CDM). The
mass-energy of the Standard Model of the Universe consists of 70% dark energy, 25% non-baryonic cold dark
matter (CDM), 5% baryonic matter, and a sprinkling of photons and neutrinos. The designation “baryonic”
is conventional but misleading: it refers to all atomic matter, including not only baryons (nuclei), but also
non-relativistic charged leptons (electrons).


10.1.1 The expansion of the Universe 


The Hubble diagram, a diagram of distance versus redshift of distant astronomical objects, indicates that 
the Universe is expanding. 

Hubble’s law states that galaxies are receding with velocity proportional to distance, $v = H_0 d$, with
constant of proportionality the Hubble constant $H_0$ (the 0 subscript signifies the present day value). Hubble’s
law was first proposed by Georges Lemaître (1927) and by Edwin Hubble (1929) on the basis of observations.

The recession velocity v of an astronomical object can be determined with some precision from the redshift 
of its spectral lines, but its distance d is more difficult to measure, because astronomical objects, such as 
galaxies, typically have a wide range of intrinsic luminosities. Hubble estimated distances to galaxies using 
Cepheid variable stars, which had been discovered by Henrietta Leavitt (1912) to have periods proportional 
to their luminosities. A good distance estimator should be a “standard candle” of predictable luminosity, and 
it should be bright, so that it can be seen over cosmological distances. 

The best modern Hubble diagram is that of Type Ia supernovae, illustrated in Figure 10.1, from data
tabulated by Scolnic et al. (2018). A Type Ia supernova is thought to represent the thermonuclear explosion
of a white dwarf star that through accretion from a companion star reaches the Chandrasekhar mass limit of
$1.4\,M_\odot$. Having a similar origin, such supernovae approximate standard candles (or standard bombs) having
the same luminosity. Actually, some variation in luminosity is observed, which may be associated with the
amount of $^{56}$Ni synthesized in the explosion, and which can be corrected at least in part through an empirical
relation between luminosity and how rapidly the lightcurve decays (higher luminosity supernovae decay more
slowly).

Figure 10.1 Hubble diagram of 1048 Type Ia supernovae from a compilation of surveys, from data tabulated by Scolnic
et al. (2018). The vertical axis is the luminosity distance $d_L$ in units of the present-day Hubble distance $c/H_0$. The
bottom panel shows residuals. The various smooth curves are 5 theoretical model Hubble diagrams, with parameters
as indicated. The solid line is a flat $\Lambda$CDM model with $\Omega_\Lambda = 0.7$ and $\Omega_{\rm m} = 0.3$.
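A theoretical Hubble diagram of the kind drawn as a smooth curve in Figure 10.1 can be sketched in a few lines. The code below assumes the standard result for a spatially flat universe containing only matter and a cosmological constant, $d_L = (1 + z)\int_0^z dz'/\sqrt{\Omega_{\rm m}(1 + z')^3 + \Omega_\Lambda}$ in units of $c/H_0$; this formula is not derived in this section, and the function name and the Simpson-rule integration are illustrative choices.

```python
import math

def luminosity_distance(z, omega_m=0.3, omega_lambda=0.7, steps=1000):
    """Luminosity distance in units of c/H0 for a flat matter plus Lambda model.

    Integrates 1/E(z') with Simpson's rule; `steps` must be even.
    """
    def inverse_E(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_lambda)

    h = z / steps
    total = inverse_E(0.0) + inverse_E(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inverse_E(i * h)
    comoving_distance = total * h / 3.0
    return (1.0 + z) * comoving_distance

for z in (0.01, 0.1, 0.5, 1.0, 2.0):
    print(z, luminosity_distance(z))
```

At small redshift the result approaches $d_L \approx z$, which is Hubble's law in these units. Setting `omega_m=1.0, omega_lambda=0.0` gives a matter-only model whose curve falls below the $\Lambda$CDM one at high redshift.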
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10.1.2 The acceleration of the Universe 


Since light takes time to travel from distant parts of the Universe to astronomers here on Earth, the higher 
the redshift of an object, the further back in time astronomers are seeing. 

In 1998 two teams, the Supernova Cosmology Project (Perlmutter et al., 1999) and the High-z Supernova 
Search team (Riess et al., 1998), precipitated the revolution that led to the Standard Model of Cosmology. 
They reported that observations of Type Ia supernovae at high redshift indicated that the Universe is not 
only expanding, but also accelerating. The acceleration requires the mass-energy density of the Universe 
to be dominated at the present time by a gravitationally repulsive component, such as a cosmological 
constant $\Lambda$. 

In the Hubble diagram of Type Ia supernovae shown in Figure 10.1, the fitted curve is a best-fit flat 
cosmological model containing a cosmological constant and matter. 
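
As a rough illustration of what such a model curve involves, the sketch below evaluates the luminosity distance in units of $c/H_0$ for a flat model with matter and a cosmological constant. It assumes the standard flat-space relation $d_L = (1+z) \int_0^z dz'/E(z')$ with $E(z) = [\Omega_m (1+z)^3 + \Omega_\Lambda]^{1/2}$, which is not derived in this section, and it neglects radiation. The function names are illustrative only.

```python
import math

def expansion_rate(z, omega_m=0.3, omega_lambda=0.7):
    """E(z) = H(z)/H_0 for a flat model with matter and a cosmological constant."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)

def luminosity_distance(z, omega_m=0.3, omega_lambda=0.7, steps=1000):
    """Luminosity distance d_L in units of the Hubble distance c/H_0.

    Assumes spatial flatness (omega_m + omega_lambda = 1), so that the
    comoving distance is the integral of dz'/E(z') from 0 to z,
    and d_L = (1 + z) times that integral.
    """
    if z == 0.0:
        return 0.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = z / steps
    total = (1.0 / expansion_rate(0.0, omega_m, omega_lambda)
             + 1.0 / expansion_rate(z, omega_m, omega_lambda))
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / expansion_rate(i * h, omega_m, omega_lambda)
    comoving_distance = total * h / 3.0
    return (1.0 + z) * comoving_distance

if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0, 2.0):
        print(z, luminosity_distance(z))
```

At small redshift the result reduces to $d_L \approx z$ in these units, which is Hubble's law.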


10.1.3 The Cosmic Microwave Background (CMB) 


The single most powerful observational constraints on the Universe come from the Cosmic Microwave Back- 
ground (CMB). Modern observations of the CMB have ushered in an era of precision cosmology, where key 
cosmological parameters are being measured with percent level uncertainties. 

The CMB was discovered serendipitously by Arno Penzias & Robert Wilson (1965), who were puzzled 
by an apparently uniform excess temperature from a horn antenna, 6 metres in size, tuned to a wavelength 
of ~ 7 cm, that they had built to detect radio waves. They were unaware that Robert Dicke’s group at 
Princeton had already realised that a hot Big Bang would have left a remnant of blackbody radiation filling 
the Universe, with a present-day temperature of a few kelvin, and were setting about to try to detect it. 
When Penzias heard about Dicke’s work, he and Wilson quickly realised that their observations fit what 
the Princeton group were predicting. The observations of Penzias and Wilson (1965) were published along 
with a theoretical explanation by Dicke et al. (1965) in back-to-back papers in an issue of the Astrophysical 
Journal Letters. 

Dicke et al. (1965) argued that the temperature of the expanding Universe must have been higher in the 
past, and there must have been a time before which the temperature was high enough to ionize hydrogen, 
about 3,000 K. Before this time, called recombination, hydrogen and other elements would have been mostly 
ionized. The CMB comes to us from the time of recombination, when the Universe transitioned from being 
mainly ionized, and therefore opaque, to being mainly neutral, and therefore transparent. Recombination 
occurred when the Universe was about 400,000 years old, and the CMB has streamed essentially freely 
through the Universe since that time. Thus the CMB provides a snapshot of the Universe at recombination. 

The CMB spectrum peaks in microwaves, which are absorbed by water vapour in the atmosphere. Modern 
observations of the CMB are therefore made using satellites, or with balloons, or at high-altitude sites with 
low water vapour, such as the South Pole, or the Atacama Desert in Chile. 
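
To see why the spectrum falls in the microwave band, one can apply Wien's displacement law to the CMB temperature of equation (10.1). This is only a sketch: the two Wien constants below are standard values supplied here as assumptions, and the function names are illustrative.

```python
# Wien displacement constants (standard values, supplied as assumptions)
WIEN_WAVELENGTH_M_K = 2.897771955e-3   # peak of B_lambda: lambda_max * T, in m K
WIEN_FREQUENCY_HZ_PER_K = 5.878925757e10  # peak of B_nu: nu_max / T, in Hz / K

T_CMB_K = 2.72548  # present-day CMB temperature, equation (10.1)

def peak_wavelength_mm(temperature_k):
    """Wavelength (mm) at which the blackbody spectrum per unit wavelength peaks."""
    return WIEN_WAVELENGTH_M_K / temperature_k * 1.0e3

def peak_frequency_ghz(temperature_k):
    """Frequency (GHz) at which the blackbody spectrum per unit frequency peaks."""
    return WIEN_FREQUENCY_HZ_PER_K * temperature_k / 1.0e9

if __name__ == "__main__":
    print(peak_wavelength_mm(T_CMB_K), "mm")
    print(peak_frequency_ghz(T_CMB_K), "GHz")
```

Both peaks lie at about a millimetre in wavelength, or of order a hundred GHz in frequency.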

The characteristics of the CMB measured from modern observations are as follows. 

The CMB has a remarkably precise blackbody spectrum, Figure 10.2, with temperature (Fixsen, 2009) 

$$ T_0 = 2.72548 \pm 0.00057 \ {\rm K} . \tag{10.1} $$

Figure 10.2 COBE/FIRAS observations of the (monopole) spectrum of the CMB. The observations (points with error 
bars multiplied by 500) fit extraordinarily well to a blackbody, or Planck, spectrum at a temperature of 2.725 K (solid 
line). In practice, the spectrum was observed by switching between the CMB sky and a blackbody calibrator. The 
lower graph shows the measured deviation from the blackbody calibrator. Data from 
https://lambda.gsfc.nasa.gov/product/cobe/firas_monopole_get.cfm. 

The CMB shows a dipole anisotropy of $\Delta T = 3.355 \pm 0.008$ mK, implying that the solar system is moving 
through the CMB at velocity (Jarosik et al., 2011) 

$$ v = 369.1 \ {\rm km\,s^{-1}} \quad \mbox{in Galactic coordinate direction} \quad \{l, b\} = \{263.99^\circ \pm 0.14^\circ , 48.26^\circ \pm 0.03^\circ\} . \tag{10.2} $$
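
The quoted velocity can be checked against the dipole amplitude. The sketch below assumes the first-order Doppler relation $\Delta T / T_0 = v/c$, valid for $v \ll c$; that relation is not stated explicitly in the text, and the function name is illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299792.458

def dipole_velocity_km_s(delta_t_mk, t0_k):
    """Velocity implied by a CMB dipole of amplitude delta_t_mk (in mK),
    to first order in v/c: v = c * (Delta T / T_0)."""
    return SPEED_OF_LIGHT_KM_S * (delta_t_mk * 1.0e-3) / t0_k

if __name__ == "__main__":
    # Dipole amplitude and monopole temperature as quoted in the text
    print(dipole_velocity_km_s(3.355, 2.72548), "km/s")
```

The result agrees with the velocity of equation (10.2) to within a fraction of a km/s.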


After dipole subtraction, the temperature of the CMB over the sky is uniform to a few parts in $10^5$. 

The power spectrum of temperature T fluctuations shows a scale-invariant spectrum at large scales, and 
prominent acoustic peaks at smaller scales, Figure 10.3. The power spectrum fits astonishingly well to 
predictions based on the theory of inflation, §10.22, in its simplest form. The power spectrum yields precision 
measurements of some basic cosmological parameters, notably the densities of the principal contributions to 
the energy-density of the Universe: dark energy, non-baryonic cold dark matter, and baryons. 

Fluctuations in the CMB are expected to be polarized at some level. There are two independent modes of 
polarization of opposite parity, electric “E” ($(-)^\ell$ parity) modes and magnetic “B” ($(-)^{\ell+1}$ parity) modes. 
There are corresponding E-mode and B-mode power spectra. The temperature fluctuation T has electric 
parity, so of the cross-power spectra between temperature T and E and B fluctuations, only the T-E 
cross-power is expected to be non-vanishing (if the Universe at large is not only homogeneous but also parity 
symmetric). The T-E cross-power spectrum has been measured by the WMAP satellite, and is interpreted 
as arising from scattering of CMB photons by ionized gas intervening between recombination and us. 

Figure 10.3 Power spectrum of fluctuations in the CMB from observations with the Planck satellite (Ade et al., 2013), 
WMAP (Hinshaw et al., 2012), the Atacama Cosmology Telescope (Das et al., 2011), and the South Pole Telescope 
(Keisler et al., 2011). The plot is logarithmic in harmonic number $\ell$ up to 100, linear thereafter. The fit is a best-fit 
flat $\Lambda$CDM model. 


10.1.4 The clustering of galaxies 


The clustering of galaxies shows a power spectrum in good agreement with the Standard Model, Figure 10.4. 

Historically, the principal evidence for non-baryonic cold dark matter was comparison between the power 
spectra of galaxies versus CMB. How can tiny fluctuations in the CMB grow into the observed fluctuations 
in matter today in only the age of the Universe? The answer was, non-baryonic dark matter that begins to 
cluster before recombination, when the CMB was released. 

The interpretation of the power spectrum of galaxies is complicated by the facts that galaxies have 
undergone non-linear clustering at smaller scales, and that galaxies are a biassed tracer of mass. However, the 
pattern of clustering at large, linear scales retains an imprint of baryonic acoustic oscillations (BAO) 
analogous to those in the CMB. Observations from large galaxy surveys have been able to measure the predicted 
BAO, Figure 10.4. Comparison of the scales of acoustic oscillations in galaxies and the CMB allows the 
two scales to be matched, pinning the relative scales of galaxies today with those in the CMB at redshift 
$z \sim 1100$. 

Figure 10.4 Monopole, dipole, and quadrupole power spectra of galaxies from the extended Baryon Oscillation 
Spectroscopic Survey (BOSS) of the Sloan Digital Sky Survey IV (SDSS-IV) (Gil-Marin et al., 2020). The analysis includes 
377,458 luminous red galaxies covering approximately 18% of the sky over redshifts z = 0.6-1.0. Filled and unfilled 
symbols are measurements from the north and south galactic caps respectively. The curves are flat $\Lambda$CDM model 
spectra calculated from simulations analyzed using the same footprint as the survey. The right panel shows spectra 
with a smooth component divided out to bring out the baryon acoustic oscillations (BAO). 


Major plans are underway to measure galaxy clustering as a function of redshift, with the primary aim to 
determine whether the evolution of dark energy is consistent with that of a cosmological constant. Such a 
measurement cannot be done with CMB observations, since the CMB offers only a snapshot of the Universe 
at high redshift. 


10.1.5 Other supporting evidence 


• The observed abundances of light elements H, D, $^3$He, $^4$He, and Li are consistent with the predictions of 
big bang nucleosynthesis (BBN) provided that the baryonic density is $\Omega_b \approx 0.04$, in good agreement with 
measurements from the CMB. 

• The ages of the oldest stars, in globular clusters, agree with the age of the Universe with dark energy, but 
are older than the Universe without dark energy. 

• The existence of dark matter, possibly non-baryonic, is supported by ubiquitous evidence for unseen dark 
matter, deduced from sizes and velocities (or in the case of gravitational lensing, the gravitational potential) 
of various objects: 

  - The Local Group of galaxies; 
  - Rotation curves of spiral galaxies; 
  - The temperature and distribution of x-ray gas in elliptical galaxies, and in clusters of galaxies; 
  - Gravitational lensing by clusters of galaxies. 

• The abundance of galaxy clusters as a function of redshift is consistent with a matter density $\Omega_m \approx 0.3$, but 
not much higher. A low matter density slows the gravitational clustering of galaxies, implying relatively 
more and richer clusters at high redshift than at the present, as observed. 

• The Bullet cluster is a rare example that supports the notion that the dark matter is non-baryonic. In the 
Bullet cluster, two clusters recently passed through each other. The baryonic matter, as measured from 
x-ray emission of hot gas, appears displaced from the dark matter, as measured from weak gravitational 
lensing. 


10.2 Cosmological Principle 


The cosmological principle states that the Universe at large is 

• homogeneous (has spatial translation symmetry), 

• isotropic (has spatial rotation symmetry). 

The primary evidence for this is the uniformity of the temperature of the CMB, which, after subtraction of 
the dipole produced by the motion of the solar system through the CMB, is constant over the sky to a few 
parts in $10^5$. Confirming evidence is the statistical uniformity of the distribution of galaxies over large scales. 

The cosmological principle allows that the Universe evolves in time, as observations surely indicate: the 
Universe is expanding, galaxies, quasars, and galaxy clusters evolve with redshift, and the temperature of 
the CMB has undoubtedly decreased since recombination. 


10.3 Friedmann-Lemaître-Robertson-Walker metric 


Universes satisfying the cosmological principle are described by the Friedmann-Lemaître-Robertson-Walker 
(FLRW) metric, equation (10.28) below, discovered independently by Friedmann (1922; 1924) and Lemaître 
(1927) (English translation in Lemaître 1931). The FLRW metric was shown to be the unique metric for a 
homogeneous, isotropic universe by Robertson (1935; 1936; 1936) and Walker (1937). The metric, and the 
associated Einstein equations, which are known as the Friedmann equations, are set forward in the next 
several sections, §§10.4-10.9. 


10.4 Spatial part of the FLRW metric: informal approach 


The cosmological principle implies that 


$$ \mbox{the spatial part of the FLRW metric is a 3D hypersphere} . \tag{10.3} $$

In this context the term hypersphere is to be construed as including not only cases of positive curvature, 
which have finite positive radius of curvature, but also cases of zero and negative curvature, which have 
infinite and imaginary radius of curvature respectively. 

Figure 10.5 shows an embedding diagram of a 3D hypersphere in 4D Euclidean space. The horizontal 
directions in the diagram represent the normal 3 spatial x, y, z dimensions, with one dimension z suppressed, 
while the vertical dimension represents the 4th spatial dimension w. The 3D hypersphere is a set of points 
{x, y, z, w} satisfying 

$$ (x^2 + y^2 + z^2 + w^2)^{1/2} = R = \mbox{constant} . \tag{10.4} $$

An observer is sitting at the north pole of the diagram, at {0, 0, 0, 1}. A 2D sphere (which forms a 1D circle 
in the embedding diagram of Figure 10.5) at fixed distance surrounding the observer has geodesic distance 
$r_\parallel$ defined by 

$$ r_\parallel \equiv \mbox{proper distance to sphere measured along a radial geodesic} , \tag{10.5} $$

and circumferential radius r defined by 

$$ r \equiv (x^2 + y^2 + z^2)^{1/2} , \tag{10.6} $$

which has the property that the proper circumference of the sphere is $2\pi r$. In terms of $r_\parallel$ and r, the spatial 
metric is 

$$ dl^2 = dr_\parallel^2 + r^2 do^2 , \tag{10.7} $$

where $do^2 \equiv d\theta^2 + \sin^2\!\theta \, d\phi^2$ is the metric of a unit 2-sphere. 

Figure 10.5 Embedding diagram of the FLRW geometry. 

Introduce the angle $\chi$ illustrated in the diagram. Evidently 

$$ r_\parallel = R \chi , \quad r = R \sin\chi . \tag{10.8} $$

In terms of the angle $\chi$, the spatial metric is 

$$ dl^2 = R^2 \left( d\chi^2 + \sin^2\!\chi \, do^2 \right) , \tag{10.9} $$


which is one version of the spatial FLRW metric. The metric resembles the metric of a 2-sphere of radius R, 
which is not surprising since the same construction, with Figure 10.5 interpreted as the embedding diagram 
of a 2D sphere in 3D, yields the metric of a 2-sphere. Indeed, the construction iterates to give the metric of 
an N-dimensional sphere of arbitrarily many dimensions N. 

Instead of the angle $\chi$, the metric can be expressed in terms of the circumferential radius r. It follows from 
equations (10.8) that 

$$ r_\parallel = R \arcsin(r/R) , \tag{10.10} $$

whence 

$$ dr_\parallel = \frac{dr}{\sqrt{1 - r^2/R^2}} = \frac{dr}{\sqrt{1 - K r^2}} , \tag{10.11} $$

where K is the curvature 

$$ K \equiv \frac{1}{R^2} . \tag{10.12} $$

In terms of r, the spatial FLRW metric is then 

$$ dl^2 = \frac{dr^2}{1 - K r^2} + r^2 do^2 . \tag{10.13} $$
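
The step from the geodesic distance to the radial factor $1/\sqrt{1 - K r^2}$ can be checked numerically. The following sketch, for the positive-curvature case only, compares a finite-difference derivative of $r_\parallel = R \arcsin(r/R)$ with the closed form; the function names and the sample values are illustrative assumptions.

```python
import math

def geodesic_distance(r, radius):
    """Geodesic distance r_par = R asin(r/R) on a hypersphere of radius R."""
    return radius * math.asin(r / radius)

def radial_metric_factor(r, curvature):
    """Closed form dr_par/dr = 1/sqrt(1 - K r^2), with K = 1/R^2."""
    return 1.0 / math.sqrt(1.0 - curvature * r * r)

def finite_difference_slope(r, radius, step=1.0e-6):
    """Central-difference estimate of dr_par/dr."""
    upper = geodesic_distance(r + step, radius)
    lower = geodesic_distance(r - step, radius)
    return (upper - lower) / (2.0 * step)

if __name__ == "__main__":
    radius, r = 2.0, 1.2
    print(finite_difference_slope(r, radius))
    print(radial_metric_factor(r, 1.0 / radius ** 2))
```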


The embedding diagram Figure 10.5 is a nice prop for the imagination, but it is not the whole story. The 
curvature K in the metric (10.13) may be not only positive, corresponding to real finite radius R, but also 
zero or negative, corresponding to infinite or imaginary radius R. The possibilities are called closed, flat, and 
open: 

$$ K \ \left\{ \begin{array}{lll} > 0 & \mbox{closed} & R \mbox{ real} , \\ = 0 & \mbox{flat} & R \rightarrow \infty , \\ < 0 & \mbox{open} & R \mbox{ imaginary} . \end{array} \right. \tag{10.14} $$


10.5 Comoving coordinates 


The metric (10.13) is valid at any single instant of cosmic time t. As the Universe expands, the 3D spa- 
tial hypersphere (whether closed, flat, or open) expands. In cosmology it is highly advantageous to work in 
comoving coordinates that expand with the Universe. Why? Firstly, it is helpful conceptually and math- 
ematically to think of the Universe as at rest in comoving coordinates. Secondly, linear perturbations, such 
as those in the CMB, have wavelengths that expand with the Universe, and are therefore fixed in comoving 
coordinates. 

In practice, cosmologists introduce the cosmic scale factor a(t) 

$$ a(t) \equiv \mbox{measure of the size of the Universe, expanding with the Universe} , \tag{10.15} $$

which is proportional to but not necessarily equal to the radius R of the Universe. The cosmic scale factor 
a can be normalized in any arbitrary way. The most common convention adopted by cosmologists is to 
normalize it to unity at the present time, 

$$ a_0 = 1 , \tag{10.16} $$

where the 0 subscript conventionally signifies the present time. 

Comoving geodesic and circumferential radial distances $x_\parallel$ and x are defined in terms of the proper geodesic 
and circumferential radial distances $r_\parallel$ and r by 

$$ a x_\parallel \equiv r_\parallel , \quad a x \equiv r . \tag{10.17} $$

Objects expanding with the Universe remain at fixed comoving positions $x_\parallel$ and x. In terms of the comoving 
circumferential radius x, the spatial FLRW metric is 

$$ dl^2 = a^2 \left( \frac{dx^2}{1 - \kappa x^2} + x^2 do^2 \right) , \tag{10.18} $$

where the curvature constant $\kappa$, a constant in time and space, is related to the curvature K, equation (10.12), 
by 

$$ \kappa \equiv a^2 K . \tag{10.19} $$

Alternatively, in terms of the geodesic comoving radius $x_\parallel$, the spatial FLRW metric is 

$$ dl^2 = a^2 \left( dx_\parallel^2 + x^2 do^2 \right) , \tag{10.20} $$

where 

$$ x = \left\{ \begin{array}{lll} \sin(\kappa^{1/2} x_\parallel) / \kappa^{1/2} & \kappa > 0 & \mbox{closed} , \\ x_\parallel & \kappa = 0 & \mbox{flat} , \\ \sinh(|\kappa|^{1/2} x_\parallel) / |\kappa|^{1/2} & \kappa < 0 & \mbox{open} . \end{array} \right. \tag{10.21} $$

Actually it is fine to use just the top expression of equations (10.21), which is mathematically equivalent to 
the bottom two expressions when $\kappa = 0$ or $\kappa < 0$ (because $\sin(ix)/i = \sinh(x)$). 
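
The equivalence of the three cases can be seen directly in code by evaluating the top expression with a complex square root. This is a minimal sketch; the function name is an illustrative choice, and the flat case is handled separately because the top expression is then a limit rather than a value.

```python
import cmath
import math

def comoving_circumferential_radius(x_parallel, kappa):
    """Circumferential comoving radius x from geodesic comoving radius x_parallel.

    Uses the single expression sin(sqrt(kappa) x_par)/sqrt(kappa), with a
    complex square root so that kappa < 0 reproduces the sinh form.
    """
    if kappa == 0:
        return x_parallel  # limit of the expression as kappa -> 0
    root = cmath.sqrt(kappa)
    return (cmath.sin(root * x_parallel) / root).real

if __name__ == "__main__":
    for kappa in (4.0, 0.0, -4.0):
        print(kappa, comoving_circumferential_radius(0.5, kappa))
```

For negative $\kappa$ the square root is imaginary and the ratio comes out real, matching the sinh form.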

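For some purposes it is convenient to normalize the cosmic scale factor a so that $\kappa$ = 1, 0, or −1. In this 
case the spatial FLRW metric may be written 

$$ dl^2 = a^2 \left( d\chi^2 + x^2 do^2 \right) , \tag{10.22} $$

where 

$$ x = \left\{ \begin{array}{lll} \sin\chi & \kappa = 1 & \mbox{closed} , \\ \chi & \kappa = 0 & \mbox{flat} , \\ \sinh\chi & \kappa = -1 & \mbox{open} . \end{array} \right. \tag{10.23} $$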


10.6 Spatial part of the FLRW metric: more formal approach 


A more formal approach to the derivation of the spatial FLRW metric from the cosmological principle starts 
with the proposition that the spatial components $G_{ab}$ of the Einstein tensor at fixed scale factor a (all time 
derivatives of a set to zero) should be proportional to the metric tensor 

$$ G_{ab} = - K g_{ab} \quad (a, b = 1, 2, 3) . \tag{10.24} $$

Without loss of generality, the spatial metric can be taken to be of the form 

$$ dl^2 = f(r) \, dr^2 + r^2 do^2 . \tag{10.25} $$

Imposing the condition (10.24) on the metric (10.25) recovers the spatial FLRW metric (10.13). 


Exercise 10.1. Isotropic (Poincaré) form of the FLRW metric. By a suitable transformation of the 
comoving radial coordinate x, bring the spatial FLRW metric (10.18) to the “isotropic” form 

$$ dl^2 = \frac{4 a^2}{(1 + \kappa X^2)^2} \left( dX^2 + X^2 do^2 \right) . \tag{10.26} $$

What is the relation between X and x? 

For an open geometry, $\kappa < 0$, the isotropic line-element (10.26) is also called the Poincaré ball, or in 2D the 
Poincaré disk, Figure 10.6. By construction, the isotropic line-element (10.26) is conformally flat, meaning 
that it equals the Euclidean line-element multiplied by a position-dependent conformal factor. Conformal 
transformations of a line-element preserve angles. 

Solution. 

$$ X = \frac{1}{\kappa^{1/2}} \tan\!\left( \frac{\kappa^{1/2} x_\parallel}{2} \right) , \quad x = \frac{2 X}{1 + \kappa X^2} . \tag{10.27} $$

For an open geometry with $\kappa = -1$, X goes from 0 to 1 as x goes from 0 to $\infty$. 

Figure 10.6 The Poincaré disk depicts the geometry of an open FLRW universe in isotropic coordinates. The lines are 
lines of latitude and longitude relative to a “pole” chosen here to be displaced from the centre of the disk. In isotropic 
coordinates, geodesics correspond to circles that intersect the boundary of the disk at right angles, such as the lines 
of constant longitude in this diagram. The lines of latitude remain unchanged under rotations about the pole. 


10.7 FLRW metric 


The full Friedmann-Lemaître-Robertson-Walker spacetime metric is 

$$ ds^2 = - dt^2 + a(t)^2 \left( \frac{dx^2}{1 - \kappa x^2} + x^2 do^2 \right) , \tag{10.28} $$

where t is cosmic time, which is the proper time experienced by comoving observers, who remain at rest 
in comoving coordinates $dx = d\theta = d\phi = 0$. Any of the alternative versions of the comoving spatial FLRW 
metric, equations (10.18), (10.20), (10.22), or (10.26), may be used as the spatial part of the FLRW spacetime 
metric (10.28). 


10.8 Einstein equations for FLRW metric 


The Einstein equations for the FLRW metric (10.28) are 

$$ - G^t_t = 3 \left( \frac{\dot a^2}{a^2} + \frac{\kappa}{a^2} \right) = 8\pi G \rho , \tag{10.29a} $$

$$ G^x_x = G^\theta_\theta = G^\phi_\phi = - \frac{2 \ddot a}{a} - \frac{\dot a^2}{a^2} - \frac{\kappa}{a^2} = 8\pi G p , \tag{10.29b} $$

where overdots represent differentiation with respect to cosmic time t, so that for example $\dot a \equiv da/dt$. Note 
the trick of one index up, one down, to remove, modulo signs, the distorting effect of the metric on the 
Einstein tensor. The Einstein equations (10.29) rearrange to give Friedmann’s equations 

$$ \frac{\dot a^2}{a^2} = \frac{8\pi G \rho}{3} - \frac{\kappa}{a^2} , \tag{10.30a} $$

$$ \frac{\ddot a}{a} = - \frac{4\pi G}{3} (\rho + 3 p) . \tag{10.30b} $$

Friedmann’s two equations (10.30) are fundamental to cosmology. The first one relates the curvature $\kappa$ of 
the Universe to the expansion rate $\dot a/a$ and the density $\rho$. The second one relates the acceleration $\ddot a/a$ to the 
density $\rho$ plus 3 times the pressure p. 
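
The rearrangement of the Einstein equations into Friedmann's equations takes one substitution. The following lines sketch it, restating the two Einstein equations in units with c = 1; the intermediate step is supplied here and is not taken from the original text.

```latex
% First Einstein equation (10.29a), divided by 3, is the first Friedmann equation:
\frac{\dot a^2}{a^2} + \frac{\kappa}{a^2} = \frac{8\pi G \rho}{3} .
% Substitute this combination into the second Einstein equation (10.29b),
%   -\frac{2\ddot a}{a} - \left( \frac{\dot a^2}{a^2} + \frac{\kappa}{a^2} \right) = 8\pi G p ,
% to eliminate the curvature term:
- \frac{2 \ddot a}{a} - \frac{8\pi G \rho}{3} = 8\pi G p
\quad \Longrightarrow \quad
\frac{\ddot a}{a} = - \frac{4\pi G}{3} \left( \rho + 3 p \right) .
```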


10.9 Newtonian “derivation” of Friedmann equations 


The Friedmann equations can be reproduced with a heuristic Newtonian argument. 


10.9.1 Energy equation 


Model a piece of the Universe as a ball of radius a with uniform density $\rho$, hence of mass $M = \frac{4}{3} \pi \rho a^3$. 
Consider a small mass m attracted by this ball. Conservation of the kinetic plus potential energy of the 
small mass m implies 

$$ \frac{1}{2} m \dot a^2 - \frac{G M m}{a} = - \frac{\kappa m c^2}{2} , \tag{10.31} $$

where the quantity on the right is some constant whose value is not determined by this Newtonian treatment, 
but which GR implies is as given. The energy equation (10.31) rearranges to 

$$ \frac{\dot a^2}{a^2} = \frac{8\pi G \rho}{3} - \frac{\kappa c^2}{a^2} , \tag{10.32} $$

which reproduces the first Friedmann equation (10.30a), here with the factor $c^2$ written explicitly. 

Figure 10.7 Newtonian picture in which the Universe is modeled as a uniform density sphere of radius a and mass M 
that gravitationally attracts a test mass m. 


10.9.2 First law of thermodynamics 
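For adiabatic expansion, the first law of thermodynamics is 

$$ dE + p \, dV = 0 . \tag{10.33} $$

With $E = \rho V$ and $V = \frac{4}{3} \pi a^3$, the first law (10.33) becomes 

$$ d(\rho a^3) + p \, d(a^3) = 0 , \tag{10.34} $$

or, with the derivative taken with respect to cosmic time t, 

$$ \dot\rho + 3 (\rho + p) \frac{\dot a}{a} = 0 . \tag{10.35} $$

Differentiating the first Friedmann equation in the form 

$$ \dot a^2 = \frac{8\pi G \rho a^2}{3} - \kappa c^2 \tag{10.36} $$

gives 

$$ 2 \dot a \ddot a = \frac{8\pi G}{3} \left( \dot\rho a^2 + 2 \rho a \dot a \right) , \tag{10.37} $$

and substituting $\dot\rho$ from the first law (10.35) reduces this to 

$$ 2 \dot a \ddot a = \frac{8\pi G}{3} \, a \dot a \left( - \rho - 3 p \right) . \tag{10.38} $$

Hence 

$$ \frac{\ddot a}{a} = - \frac{4\pi G}{3} (\rho + 3 p) , \tag{10.39} $$

which reproduces the second Friedmann equation. 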
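
The consistency of the two Friedmann equations with the first law can also be checked numerically. The sketch below assumes a spatially flat universe filled with a single species of constant $w = p/\rho > -1$, for which the scale factor is the power law $a \propto t^{2/[3(1+w)]}$ (a standard solution, supplied here as an assumption). It works in units where $8\pi G/3 = 1$ and compares a finite-difference $\ddot a/a$ with $-\frac{1}{2}(\rho + 3p)$.

```python
def power_law_scale_factor(t, w):
    """Flat single-species solution a(t) = t^(2/(3(1+w))), valid for w > -1."""
    return t ** (2.0 / (3.0 * (1.0 + w)))

def friedmann_residual(w, t=1.0, dt=1.0e-4):
    """Difference between the two sides of the second Friedmann equation.

    Units: 8 pi G / 3 = 1, so the first Friedmann equation for a flat
    universe reads rho = (a_dot/a)^2 and the second reads
    a_ddot/a = -(rho + 3 p)/2.
    """
    a_minus = power_law_scale_factor(t - dt, w)
    a_mid = power_law_scale_factor(t, w)
    a_plus = power_law_scale_factor(t + dt, w)
    a_dot = (a_plus - a_minus) / (2.0 * dt)            # central first derivative
    a_ddot = (a_plus - 2.0 * a_mid + a_minus) / dt**2  # central second derivative
    rho = (a_dot / a_mid) ** 2   # first Friedmann equation, flat case
    p = w * rho                  # equation of state
    return a_ddot / a_mid + 0.5 * (rho + 3.0 * p)

if __name__ == "__main__":
    for name, w in (("matter", 0.0), ("radiation", 1.0 / 3.0)):
        print(name, friedmann_residual(w))
```

The residual vanishes to within the accuracy of the finite differences for both matter and radiation.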


10.9.3 Comment on the Newtonian derivation 


The above Newtonian derivation of Friedmann’s equations is only heuristic. A different result could have 
been obtained if different assumptions had been made. If for example the Newtonian gravitational force law 
$m \ddot a = - G M m / a^2$ were taken as correct, then it would follow that $\ddot a / a = - \frac{4}{3} \pi G \rho$, which is missing the 
all-important 3p contribution (without which there would be no inflation or dark energy) to Friedmann’s 
second equation. 

It is notable that the first law of thermodynamics is built into the Friedmann equations. This implies 
that entropy is conserved in FLRW Universes (but see Concept question 30.5). This remains true even when 
the mix of particles changes, as happens for example during the epoch of electron-positron annihilation, or 
during big bang nucleosynthesis. How then does entropy increase in the real Universe? Through fluctuations 
away from the perfect homogeneity and isotropy assumed by the FLRW metric. 


10.10 Hubble parameter 


The Hubble parameter H(t) is defined by 

$$ H \equiv \frac{\dot a}{a} . \tag{10.40} $$


The Hubble parameter H varies in cosmic time t, but is constant in space at fixed cosmic time t. 

The value of the Hubble parameter today is called the Hubble constant $H_0$ (the subscript 0 signifies the 
present time). The Hubble constant measured from Cepheid variable stars and Type Ia supernovae is (Riess 
et al., 2011; Riess et al., 2018) 

$$ H_0 = 73.5 \pm 1.7 \ {\rm km\,s^{-1}\,Mpc^{-1}} . \tag{10.41} $$


The observed CMB power spectrum, Figure 10.3, provides an accurate measurement of the angular 
location of the first peak in the power spectrum, which determines the angular size of the sound horizon at 
recombination, Chapter 32. This cosmological yardstick translates into a measurement of the Hubble 
parameter $H_0$, but only if a cosmological model is assumed. In particular, the angular location of the peak 
depends on the spatial curvature. The combination of CMB data with other data, notably Baryon Acoustic 
Oscillations in galaxy clustering, Figure 10.4, and the Hubble diagram of Type Ia supernovae, Figure 10.1, 
points consistently to a spatially flat cosmological model. If the Universe is taken to be spatially flat, then 
CMB data from the Planck satellite yield (Aghanim et al., 2018) 

$$ H_0 = 67.4 \pm 0.5 \ {\rm km\,s^{-1}\,Mpc^{-1}} . \tag{10.42} $$

The Cepheid and CMB measurements (10.41) and (10.42) of $H_0$ lie outside each other’s error bars. One can 
either be impressed that two completely independent measurements of $H_0$ yield almost the same result, or 
be worried by the disagreement. I incline to the former view, since these kinds of measurements tend to be 
beset with systematic uncertainties that can be difficult to get under control. 
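
One way to put a number on the disagreement is to express the difference of the two values in units of their combined uncertainty. The sketch below assumes that the two errors are Gaussian and independent, and so add in quadrature; that is a simplification and not a statement about the actual systematics. The function name is illustrative.

```python
import math

def tension_in_sigma(value_a, error_a, value_b, error_b):
    """Difference of two measurements in units of their combined 1-sigma
    uncertainty, assuming independent Gaussian errors."""
    return abs(value_a - value_b) / math.hypot(error_a, error_b)

if __name__ == "__main__":
    # Cepheid/supernova value (10.41) versus Planck value (10.42), km/s/Mpc
    print(tension_in_sigma(73.5, 1.7, 67.4, 0.5))
```

Under that assumption the two values of equations (10.41) and (10.42) differ by between 3 and 4 standard deviations.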

The distance d to an object that is receding with the expansion of the universe is proportional to the cosmic 
scale factor, $d \propto a$, and its recession velocity v is consequently proportional to $\dot a$. The result is Hubble’s 
law relating the recession velocity v and distance d of distant objects 

$$ v = H_0 d . \tag{10.43} $$

Since it takes light time to travel from a distant object, and the Hubble parameter varies in time, the linear 
relation (10.43) breaks down at cosmological distances. 

We, in the Milky Way, reside in an overdense region of the Universe that has collapsed out of the general 
Hubble expansion of the Universe. The local overdense region of the Universe that has just turned around 
from the general expansion and is beginning to collapse for the first time is called the Local Group of 
galaxies. The Local Group consists of order 100 galaxies, mostly dwarf and irregular galaxies. It contains two 
major spiral galaxies, Andromeda (M31) and the Milky Way, and one mid-sized spiral galaxy Triangulum 
(M33). The Local Group is about 1 Mpc in radius. 

Because of the ubiquity of the Hubble constant in cosmological studies, cosmologists often parameterize 
it by the quantity h defined by 

$$ h \equiv \frac{H_0}{100 \ {\rm km\,s^{-1}\,Mpc^{-1}}} . \tag{10.44} $$

The reciprocal of the Hubble constant gives an approximate estimate of the age of the Universe (cf. 
Exercise 10.6), 

$$ \frac{1}{H_0} = 9.778 \, h^{-1} \ {\rm Gyr} = 14.0 \, h_{0.70}^{-1} \ {\rm Gyr} . \tag{10.45} $$
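
The numerical coefficient in the Hubble time is a unit conversion, which the following sketch reproduces. The megaparsec and the Julian year are standard values supplied here as assumptions, and the function name is illustrative.

```python
MPC_IN_KM = 3.0856775814913673e19   # one megaparsec in kilometres
GYR_IN_S = 3.15576e16               # 1e9 Julian years (365.25 days) in seconds

def hubble_time_gyr(h):
    """Reciprocal 1/H_0 in Gyr, for H_0 = 100 h km/s/Mpc."""
    h0_per_second = 100.0 * h / MPC_IN_KM
    return 1.0 / h0_per_second / GYR_IN_S

if __name__ == "__main__":
    for h in (1.0, 0.70, 0.674, 0.735):
        print(h, hubble_time_gyr(h))
```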
0 


10.11 Critical density 


The critical density $\rho_{\rm crit}$ is defined to be the density required for the Universe to be flat, $\kappa = 0$. According 
to the first of Friedmann equations (10.30), this sets 

$$ \rho_{\rm crit} \equiv \frac{3 H^2}{8\pi G} . \tag{10.46} $$

The critical density $\rho_{\rm crit}$, like the Hubble parameter H, evolves with time. 


10.12 Omega 


Cosmologists designate the ratio of the actual density $\rho$ of the Universe to the critical density $\rho_{\rm crit}$ by the 
fateful letter $\Omega$, the final letter of the Greek alphabet, 

$$ \Omega \equiv \frac{\rho}{\rho_{\rm crit}} . \tag{10.47} $$

With no subscript, $\Omega$ denotes the total mass-energy density in all forms. A subscript x on $\Omega_x$ denotes 
mass-energy density of type x. 

The curvature density $\rho_k$, which is not really a form of mass-energy but which it is sometimes convenient 
to treat as though it were, is defined by 

$$ \rho_k \equiv - \frac{3 \kappa c^2}{8\pi G a^2} , \tag{10.48} $$

and correspondingly 

$$ \Omega_k \equiv \frac{\rho_k}{\rho_{\rm crit}} = - \frac{\kappa c^2}{a^2 H^2} . \tag{10.49} $$

If the cosmic scale factor is normalized to unity at the present time, equation (10.16), then the relation 
between $\Omega_k$ and the curvature constant $\kappa$ is $\Omega_k = - \kappa c^2 / H_0^2$. According to the first of Friedmann’s 
equations (10.30), the curvature density $\Omega_k$ satisfies 

$$ \Omega_k = 1 - \Omega . \tag{10.50} $$

Note that $\Omega_k$ has opposite sign from $\kappa$, so a closed universe has negative $\Omega_k$. 

Table 10.1: Cosmic inventory 

Species                                 Symbol             WMAP                     Planck 
                                                           Hinshaw et al. (2012)    Aghanim et al. (2018) 
Dark energy ($\Lambda$)                 $\Omega_\Lambda$   0.72 ± 0.01              0.685 ± 0.007 
Non-baryonic cold dark matter (CDM)     $\Omega_c$         0.24 ± 0.01              0.261 ± 0.002 
Baryonic matter                         $\Omega_b$         0.047 ± 0.002            0.0490 ± 0.0005 
Neutrinos                               $\Omega_\nu$       < 0.02                   < 0.004 
Photons (CMB)                           $\Omega_\gamma$    $5 \times 10^{-5}$       $5 \times 10^{-5}$ 
Total                                   $\Omega$           1.003 ± 0.004            0.999 ± 0.002 
Curvature                               $\Omega_k$         −0.003 ± 0.004           0.001 ± 0.002 

Table 10.1 gives measurements of $\Omega$ in various species, as reported by Hinshaw et al. (2012) from the final 
analysis of the CMB power spectrum from WMAP, and by Aghanim et al. (2018) from the final analysis of 
the CMB power spectrum from Planck. Both sets of analyses incorporate measurements from a variety of 
other data, including CMB data at smaller scales, Figure 10.3, supernova data, Figure 10.1, galaxy clustering 
(Baryonic Acoustic Oscillation, or BAO) data, Figure 10.4, and local measurements of the Hubble constant 
$H_0$ (Riess et al., 2011; Riess et al., 2018). It is largely the CMB data that enable cosmological parameters 
to be measured to the level of precision given in the Table. However, the CMB data by themselves constrain 
tightly only a combination of the Hubble parameter $H_0$ and the curvature $\Omega_k$, as illustrated in Figure 26 
of Aghanim et al. (2018). Other data, in particular BAO and the supernova Hubble diagram, resolve this 
uncertainty, pointing to a flat Universe, $\Omega_k = 0$. Importantly, the various data are consistent with each other, 
inspiring confidence in the correctness of the Standard Model. The neutrino limit implies an upper limit to 
the sum of the masses of all neutrino species (Aghanim et al., 2018), 

$$ \sum m_\nu < 0.12 \ {\rm eV} . \tag{10.51} $$


Exercise 10.2. Omega in photons. Most of the energy density in electromagnetic radiation today is in 
CMB photons. Calculate $\Omega_\gamma$ in CMB photons. Note that photons may not be the only relativistic species 
today. Neutrinos with masses smaller than about $10^{-4}$ eV would still be relativistic at the present time, 
Exercise 10.20. 

Solution. CMB photons have a blackbody spectrum at temperature $T_0 = 2.725$ K, so their density can be 
calculated from the blackbody formula. The present day ratio $\Omega_\gamma$ of the mass-energy density $\rho_\gamma$ of CMB 
photons to the critical density $\rho_{\rm crit}$ is 

$$ \Omega_\gamma \equiv \frac{\rho_\gamma}{\rho_{\rm crit}} = \frac{8\pi G \rho_\gamma}{3 H_0^2} = \frac{8 \pi^3 G (k T_0)^4}{45 c^5 \hbar^3 H_0^2} = 2.471 \times 10^{-5} \, h^{-2} \, T_{2.725\,{\rm K}}^4 = 5.0 \times 10^{-5} \, h_{0.70}^{-2} \, T_{2.725\,{\rm K}}^4 . \tag{10.52} $$
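
The numerical coefficient in the solution can be reproduced from physical constants. The sketch below uses the blackbody mass-energy density $\rho_\gamma = \pi^2 (k T)^4 / (15 c^5 \hbar^3)$ and the critical density $3 H_0^2 / (8\pi G)$; the constant values are standard SI values supplied here as assumptions, and the function names are illustrative.

```python
import math

# Physical constants in SI units (standard values, supplied as assumptions)
G_NEWTON = 6.67430e-11        # m^3 kg^-1 s^-2
SPEED_OF_LIGHT = 299792458.0  # m s^-1
HBAR = 1.054571817e-34        # J s
K_BOLTZMANN = 1.380649e-23    # J K^-1
MPC_IN_M = 3.0856775814913673e22

def critical_density(h):
    """Critical density 3 H_0^2 / (8 pi G) in kg m^-3, for H_0 = 100 h km/s/Mpc."""
    h0 = 100.0e3 * h / MPC_IN_M  # H_0 in s^-1
    return 3.0 * h0 ** 2 / (8.0 * math.pi * G_NEWTON)

def photon_mass_density(temperature_k):
    """Mass-energy density of blackbody photons in kg m^-3."""
    kt = K_BOLTZMANN * temperature_k
    return math.pi ** 2 * kt ** 4 / (15.0 * SPEED_OF_LIGHT ** 5 * HBAR ** 3)

def omega_photons(h, temperature_k=2.725):
    """Ratio of the CMB photon density to the critical density."""
    return photon_mass_density(temperature_k) / critical_density(h)

if __name__ == "__main__":
    print(omega_photons(1.0))   # coefficient multiplying h^-2
    print(omega_photons(0.70))
```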


10.13 Types of mass-energy 


The energy-momentum tensor $T_{\mu\nu}$ of a FLRW Universe is necessarily homogeneous and isotropic, by 
assumption of the cosmological principle, taking the form (note yet again the trick of one index up and one 
down to remove the distorting effect of the metric) 

$$ T^\mu_\nu = \left( \begin{array}{cccc} T^t_t & 0 & 0 & 0 \\ 0 & T^x_x & 0 & 0 \\ 0 & 0 & T^\theta_\theta & 0 \\ 0 & 0 & 0 & T^\phi_\phi \end{array} \right) = \left( \begin{array}{cccc} -\rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{array} \right) . \tag{10.53} $$

Table 10.2 gives equations of state $p/\rho$ for generic species of mass-energy, along with $(\rho + 3p)/\rho$, which 
determines the gravitational attraction (deceleration) per unit energy, and how the mass-energy varies with 
cosmic scale factor, $\rho \propto a^n$, Exercise 10.3. 

As commented in §10.9.2, the first law of thermodynamics for adiabatic expansion is built into Friedmann’s 
equations. In fact the law represents covariant conservation of energy-momentum for the system as a whole 

$$ D_\mu T^{\mu\nu} = 0 . \tag{10.54} $$


As long as species do not convert into each other (for example, no annihilation), covariant energy-momentum 
conservation holds individually for each species, so the first law applies to each species individually, deter- 
mining how its energy density p varies with cosmic scale factor a. Figure 10.8 illustrates how the energy 
densities p of various species evolve as a function of scale factor a. 
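
For a species with constant equation of state $w = p/\rho$, substituting $p = w \rho$ into the first law $d(\rho a^3) + p \, d(a^3) = 0$ gives $\rho \propto a^{-3(1+w)}$, while the combination that sources deceleration is $(\rho + 3p)/\rho = 1 + 3w$. The sketch below tabulates both for the four generic species; the function names are illustrative, and exact fractions are used so that the exponents come out as integers.

```python
from fractions import Fraction

def density_exponent(w):
    """Exponent n in rho ∝ a^n for constant equation of state w = p/rho."""
    return -3 * (1 + w)

def attraction_per_unit_energy(w):
    """(rho + 3p)/rho = 1 + 3w: positive is attractive, negative repulsive."""
    return 1 + 3 * w

# Equations of state of the generic species
SPECIES = {
    "radiation": Fraction(1, 3),
    "matter": Fraction(0),
    "curvature": Fraction(-1, 3),
    "vacuum": Fraction(-1),
}

if __name__ == "__main__":
    for name, w in SPECIES.items():
        print(name, w, attraction_per_unit_energy(w), density_exponent(w))
```

The dividing line between attractive and repulsive mass-energy is $w = -1/3$, the effective equation of state of curvature.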

Vacuum energy is equivalent to a cosmological constant. Einstein originally introduced the cosmological 
constant $\Lambda$ as a modification to the left hand side of his equations, 

$$ G_{\mu\nu} + \Lambda g_{\mu\nu} = 8\pi G T_{\mu\nu} . \tag{10.55} $$

The cosmological constant term can be taken over to the right hand side and reinterpreted as vacuum energy 
$T_{\mu\nu} = - \rho_\Lambda g_{\mu\nu}$ with energy density $\rho_\Lambda$ satisfying 

$$ \Lambda = 8\pi G \rho_\Lambda . \tag{10.56} $$

Table 10.2: Properties of universes dominated by various species 

Species       $p/\rho$    $(\rho + 3p)/\rho$    $\rho \propto$ 
Radiation     1/3         2                     $a^{-4}$ 
Matter        0           1                     $a^{-3}$ 
Curvature     “−1/3”      “0”                   $a^{-2}$ 
Vacuum        −1          −2                    $a^0$ 

Figure 10.8 Behaviour of the mass-energy density $\rho$ of various species as a function of cosmic time t. 


Exercise 10.3. Mass-energy in a FLRW Universe. 

1. First law. The first law of thermodynamics for adiabatic expansion is built into Friedmann’s equations 
(= Einstein’s equations for the FLRW metric): 

$$ d(\rho a^3) + p \, d(a^3) = 0 . \tag{10.57} $$

How does the density $\rho$ evolve with cosmic scale factor for a species with equation of state $p/\rho = w$ with 
constant w? You should get an answer of the form 

$$ \rho \propto a^n . \tag{10.58} $$

2. Attractive or repulsive? For what equation of state w is the mass-energy attractive or repulsive? 
Consider in particular the cases of “matter,” “radiation,” “curvature,” and “vacuum” energy. 

Concept question 10.4. Mass of a ball of photons or of vacuum. What is the gravitational mass of 
a homogeneous, isotropic, spherical, ball of photons embedded in empty space, as measured by an observer 
outside the ball? Assume that the boundary of the ball is free to expand or contract. What if the ball of 
photons is bounded by a stationary reflecting spherical wall? What if the ball is a ball of vacuum energy 
instead of photons? Answer. The right way to address this question is to think about what happens at the 
boundary between the ball and empty space. See §20.17. 


10.14 Redshifting 


The spatial translation symmetry of the FLRW metric implies conservation of generalized momentum. As you 
will show in Exercise 10.5, a particle that moves along a geodesic in the radial direction, so that $d\theta = d\phi = 0$, 
has 4-momentum $p^\nu$ satisfying 

$$ p_{x_\parallel} = \mbox{constant} . \tag{10.59} $$

This conservation law implies that the coordinate momentum $p^{x_\parallel}$ of a radially moving particle decays as 

$$ p^{x_\parallel} = g^{x_\parallel x_\parallel} p_{x_\parallel} = \frac{p_{x_\parallel}}{a^2} \propto \frac{1}{a^2} , \tag{10.60} $$

so the proper momentum (the momentum measured in a comoving tetrad frame) decays as 

$$ p_{\rm proper} = m \frac{dr_\parallel}{d\tau} = m a \frac{dx_\parallel}{d\tau} = a p^{x_\parallel} \propto \frac{1}{a} , \tag{10.61} $$

which is true for both massive and massless particles. 


It follows from equation (10.61) that light observed on Earth from a distant object will be redshifted by 
a factor 


a 
E R (10.62) 

a 
where ao is the present day cosmic scale factor. Cosmologists often refer to the redshift of an epoch, since 


the cosmological redshift is an observationally accessible quantity that uniquely determines the cosmic time 
of emission. 


Exercise 10.5. Geodesics in the FLRW geometry. The Friedmann-Lemaître-Robertson-Walker metric
of cosmology is

$$ ds^2 = - dt^2 + a(t)^2 \left[ dx_\parallel^2 + \frac{\sin^2 ( \kappa^{1/2} x_\parallel )}{\kappa} \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right] \tag{10.63} $$

where $\kappa$ is a constant, the curvature constant. Note that equation (10.63) is valid for all values of $\kappa$, including
zero and negative values: there is no need to consider the cases separately.

1. Conservation of generalized momentum. Consider a particle moving with comoving 4-momentum
$p^\nu = dx^\nu / d\lambda$ along a geodesic in the radial direction, so that $d\theta = d\phi = 0$. Argue that the Lagrangian
equations of motion

$$ \frac{dp_\kappa}{d\lambda} = \frac{\partial L}{\partial x^\kappa} \tag{10.64} $$

with effective Lagrangian

$$ L = \tfrac{1}{2} g_{\mu\nu} p^\mu p^\nu \tag{10.65} $$

imply that

$$ p_{x_\parallel} = \text{constant} . \tag{10.66} $$

Argue further from the same Lagrangian equations of motion that the assumption of a radial geodesic
is valid because

$$ p_\theta = p_\phi = 0 \tag{10.67} $$

is a consistent solution. [Hint: The metric $g_{\mu\nu}$ depends on the coordinate $x_\parallel$. But for radial geodesics
with $p^\theta = p^\phi = 0$, the possible contributions from derivatives of the metric vanish.]

2. Proper momentum. Argue that a proper interval of distance measured by comoving observers along
the radial geodesic is $a \, dx_\parallel$. Hence show from equation (10.66) that the proper momentum $p^\parallel$ of the
particle relative to comoving observers (who are at rest in the FLRW metric) evolves as

$$ p^\parallel = m a \frac{dx_\parallel}{d\tau} \propto \frac{1}{a} . \tag{10.68} $$

3. Redshift. What relation does your result (10.68) imply between the redshift $1 + z$ of a distant object
observed on Earth and the expansion factor $a$ since the object emitted its light? [Hint: Equation (10.68)
is valid for massless as well as massive particles. Why?]

4. Temperature of the CMB. Argue from the above results that the temperature $T$ of the CMB evolves
with cosmic scale factor as

$$ T \propto \frac{1}{a} . \tag{10.69} $$


10.15 Evolution of the cosmic scale factor 


Given how the energy density $\rho$ of each species evolves with cosmic scale factor $a$, the first Friedmann
equation then determines how the cosmic scale factor $a(t)$ itself evolves with cosmic time $t$. If the Hubble
parameter $H \equiv \dot{a}/a$ is expressed as a function of cosmic scale factor $a$, then cosmic time $t$ can be expressed
in terms of $a$ as

$$ t = \int \frac{da}{a H} . \tag{10.70} $$

The definition (10.46) of the critical density allows the Hubble parameter $H$ to be written

$$ \frac{H}{H_0} = \sqrt{\frac{\rho_{\rm crit}}{\rho_{\rm crit}(a_0)}} . \tag{10.71} $$

The critical density $\rho_{\rm crit}$ is itself the sum of the densities $\rho$ of all species including the curvature density,

$$ \rho_{\rm crit} = \rho_k + \sum_{{\rm species}\ x} \rho_x . \tag{10.72} $$

For example, in the case that the density is comprised of radiation, matter, and vacuum, the critical density
is

$$ \rho_{\rm crit} = \rho_r + \rho_m + \rho_k + \rho_\Lambda , \tag{10.73} $$

and equation (10.71) is

$$ \frac{H(t)}{H_0} = \sqrt{\Omega_r a^{-4} + \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda} , \tag{10.74} $$

where $\Omega_x$ represents its value at the present time. For density comprised of radiation, matter, and vacuum,
equation (10.74), the time $t$, equation (10.70), is

$$ t = \frac{1}{H_0} \int \frac{da}{a \sqrt{\Omega_r a^{-4} + \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda}} , \tag{10.75} $$

which is an elliptic integral of the third kind. The elliptic integral simplifies to elementary functions in some
cases relevant to reality, Exercises 10.6 and 10.7.

If one single species in particular dominates the mass-energy density, then equation (10.75) integrates to
give the results in Table 10.3.

Table 10.3: Evolution of cosmic scale factor in universes dominated by various species

Dominant species    $a \propto$
Radiation           $t^{1/2}$
Matter              $t^{2/3}$
Curvature           $t$
Vacuum              $e^{Ht}$
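
The power-law entries of Table 10.3 can be checked numerically. The following Python sketch integrates equation (10.75) with the midpoint rule, in units where $H_0 = 1$ and $a_0 = 1$. The function name `hubble_time_integral`, the step count, and the choice of integration rule are illustrative assumptions, not something the text prescribes.

```python
import math


def hubble_time_integral(a_end, omega_r=0.0, omega_m=0.0, omega_k=0.0,
                         omega_lambda=0.0, steps=20000):
    """Return H0 * t(a_end) from equation (10.75), integrating from a = 0.

    The integrand da / (a E(a)) is rewritten as
        a da / sqrt(omega_r + omega_m a + omega_k a^2 + omega_lambda a^4),
    which stays finite as a -> 0 unless vacuum is the only component
    (in that case t diverges logarithmically and this routine does not apply).
    """
    h = a_end / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * h  # midpoint of the i-th subinterval, never exactly 0
        total += a / math.sqrt(omega_r + omega_m * a
                               + omega_k * a**2 + omega_lambda * a**4)
    return total * h


# Single-species checks at a = 1:
t_rad = hubble_time_integral(1.0, omega_r=1.0)  # a ∝ t^(1/2) implies H0 t = 1/2
t_mat = hubble_time_integral(1.0, omega_m=1.0)  # a ∝ t^(2/3) implies H0 t = 2/3
t_cur = hubble_time_integral(1.0, omega_k=1.0)  # a ∝ t       implies H0 t = 1
```

The vacuum-only case is deliberately left out: with $a \propto e^{Ht}$ the scale factor reaches zero only at $t \to -\infty$, so the integral from $a = 0$ does not converge.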


10.16 Age of the Universe 


The present age $t_0$ of the Universe since the Big Bang can be derived from equation (10.75) and cosmological
parameters, Table 10.1. Aghanim et al. (2018) give the age of the Universe to be

$$ t_0 = 13.80 \pm 0.02 \ {\rm Gyr} . \tag{10.76} $$

Figure 10.9 Cosmic scale factor as a function of time in universes with various $\Omega_m$ and $\Omega_\Lambda$.


Exercise 10.6. Age of a FLRW universe containing matter and vacuum. 
1. Age of a universe dominated by matter and vacuum. To a good approximation, the Universe
today appears to be flat, and dominated by matter and a cosmological constant, with $\Omega_m + \Omega_\Lambda = 1$.
Show that in this case the relation between age $t$ and cosmic scale factor $a$ is

$$ t = \frac{2}{3 H_0 \sqrt{\Omega_\Lambda}} \, {\rm asinh} \sqrt{\frac{\Omega_\Lambda a^3}{\Omega_m}} . \tag{10.77} $$

2. Age of our Universe. Evaluate the age $t_0$ of the Universe today ($a_0 = 1$) in the approximation that
the Universe is flat and dominated by matter and a cosmological constant. [Note: Astronomers define
one Julian year to be exactly 365.25 days of 24 × 60 × 60 = 86,400 seconds each. A parsec (pc) is the
distance at which a star has a parallax of 1 arcsecond, whence 1 pc = (60 × 60 × 180/π) au, where
1 au is one Astronomical Unit, the Earth-Sun distance. One Astronomical Unit was officially defined
by the International Astronomical Union (IAU) in 2012 to be 1 au = 149,597,870,700 m, with official
abbreviation au.]
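
As a rough illustration of part 2, the Python sketch below evaluates equation (10.77) using the unit definitions quoted in the Note. The values $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$ follow the model used for Figure 10.11; the Hubble constant of 70 km/s/Mpc is an assumed round number for illustration, not a value taken from Table 10.1, and the function name is hypothetical.

```python
import math

SECONDS_PER_JULIAN_YEAR = 365.25 * 86400.0
AU_IN_METRES = 149_597_870_700.0
PARSEC_IN_METRES = (60 * 60 * 180 / math.pi) * AU_IN_METRES


def age_flat_matter_vacuum(a, omega_m, omega_lambda, h0_km_s_mpc):
    """Age t(a) in Gyr from equation (10.77).

    Assumes a flat universe with omega_m + omega_lambda = 1 and a0 = 1.
    """
    # Convert H0 from km/s/Mpc to 1/s.
    h0_per_second = h0_km_s_mpc * 1000.0 / (1.0e6 * PARSEC_IN_METRES)
    t_seconds = (2.0 / (3.0 * h0_per_second * math.sqrt(omega_lambda))
                 * math.asinh(math.sqrt(omega_lambda * a**3 / omega_m)))
    return t_seconds / SECONDS_PER_JULIAN_YEAR / 1.0e9


# Assumed illustrative parameters (H0 = 70 km/s/Mpc is NOT from Table 10.1).
age_today = age_flat_matter_vacuum(1.0, omega_m=0.3, omega_lambda=0.7,
                                   h0_km_s_mpc=70.0)
```

With these assumed inputs the age comes out close to 13.5 Gyr. At early times, when $\Omega_\Lambda a^3 \ll \Omega_m$, the asinh can be replaced by its argument and the formula reduces to the matter-dominated result $t \propto a^{3/2}$ of Table 10.3.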


Exercise 10.7. Age of a FLRW universe containing radiation and matter. The Universe was
dominated by radiation and matter over many decades of expansion including the time of recombination.
Show that for a flat Universe containing radiation and matter the relation between age $t$ and cosmic scale
factor $a$ is

$$ t = \frac{2 \, \Omega_r^{3/2}}{3 H_0 \Omega_m^2} \, \frac{\hat{a}^2 \left( 2 + \sqrt{1 + \hat{a}} \right)}{\left( 1 + \sqrt{1 + \hat{a}} \right)^2} , \tag{10.78} $$

where $\hat{a}$ is the cosmic scale factor scaled to 1 at matter-radiation equality,

$$ \hat{a} \equiv \frac{a}{a_{\rm eq}} = \frac{\Omega_m a}{\Omega_r} . \tag{10.79} $$

You may well find a formula different from (10.78), but you should be able to recover the latter using the
identity $\sqrt{1 + \hat{a}} - 1 = \hat{a} / ( \sqrt{1 + \hat{a}} + 1 )$. Equation (10.78) has the virtue that it is numerically stable to
evaluate for all $\hat{a}$, including tiny $\hat{a}$.
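
To illustrate the remark about numerical stability, here is a small Python sketch that evaluates equation (10.78) next to an algebraically equivalent form obtained by integrating directly, $t \propto (\hat{a} - 2) \sqrt{1 + \hat{a}} + 2$. The function names and the sample densities are assumptions chosen for illustration only.

```python
import math


def age_radiation_matter(a, omega_r, omega_m, h0=1.0):
    """t(a) from equation (10.78); the time unit is 1/h0."""
    a_hat = omega_m * a / omega_r  # equation (10.79)
    s = math.sqrt(1.0 + a_hat)
    prefactor = 2.0 * omega_r**1.5 / (3.0 * h0 * omega_m**2)
    return prefactor * a_hat**2 * (2.0 + s) / (1.0 + s)**2


def age_radiation_matter_naive(a, omega_r, omega_m, h0=1.0):
    """Same quantity written as (a_hat - 2) sqrt(1 + a_hat) + 2.

    For tiny a_hat the two terms nearly cancel, so floating-point
    rounding can swamp the result.
    """
    a_hat = omega_m * a / omega_r
    s = math.sqrt(1.0 + a_hat)
    prefactor = 2.0 * omega_r**1.5 / (3.0 * h0 * omega_m**2)
    return prefactor * ((a_hat - 2.0) * s + 2.0)


# Assumed sample densities for a flat radiation-plus-matter universe.
OMEGA_R = 1.0e-4
OMEGA_M = 1.0 - OMEGA_R

t_tiny = age_radiation_matter(1.0e-12, OMEGA_R, OMEGA_M)
t_tiny_expected = (1.0e-12)**2 / (2.0 * math.sqrt(OMEGA_R))  # radiation-era limit
t_late = age_radiation_matter(1.0e-3, OMEGA_R, OMEGA_M)
t_late_naive = age_radiation_matter_naive(1.0e-3, OMEGA_R, OMEGA_M)
```

For tiny $\hat{a}$ the stable form tends to $a^2 / (2 H_0 \sqrt{\Omega_r})$, the radiation-dominated behaviour $a \propto t^{1/2}$ of Table 10.3, while the naive form subtracts two nearly equal numbers and loses most of its significant digits.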


10.17 Conformal time 
It is often convenient to use conformal time $\eta$ defined by (with units $c$ temporarily restored)

$$ a \, d\eta = c \, dt , \tag{10.80} $$

with respect to which the FLRW metric is

$$ ds^2 = a(\eta)^2 \left( - d\eta^2 + dx_\parallel^2 + x^2 do^2 \right) , \tag{10.81} $$

with $x$ given by equation (10.21). The term conformal refers to a metric that is multiplied by an overall factor,
the conformal factor (squared). In the FLRW metric (10.81), the cosmic scale factor $a$ is the conformal factor.

Conformal time $\eta$ is constructed so that radial null geodesics move at unit velocity in conformal coordinates.
Light moving radially, with $d\theta = d\phi = 0$, towards an observer at the origin $x_\parallel = 0$ satisfies

$$ \frac{dx_\parallel}{d\eta} = - 1 . \tag{10.82} $$

Exercise 10.8. Relation between conformal time and cosmic scale factor. What is the relation
between conformal time $\eta$ and cosmic scale factor $a$ if the energy-momentum is dominated by a species with
equation of state $p/\rho = w = $ constant?
Solution. The conformal time $\eta$ is related to cosmic scale factor $a$ by (units $c = 1$)

$$ \eta = \int \frac{da}{a^2 H} . \tag{10.83} $$

For $p/\rho = w = $ constant, a possible choice of integration constant for $\eta$ is: if $w > -1/3$ (decelerating), set
$\eta = 0$ at $a = 0$, so that $\eta \to \infty$ at $a \to \infty$; if $w < -1/3$ (accelerating), set $\eta = 0$ at $a \to \infty$, so that $\eta \to -\infty$
at $a \to 0$. Then

$$ \eta = \frac{2}{(1 + 3w) \, a H} \propto \pm \, a^{(1+3w)/2} , \tag{10.84} $$

in which the sign is positive for $w > -1/3$, negative for $w < -1/3$, ensuring that conformal time $\eta$ always
increases with cosmic scale factor $a$. For the special case of a curvature-dominated universe, $w = -1/3$,

$$ \eta = \frac{\ln a}{a H} \propto \ln a , \tag{10.85} $$

which goes to $\eta \to -\infty$ as $a \to 0$ and $\eta \to \infty$ as $a \to \infty$.
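
The step from equation (10.83) to equation (10.84) can be sketched as follows, assuming that a single species with constant $w$ dominates, so that $\rho \propto a^{-3(1+w)}$ and $H^2 \propto \rho$:

```latex
\begin{align}
  H \propto \rho^{1/2} \propto a^{-3(1+w)/2}
  \quad &\Rightarrow \quad
  \frac{1}{a H} \propto a^{(1+3w)/2} , \\
  \eta = \int \frac{da}{a^2 H}
       = \int \frac{1}{a H} \, \frac{da}{a}
       &= \frac{2}{1 + 3w} \, \frac{1}{a H}
  \qquad ( w \neq -1/3 ) .
\end{align}
```

Dropping the integration constant corresponds to the choices described in the solution: for $w > -1/3$ the power $a^{(1+3w)/2}$ vanishes at $a = 0$, while for $w < -1/3$ it vanishes as $a \to \infty$. For $w = -1/3$ the product $aH$ is constant and the integral is logarithmic.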


10.18 Looking back along the lightcone 


Since light moves radially at unit velocity in conformal coordinates, an object at geodesic distance $x_\parallel$ that
emits light at conformal time $\eta_{\rm em}$ is observed at conformal time $\eta_{\rm obs}$ given by

$$ x_\parallel = \eta_{\rm obs} - \eta_{\rm em} . \tag{10.86} $$

The comoving geodesic distance $x_\parallel$ to an object is

$$ x_\parallel = \int_{\eta_{\rm em}}^{\eta_{\rm obs}} d\eta = \int_{t_{\rm em}}^{t_{\rm obs}} \frac{c \, dt}{a} = \int_{a_{\rm em}}^{a_{\rm obs}} \frac{c \, da}{a^2 H} = \int_0^z \frac{c \, dz}{H} , \tag{10.87} $$

where the last equation assumes the relation $1 + z = 1/a$, valid as long as $a$ is normalized to unity at the
observer (us) at the present time, $a_{\rm obs} = a_0 = 1$. In the case that the density is comprised of (curvature and)
radiation, matter, and vacuum, equation (10.87) gives

$$ x_\parallel = \frac{c}{H_0} \int_{1/(1+z)}^{1} \frac{da}{a^2 \sqrt{\Omega_r a^{-4} + \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda}} , \tag{10.88} $$

which is an elliptic integral of the first kind. Given the geodesic comoving distance $x_\parallel$, the circumferential
comoving distance $x$ then follows as

$$ x = \frac{\sinh \left( \sqrt{\Omega_k} \, H_0 x_\parallel / c \right)}{\sqrt{\Omega_k} \, H_0 / c} . \tag{10.89} $$

To second order in redshift $z$,

$$ x \approx x_\parallel \approx \frac{c}{H_0} \left[ z - z^2 \left( \Omega_r + \tfrac{3}{4} \Omega_m + \tfrac{1}{2} \Omega_k \right) + \cdots \right] . \tag{10.90} $$

The geodesic and circumferential distances $x_\parallel$ and $x$ differ at order $z^3$.

Figure 10.10 illustrates the relation between the comoving geodesic and circumferential distances $x_\parallel$ and
$x$, equations (10.88) and (10.89), and redshift $z$, equation (10.62), in three cosmological models, including
the standard flat ΛCDM model.
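
Equations (10.88) and (10.89) are straightforward to evaluate numerically. The Python sketch below uses a plain midpoint rule and returns both distances in units of $c/H_0$. The function name, the step count, and the convention $\Omega_k = 1 - \Omega_r - \Omega_m - \Omega_\Lambda$ (which follows from equation (10.72) evaluated today) are spelled out here as assumptions of the sketch.

```python
import math


def comoving_distances(z, omega_r, omega_m, omega_lambda, steps=20000):
    """Return (x_par, x) in units of c/H0, from equations (10.88) and (10.89).

    Assumes a0 = 1 and omega_k = 1 - omega_r - omega_m - omega_lambda.
    """
    omega_k = 1.0 - omega_r - omega_m - omega_lambda
    a_em = 1.0 / (1.0 + z)
    h = (1.0 - a_em) / steps
    x_par = 0.0
    for i in range(steps):
        a = a_em + (i + 0.5) * h  # midpoint of the i-th subinterval
        # 1 / (a^2 E(a)), with the factor a^2 absorbed into the square root
        x_par += h / math.sqrt(omega_r + omega_m * a
                               + omega_k * a**2 + omega_lambda * a**4)
    if abs(omega_k) < 1e-12:      # flat: the two distances coincide
        x = x_par
    elif omega_k > 0.0:           # open: sinh, as written in (10.89)
        root = math.sqrt(omega_k)
        x = math.sinh(root * x_par) / root
    else:                         # closed: sinh of imaginary argument is sin
        root = math.sqrt(-omega_k)
        x = math.sin(root * x_par) / root
    return x_par, x


# Flat matter-dominated universe at z = 3: x_par = 2 (1 - 1/sqrt(1+z)) = 1.
x_par_flat, x_flat = comoving_distances(3.0, 0.0, 1.0, 0.0)
# Empty open universe at z = 1: x_par = ln 2, x = sinh(ln 2) = 0.75.
x_par_open, x_open = comoving_distances(1.0, 0.0, 0.0, 0.0)
```

The closed-universe branch simply reflects that equation (10.89) with negative $\Omega_k$ turns the sinh into a sine.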
Figure 10.10 In this diagram, each wedge represents a cone of fixed opening angle, with the observer (us) at the point
of the cone, at zero redshift. The wedges show the relation between physical sizes, namely the comoving distances $x_\parallel$
in the radial (vertical) and $x$ in the transverse (horizontal) directions, and observable quantities, namely redshift and
angular separation, in three different cosmological models: (left) a flat matter-dominated universe, (middle) an open
matter-dominated universe, and (right) a flat ΛCDM universe.


10.19 Hubble diagram 


The Hubble diagram of Type Ia supernovae shown in Figure 10.1 is a plot of (log) luminosity distance $\log d_L$
versus (log) redshift $\log z$. The luminosity distance is explained in §10.19.1 immediately following.


10.19.1 Luminosity distance 


Astronomers conventionally define the luminosity distance $d_L$ to a celestial object so that the observed
flux $F$ from the object (energy observed per unit time per unit collecting area of the telescope) is equal to
the intrinsic luminosity $L$ of the object (energy per unit time emitted by the object in its rest frame) divided
by $4\pi d_L^2$,

$$ F = \frac{L}{4\pi d_L^2} . \tag{10.91} $$

In other words, the luminosity distance $d_L$ is defined so that flux $F$ and luminosity $L$ are related by the
usual inverse square law of distance. Objects at cosmological distances are redshifted, so the luminosity at
some emitted wavelength $\lambda_{\rm em}$ is observed at the redshifted wavelength $\lambda_{\rm obs} = (1 + z) \lambda_{\rm em}$. The luminosity
distance (10.91) is defined so that the flux $F(\lambda_{\rm obs})$ on the left hand side is at the observed wavelength, while
the luminosity $L(\lambda_{\rm em})$ on the right hand side is at the emitted wavelength. The observed flux and emitted
luminosity are then related by

$$ F = \frac{L}{(1 + z)^2 \, 4\pi x^2} , \tag{10.92} $$

where $x$ is the comoving circumferential radius, normalized to $a_0 = 1$ at the present time. The factor of
$1/(4\pi x^2)$ expresses the fact that the luminosity is spread over a sphere of proper area $4\pi x^2$. Equation (10.92)
involves two factors of $1 + z$, one of which comes from the fact that the observed photon energy is redshifted,
and the other from the fact that the number of photons detected per unit time is reduced by a factor of
$1 + z$. Equations (10.91) and (10.92) imply that the luminosity distance $d_L$ is related to the circumferential
distance $x$ and the redshift $z$ by

$$ d_L = (1 + z) \, x . \tag{10.93} $$

Why bother with the luminosity distance if it can be reduced to the circumferential distance $x$ by dividing
by a redshift factor? The answer is that, especially historically, fluxes of distant astronomical objects are
often measured from images without direct spectral information. If the intrinsic luminosity of the object
is treated as “known” (as with Cepheid variables and Type Ia supernovae), then the luminosity distance
$d_L = \sqrt{L / (4\pi F)}$ can be inferred without knowledge of the redshift. In practice objects are often measured
with a fixed colour filter or set of filters, and some additional correction, historically called the K-correction,
is necessary to transform the flux in an observed filter to a common band.


10.19.2 Magnitudes 


The Hubble diagram of Type Ia supernovae shown in Figure 10.1 has for its vertical axis the astronomers’
system of magnitudes, a system that dates back to the 2nd century BC Greek astronomer Hipparchus.

A magnitude is a logarithmic measure of brightness, defined such that an interval of 5 magnitudes $m$
corresponds to a factor of 100 in linear flux $F$. Following Hipparchus, the magnitude system is devised such
that the brightest stars in the sky have apparent magnitudes of approximately 0, while fainter stars have
larger magnitudes, the faintest naked eye stars in the sky being about magnitude 6. Traditionally, the system
is tied to the star Vega, which is defined to have magnitude 0. Thus the apparent magnitude $m$ of a star is

$$ m = m_{\rm Vega} - 2.5 \log \left( F / F_{\rm Vega} \right) . \tag{10.94} $$

The absolute magnitude $M$ of an object is defined to equal the apparent magnitude $m$ that it would have if
it were 10 parsecs away, which is the approximate distance to the star Vega. Thus

$$ m - M = 5 \log \left( d_L / 10 \, {\rm pc} \right) , \tag{10.95} $$

where $d_L$ is the luminosity distance. The difference $m - M$ is called the distance modulus.
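
Equations (10.93) and (10.95) translate directly into code. The following Python sketch uses hypothetical function names and assumes that the logarithm in equation (10.95) is base 10, as is standard for astronomical magnitudes.

```python
import math


def luminosity_distance(x, z):
    """d_L from equation (10.93), in the same units as the distance x."""
    return (1.0 + z) * x


def distance_modulus(d_l_parsec):
    """m - M from equation (10.95), with d_L expressed in parsecs."""
    return 5.0 * math.log10(d_l_parsec / 10.0)


# An object at d_L = 10 pc has m = M by definition.
mu_at_10pc = distance_modulus(10.0)
# Each factor of 10 in distance is a factor of 100 in flux, i.e. 5 magnitudes.
mu_at_1mpc = distance_modulus(1.0e6)
```

A luminosity distance of 1 Mpc therefore corresponds to a distance modulus of 25 magnitudes, and every further factor of 10 in distance adds another 5.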


Exercise 10.9. Hubble diagram. Draw a theoretical Hubble diagram, a plot of luminosity distance $d_L$
versus redshift $z$, for universes with various values of $\Omega_\Lambda$ and $\Omega_m$. The relation between $d_L$ and $z$ is an elliptic
integral of the first kind, so you will need to find a program that does elliptic integrals (alternatively, you can
do the integral numerically). The elliptic integral simplifies to elementary functions in simple cases where
the mass-energy density is dominated by a single component (either mass $\Omega_m = 1$, or curvature $\Omega_k = 1$, or
a cosmological constant $\Omega_\Lambda = 1$).

Solution. Your model curves should look similar to those in Figure 10.1. 


10.20 Recombination 


The CMB comes to us from the epoch of recombination, when the Universe transitioned from being mostly 
ionized, and therefore opaque, to mostly neutral, and therefore transparent. As the Universe expands, the 
temperature of the cosmic background decreases as $T \propto a^{-1}$. Given that the CMB temperature today is
$T_0 \approx 3$ K, the temperature would have been about 3,000 K at a redshift of about 1,000. This temperature
corresponds to the temperature at which hydrogen, the most abundant element in the Universe, ionizes. Not
coincidentally, the temperature of recombination is comparable to the ∼ 5,800 K surface temperature of the
Sun. The CMB and Sun temperatures differ because the baryon-to-photon number density is much greater 
in the Sun. 

The transition from mostly ionized to mostly neutral takes place over a fairly narrow range of redshifts, just 
as the transition from ionized to neutral at the photosphere of the Sun is rather sharp. Thus recombination 
can be approximated as occurring almost instantaneously. Aghanim et al. (2018) give the redshift of last 
scattering, where the photon-electron scattering (Thomson) optical depth was 1, 


$$ z_* = 1089.8 \pm 0.2 . \tag{10.96} $$

Hinshaw et al. (2012, supplementary data) give the age of the Universe at recombination,

$$ t_* = 376{,}000 \pm 4{,}000 \ {\rm yr} . \tag{10.97} $$


10.21 Horizon 


Light can come from no more distant point than the Big Bang. This distant point defines what cosmologists 
traditionally refer to as the horizon (or particle horizon) of our Universe, located at infinite redshift, $z = \infty$.

Figure 10.11 Spacetime diagram of a FLRW Universe in conformal coordinates $\eta$ and $x_\parallel$, in units of the present
day Hubble distance $c/H_0$. The unfilled circle marks our position, which is taken to be the origin of the conformal
coordinates. In conformal coordinates, light moves at 45° on the spacetime diagram. The diagram is drawn for a flat
ΛCDM model with $\Omega_\Lambda = 0.7$, $\Omega_m = 0.3$, and a radiation density such that the redshift of matter-radiation equality
is 3400, consistent with Aghanim et al. (2018). Horizontal lines are lines of constant cosmic scale factor $a$, labelled by
their values relative to the present, $a_0 = 1$. Reheating, at the end of inflation, has been taken to be at redshift $10^{28}$.
Filled dots mark the place that cosmologists traditionally call the horizon, at reheating, which is a place of large, but
not infinite, redshift. Inflation offers a solution to the horizon problem because all points on the CMB within our past
lightcone could have been in causal contact at an early stage of inflation. If dark energy behaves like a cosmological
constant into the indefinite future, then we will have a future horizon.

Equation (10.87) gives the geodesic distance between us at redshift zero and the horizon as

$$ x_\parallel({\rm horizon}) = \int_0^\infty \frac{c \, dz}{H} . \tag{10.98} $$

The standard ΛCDM paradigm is based in part on the proposition that the Universe had an early inflationary
phase, §10.22. If so, then there is no place where the redshift reaches infinity. However, the redshift
is large at reheating, when inflation ends, and cosmologists call this the horizon,

$$ x_\parallel({\rm horizon}) = \int_0^{\rm huge} \frac{c \, dz}{H} . \tag{10.99} $$


Figure 10.11 shows a spacetime diagram of a FLRW Universe with cosmological parameters consistent
with those of Aghanim et al. (2018). In this model, the comoving horizon distance to reheating is

$$ x_\parallel({\rm horizon}) = 3.333 \, c/H_0 = 14.5 \ {\rm Gpc} = 47.2 \ {\rm Glyr} . \tag{10.100} $$

The redshift of reheating in this model has been taken at $z = 10^{28}$, but the horizon distance is insensitive to
the choice of reheating redshift.

Figure 10.12 Redshift $z$ of objects at fixed comoving distances as a function of the epoch $a_{\rm obs}$ at which an observer
observes them. The label on each line is the comoving distance $x_\parallel$ in units of $c/H_0$. The diagram is drawn for a flat
ΛCDM model with $\Omega_\Lambda = 0.7$, $\Omega_m = 0.3$, and a radiation density such that the redshift of matter-radiation equality
is 3400. The present-day Universe, at $a_{\rm obs} = 1$, is transitioning from a decelerating, matter-dominated phase to an
accelerating, vacuum-dominated phase. Whereas in the past redshifts tended to decrease with time, in the future
redshifts will tend to increase with time.

The horizon should be distinguished from the future horizon, which Hawking and Ellis (1973) define to 
be the farthest that an observer will ever be able to see in the indefinite future. If the Universe continues 
accelerating, as it is currently, then our future horizon will be finite, as illustrated in Figure 10.11. 

A quantity that cosmologists sometimes refer to loosely as the horizon is the Hubble distance, defined 
to be 


$$ \text{Hubble distance} \equiv \frac{c}{H} . \tag{10.101} $$


The Hubble distance sets the characteristic scale over which two observers can communicate and influence 
each other, which is smaller than the horizon distance. 
The standard ΛCDM model has the curious property that the Universe is switching from a matter-dominated
period of deceleration to a vacuum-dominated period of acceleration. During deceleration, objects
appear over the horizon, while during acceleration, they disappear over the horizon. Figure 10.12 illustrates 
the evolution of the observed redshifts of objects at fixed comoving distances. In the past decelerating phase, 
the redshift of objects appearing over the horizon decreased rapidly from some huge value. In the future 
accelerating phase, the redshift of objects disappearing over the horizon will increase in proportion to the 
cosmic scale factor. 


10.22 Inflation 


Part of the Standard Model of Cosmology is the hypothesis that the early Universe underwent a period of 
inflation, when the mass-energy density was dominated by “vacuum” energy, and the Universe expanded 
exponentially, with $a \propto e^{Ht}$. The idea of inflation was originally motivated around 1980 by the idea that
early in the Universe the forces of nature would be unified, and that there is energy associated with that
unification. For example, the inflationary energy could be the energy associated with Grand Unification
of the U(1) × SU(2) × SU(3) forces of the standard model. The three coupling constants of the standard
model vary slowly with energy, appearing to converge at an energy of around $m_{\rm GUT} \sim 10^{16}$ GeV, not much
less than the Planck energy of $m_{\rm P} \sim 10^{19}$ GeV. The associated vacuum energy density would be of order
$\rho_{\rm GUT} \sim m_{\rm GUT}^4$ in Planck units.

Alan Guth (1981) pointed out that, regardless of theoretical arguments for inflation, an early inflationary
epoch would solve a number of observational conundra. The most important observational problem is the
horizon problem, Exercise 10.11. If the Universe has always been dominated by radiation and matter, and
therefore always decelerating, then up to the time of recombination light could only have travelled a distance
corresponding to about 1 degree on the cosmic microwave background sky, Exercise 10.10. If that were the
case, then how come points in the cosmic microwave background more than a degree
apart, indeed even 180° apart, on opposite sides of the sky, have the same temperature, even though they
could never have been in causal contact? Guth pointed out that inflation could solve the horizon problem
by allowing points to be initially in causal contact, then driven out of causal contact by the acceleration
and consequent exponential expansion induced by vacuum energy, provided that the inflationary expansion
continued over a sufficient number of e-folds, Exercise 10.11. Guth’s solution is illustrated in the spacetime
diagram in Figure 10.11.

Guth pointed out that inflation could solve some other problems, such as the flatness problem. However, 
most of these problems are essentially equivalent to the horizon problem, Exercise 10.12. 

A distinct basic problem that inflation solves is the expansion problem. If the Universe has always been 
dominated by a gravitationally attractive form of mass-energy, such as matter or radiation, then how come 
the Universe is expanding? Inflation solves the problem because an initial period dominated by gravitationally 
repulsive vacuum energy could have accelerated the Universe into enormous expansion. 

Inflation also offers an answer to the question of where the matter and radiation seen in the Universe 
today came from. Inflation must have come to an end, since the present day Universe does not contain
the enormously high vacuum energy density that dominated during inflation (the vacuum energy during
inflation was vast compared to the present-day cosmological constant). The vacuum energy must therefore 
have decayed into other forms of gravitationally attractive energy, such as matter and radiation. The process 
of decay is called reheating. Reheating is not well understood, because it occurred at energies well above 
those accessible to experiment today. Nevertheless, if inflation occurred, then so also did reheating. 

Compelling evidence in favour of the inflationary paradigm comes from the fact that, in its simplest 
form, inflationary predictions for the power spectrum of fluctuations of the CMB fit astonishingly well to 
observational data, which continue to grow ever more precise. 


Exercise 10.10. Horizon size at recombination. 

1. Comoving horizon distance. Assume for simplicity a flat, matter-dominated Universe. From equation
(10.98), what is the comoving horizon distance $x_\parallel$ as a function of cosmic scale factor $a$?

2. Angular size on the CMB of the horizon at recombination. For a flat Universe, the angular
size on the CMB of the horizon at recombination equals the ratio of the comoving horizon distance
at recombination to the comoving distance between us and recombination. Recombination occurs at
sufficiently high redshift that the latter distance approximates the comoving horizon at the present time.
Estimate the angular size on the CMB of the horizon at recombination if the redshift of recombination
is $z_{\rm rec} \approx 1100$.


Exercise 10.11. The horizon problem. 

1. Expansion factor. The temperature of the CMB today is $T_0 \approx 3$ K. By approximately what factor
has the Universe expanded since the temperature was some initial high temperature, say the GUT
temperature $T_i \sim 10^{29}$ K, or the Planck temperature $T_i \sim 10^{32}$ K?

2. Hubble distance. By what factor has the Hubble distance $c/H$ increased during the expansion of
part 1? Assume for simplicity that the Universe has been mainly radiation-dominated during this period,
and that the Universe is flat. [Hint: For a flat Universe $H^2 \propto \rho$, and for a radiation-dominated Universe
$\rho \propto a^{-4}$.]

3. Comoving Hubble distance. Hence determine by what factor the comoving Hubble distance $x_H \equiv
c/(aH)$ has increased during the expansion of part 1.

4. Comoving Hubble distance during inflation. During inflation the Hubble distance $c/H$ remained
constant, while the cosmic scale factor $a$ expanded exponentially. What is the relation between the
comoving Hubble distance $x_H \equiv c/(aH)$ and cosmic scale factor $a$ during inflation? [You should obtain
an answer of the form $x_H \propto a^n$.]

5. Number of e-foldings to solve the horizon problem. By how many e-foldings must the Universe
have inflated in order to solve the horizon problem? Assume again, as in part 1, that the Universe
has been mainly radiation-dominated during expansion from the Planck temperature to the current
temperature, and that this radiation-dominated epoch was immediately preceded by a period of inflation.
[Hint: Inflation solves the horizon problem if the currently observable Universe was within the Hubble
distance at the beginning of inflation, i.e. if the comoving Hubble distance $x_{H,0}$ now is less than the comoving Hubble
distance $x_{H,i}$ at the beginning of inflation. The ‘number of e-foldings’ is $\ln(a_f/a_i)$, where $\ln$ is the natural
logarithm, and $a_i$ and $a_f$ are the cosmic scale factors at the beginning (i for initial) and end (f for final)
of inflation.]
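
One possible line of reasoning for this exercise can be sketched in a few lines of Python. It rests entirely on the simplifying assumptions stated in the exercise (instantaneous reheating, pure radiation domination from the end of inflation until today, $T \propto 1/a$), and the function name is hypothetical. It gives away the answer, so skip it if you want to work the exercise unaided.

```python
import math


def min_efolds(t_initial_kelvin, t_today_kelvin=3.0):
    """Minimum ln(a_f / a_i) needed to solve the horizon problem.

    Assumes radiation domination from the end of inflation until today.
    """
    # Part 1: T ∝ 1/a, so the expansion factor since reheating is T_i / T_0.
    expansion = t_initial_kelvin / t_today_kelvin
    # Parts 2-3: in the radiation era H ∝ a^-2, so c/H ∝ a^2 and the
    # comoving Hubble distance x_H = c/(aH) ∝ a grows by `expansion`.
    # Part 4: during inflation c/H is constant, so x_H ∝ 1/a shrinks by a_f/a_i.
    # Part 5: x_H today is smaller than x_H at the start of inflation only if
    # a_f / a_i is at least `expansion`.
    return math.log(expansion)


efolds_gut = min_efolds(1.0e29)     # GUT-scale reheating temperature
efolds_planck = min_efolds(1.0e32)  # Planck-scale reheating temperature
```

Under these assumptions the required number of e-foldings is roughly 66 for the GUT temperature and roughly 73 for the Planck temperature. The estimate is only logarithmically sensitive to the details that the simplifications ignore.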


Exercise 10.12. Relation between horizon and flatness problems. Show that Friedmann’s equation
(10.30a) can be written in the form

$$ \Omega - 1 = \kappa x_H^2 , \tag{10.102} $$

where $x_H \equiv c/(aH)$ is the comoving Hubble distance. Use this equation to argue in your own words how the
horizon and flatness problems are related.


10.23 Evolution of the size and density of the Universe 


Figure 10.13 shows the evolution of the cosmic scale factor $a$ as a function of time $t$ predicted by the standard
flat ΛCDM model, coupled with a plausible depiction of the early inflationary epoch. The parameters of the
model are the same as those for Figure 10.11. In the model, the Universe starts with an inflationary phase,
and transitions instantaneously at reheating to a radiation-dominated phase. Not long before recombination,
the Universe goes over to a matter-dominated phase, then later to the dark-energy-dominated phase of today.
The relation between cosmic time t and cosmic scale factor a is given by equation (10.70), and some relevant 
analytic results are in Exercises 10.6 and 10.7. 

Figure 10.13 also shows the evolution of the Hubble distance c/H, which sets the approximate scale within 
which regions are in causal contact. The Hubble distance is constant during vacuum-dominated phases, but 
is approximately proportional to the age of the Universe at other times. The Figure illustrates that regions 
that are in causal contact prior to inflation can fly out of causal contact during the accelerated expansion 
of inflation. Once the Universe transitions to a decelerating radiation- or matter-dominated phase, regions 
that were out of causal contact can come back into causal contact, inside the Hubble distance. 

Since inflation occurred at high energies inaccessible to experiment, the energy scale of inflation is unknown, 
and the number of e-folds during which inflation persisted is unknown. Figure 10.13 illustrates the case where 
the energy scale of inflation is around the GUT scale, and the number of e-folds is only slightly greater than 
the number necessary to solve the horizon problem. Figure 10.13 does not attempt to extrapolate to what 
might possibly have happened prior to inflation. 

Figure 10.14 shows the mass-energy density $\rho$ as a function of time $t$ for the same flat ΛCDM model as
shown in Figure 10.13. Since the Universe here is taken to be flat, the density equals the critical density
at all times, and is proportional to the inverse square of the Hubble distance $c/H$ plotted in Figure 10.13.
The energy density is constant during epochs dominated by vacuum energy, but decreases approximately as
$\rho \propto t^{-2}$ at other times.

Figure 10.13 Cosmic scale factor $a$ and Hubble distance $c/H$ as a function of cosmic time $t$, for a flat ΛCDM model
with the same parameters as in Figure 10.11. In this model, the Universe began with an inflationary epoch where the
density was dominated by constant vacuum energy, the Hubble parameter $H$ was constant, and the cosmic scale factor
increased exponentially, $a \propto e^{Ht}$. The initial inflationary phase came to an end when the vacuum energy decayed into
radiation energy, an event called reheating. The Universe then became radiation-dominated, evolving as $a \propto t^{1/2}$.
At a redshift of $z_{\rm eq} \approx 3400$ the Universe passed through the epoch of matter-radiation equality, where the density
of radiation equalled that of (non-baryonic plus baryonic) matter. Matter-radiation equality occurred just prior to
recombination, at $z_{\rm rec} \approx 1090$. The Universe remained matter-dominated, evolving as $a \propto t^{2/3}$, until relatively recently
(from a cosmological perspective). The Universe transitioned through matter-dark energy equality at $z_\Lambda \approx 0.4$. The
dotted line shows how the cosmic scale factor and Hubble distance will evolve in the future, if the dark energy is a
cosmological constant, and if it does not decay into some other form of energy.


10.24 Evolution of the temperature of the Universe 


Figure 10.15 shows the radiation (photon) temperature $T$ as a function of time $t$ corresponding to the
evolution of the scale factor $a$ and mass-energy density $\rho$ shown in Figures 10.13 and 10.14.

Figure 10.14 Mass-energy density $\rho$ of the Universe as a function of cosmic time $t$ corresponding to the evolution of
the cosmic scale factor shown in Figure 10.13.


A system of photons in thermodynamic equilibrium has a blackbody distribution of energies. The CMB 
has a precise blackbody spectrum, not because it is in thermodynamic equilibrium today, but rather because 
the CMB was in thermodynamic equilibrium with electrons and nuclei at the time of recombination, and the 
CMB has streamed more or less freely through the Universe since recombination. A thermal distribution of 
relativistic particles retains its thermal distribution in an expanding FLRW universe (albeit with a changing 
temperature), Exercise 10.13. 

The evolution of the temperature of photons in the Universe can be deduced from conservation of entropy. 
The Friedmann equations imply the first law of thermodynamics, §10.9.2, and thus enforce conservation of 
entropy per comoving volume (but see Concept Question 30.5). Entropy is conserved in a FLRW universe 
even when particles annihilate with each other. For example, electrons and positrons annihilated with each 
other when the temperature fell through T ≈ m_e = 511 keV, but the entropy lost by electrons and positrons 
was gained by photons, for no net change in entropy, Figure 10.16 and Exercise 10.21. 

In the real Universe, entropy increases as a result of fluctuations away from the perfect homogeneity and 
isotropy assumed by the FLRW geometry. By far the biggest repositories of entropy in today’s Universe are 
black holes, principally supermassive black holes. However, black holes are irrelevant to the CMB, since the 
CMB has propagated essentially unchanged since recombination. It is fine to compute the temperature of 
cosmological radiation from conservation of cosmological entropy. 

Figure 10.15 Radiation temperature T of the Universe as a function of cosmic time t corresponding to the evolution of 
the cosmic scale factor shown in Figure 10.13. The temperature during inflation was the Hawking temperature, equal 
to H/(2π) in Planck units. After inflation and reheating, the temperature decreases as T ∝ a^{-1}, modified by a factor 
depending on the effective entropy-weighted number g_s of particle species, equation (10.104). In this plot, the effective 
number g_s of relativistic particle species has been approximated as changing abruptly at three discrete temperatures, 
electron-positron annihilation, the QCD phase transition, and the electroweak phase transition, Table 10.4. 


Figure 10.16 Comoving number densities a³n of photons γ, neutrinos ν, electrons e, and positrons ē as a function of 
temperature T around the temperature m_e near which electrons and positrons annihilate. Annihilating electrons and 
positrons dump their entropy into photons, increasing the comoving density of photons, while conserving total entropy 
per comoving volume. The comoving densities are normalized to a³n_γ = 1 at the present time. The calculations are 
described in Exercise 10.21. 

The entropy of a system in thermodynamic equilibrium is approximately one per particle, Exercise 10.18. 
The number of particles in the Universe today is dominated by particles that were relativistic at the time 
they decoupled, namely photons and neutrinos, and these therefore dominate the cosmological entropy. The 
ratio η_b ≡ n_b/n_γ of baryon to photon number in the Universe today is less than a billionth, 


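$$ \eta_b \equiv \frac{n_b}{n_\gamma} = \frac{\epsilon_\gamma \, \Omega_b}{m_b \, \Omega_\gamma} = 6.1 \times 10^{-10} \left( \frac{\Omega_b h^2}{0.0224} \right) \left( \frac{T_0}{2.725\,\mathrm{K}} \right)^{-3} , \tag{10.103} $$

where ε_γ = π⁴ T_0/(30 ζ(3)) = 2.701 T_0 is the mean energy per photon (Exercise 10.15), and m_b = 939 MeV 
is the approximate mass per baryon. The value is as reported by the Planck team (Aghanim et al., 2018). 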
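
The numerical coefficient in equation (10.103) can be checked with a few lines of Python. This is only a sketch: the constants are rounded standard values typed in by hand, and the function names are invented for the illustration. It assumes a blackbody photon number density n_γ = 2ζ(3)T³/(π²ħ³c³), Exercise 10.15 with g = 2, and a critical density 3H_0²/(8πG).

```python
import math

# Rounded physical constants (assumptions of this sketch, not taken from the text).
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K
HBAR_C_EV_CM = 1.973270e-5   # hbar*c in eV cm
G_CGS = 6.6743e-8            # Newton's constant in cm^3 g^-1 s^-2
EV_PER_GRAM = 5.609589e32    # c^2 expressed as eV per gram
H100_PER_S = 3.240779e-18    # 100 km/s/Mpc in s^-1
ZETA3 = 1.2020569            # Riemann zeta(3)


def photon_number_density(T_kelvin):
    """Blackbody photon number density 2 zeta(3) T^3 / (pi^2 (hbar c)^3), in cm^-3."""
    x = K_B_EV_PER_K * T_kelvin / HBAR_C_EV_CM  # kT/(hbar c) in cm^-1
    return 2.0 * ZETA3 * x**3 / math.pi**2


def baryon_to_photon_ratio(omega_b_h2=0.0224, T0_kelvin=2.725, m_b_eV=939.0e6):
    """eta_b = n_b / n_gamma, with n_b = Omega_b rho_crit / m_b."""
    rho_crit_over_h2 = 3.0 * H100_PER_S**2 / (8.0 * math.pi * G_CGS)  # g/cm^3
    n_b = omega_b_h2 * rho_crit_over_h2 * EV_PER_GRAM / m_b_eV        # cm^-3
    return n_b / photon_number_density(T0_kelvin)
```

With the default arguments the photon density comes out near 410 per cm³, and the ratio agrees with the quoted 6.1 × 10^{-10} to within about one percent.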
Conservation of entropy per comoving volume implies that the photon temperature T at redshift z is 
related to the present day photon temperature T_0 by (Exercise 10.19) 

$$ \frac{T}{T_0} = (1 + z) \left( \frac{g_{s,0}}{g_s} \right)^{1/3} , \tag{10.104} $$

where g_s is the entropy-weighted effective number of relativistic particle species, and g_{s,0} is its value today. 

The other major contributors to cosmological entropy today, besides photons, are neutrinos and antineutri- 
nos. Neutrinos decoupled at a temperature of about T ~ 1 MeV. Above that temperature weak interactions 
were fast enough to keep neutrinos and antineutrinos in thermodynamic equilibrium with protons and neu- 
trons, hence with photons, but below that temperature neutrinos and antineutrinos froze out. 

Table 10.4: Effective entropy-weighted number of relativistic particle species 

Temperature T            particles                  spin  chiralities  generations  flavours  colours  multiplicity  g_s
-----------------------  -------------------------  ----  -----------  -----------  --------  -------  ------------  ---------------------------
T ≲ 0.5 MeV              photon γ                   1                                                  1             2(1 + (7/8)(4/11)·3) = 3.91
                         neutrinos ν_e, ν_μ, ν_τ    1/2   1            3                               3
0.5 MeV ≲ T ≲ 200 MeV    photon γ                   1                                                  1             2(1 + (7/8)·5) = 10.75
                         neutrinos ν_e, ν_μ, ν_τ    1/2   1            3                               3
                         electron e                 1/2   2            1                               2
200 MeV ≲ T ≲ 100 GeV    photon γ                   1                                                  1             2(9 + (7/8)·25) = 61.75
                         SU(3) gluons               1                                                  8
                         neutrinos ν_e, ν_μ, ν_τ    1/2   1            3                               3
                         leptons e, μ               1/2   2            2                               4
                         quarks u, d, s             1/2   2            3/2          2         3        18
T ≳ 100 GeV              SU(2) × U_Y(1) bosons      1                                                  3+1           2(14 + (7/8)·45) = 106.75
                         SU(3) gluons               1                                                  8
                         complex Higgs              0                                                  2
                         neutrinos ν_e, ν_μ, ν_τ    1/2   1            3                               3
                         leptons e, μ, τ            1/2   2            3                               6
                         quarks u, d, c, s, t, b    1/2   2            3            2         3        36

Neutrino oscillation data indicate that at least 2 of the 3 neutrino types have masses that would make 
cosmic neutrinos non-relativistic at the present time, §42.4.15. Neutrino oscillations fix only differences 
in squared masses of neutrinos, leaving unconstrained the absolute mass levels. If the lightest neutrino 
has mass m_ν ≲ 10^{-4} eV, equation (10.111), then it would remain relativistic at the present time, and it 
would produce a Cosmic Neutrino Background (CNB) analogous to the CMB. Because neutrinos froze out 
before eē-annihilation, annihilating electrons and positrons dumped their entropy into photons, increasing 
the temperature of photons relative to that of neutrinos. The temperature of the CNB today would be, 
Exercise 10.20, 

$$ T_\nu = \left( \frac{4}{11} \right)^{1/3} 2.725\,\mathrm{K} = 1.945\,\mathrm{K} . \tag{10.105} $$

Sadly, neutrinos interact too weakly for such a background to be detectable with current technology. Like 
the CMB, the CNB should have a (redshifted) thermal distribution inherited from being in thermodynamic 
equilibrium at T ~ 1 MeV. 

Table 10.4 gives approximate values of the effective entropy-weighted number g_s of relativistic particle 
species over various temperature ranges. The extra factor of two for g_s in the final column of Table 10.4 
arises because every particle species has an antiparticle (the two spin states of a photon can be construed 
as each other’s antiparticle). The entropy of a relativistic fermionic species is 7/8 that of a bosonic species, 
Exercise 10.16, equation (10.141). The difference in photon and neutrino temperatures leads to an extra 
factor of 4/11 in the value of g_s today, which, with 1 bosonic species (photons) and 3 fermionic species 
(neutrinos), together with their antiparticles, is, equation (10.152), 

$$ g_{s,0} = 2 \left( 1 + \frac{7}{8} \, \frac{4}{11} \, 3 \right) = \frac{43}{11} = 3.91 . \tag{10.106} $$


A more comprehensive evaluation of g_s is given by Kolb and Turner (1990, Fig. 3.5), and Aghanim et 
al. (2018, Fig. 36). Over the range of energies T ≲ 1 TeV covered by the standard model of physics, there 
are four principal epochs in the evolution of the effective number g_s of relativistic species, punctuated by 
electron-positron annihilation at T ~ 0.5 MeV, the QCD phase transition from bound nuclei to free quarks 
and gluons at T ~ 200 MeV, and the electroweak phase transition above which all standard model particles 
are relativistic at T ~ 100 GeV. There could well be further changes in the number of relativistic species 
at higher temperatures, for example if supersymmetry becomes unbroken at some energy, but at present no 
experimental data constrain the possibilities. 
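
The arithmetic behind the final column of Table 10.4 can be reproduced as follows. This is an illustrative sketch only: the multiplicities are those listed in the table, the rule g_s = 2(bosons + 7/8 fermions) is the one described in the text, and the function and variable names are made up for the example.

```python
from fractions import Fraction

SEVEN_EIGHTHS = Fraction(7, 8)


def g_s_common_temperature(boson_multiplicity, fermion_multiplicity):
    """g_s = 2 (bosons + 7/8 fermions); the factor 2 counts antiparticles."""
    return 2 * (boson_multiplicity + SEVEN_EIGHTHS * fermion_multiplicity)


# Multiplicities read off Table 10.4 for the three hotter epochs.
G_S_BY_EPOCH = {
    "0.5 MeV to 200 MeV": g_s_common_temperature(1, 3 + 2),
    "200 MeV to 100 GeV": g_s_common_temperature(1 + 8, 3 + 4 + 18),
    "above 100 GeV": g_s_common_temperature(3 + 1 + 8 + 2, 3 + 6 + 36),
}

# Today neutrinos are colder than photons, giving the extra factor 4/11.
G_S_TODAY = 2 * (1 + SEVEN_EIGHTHS * Fraction(4, 11) * 3)


def temperature_ratio(z, g_s):
    """T/T_0 at redshift z from equation (10.104), for a given g_s at that epoch."""
    return (1 + z) * (float(G_S_TODAY) / float(g_s)) ** (1.0 / 3.0)
```

For example, `temperature_ratio(1090, G_S_TODAY)` returns 1091 up to rounding, since g_s has not changed between recombination and today.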


10.25 Neutrino mass 


Neutrinos are created naturally by nucleosynthesis in the Sun, and by interaction of cosmic rays with the 
atmosphere. When a neutrino is created (or annihilated) by a weak interaction, it is created in a weak 
eigenstate. Observations of solar and atmospheric neutrinos indicate that neutrino species oscillate into each 
other, implying that the weak eigenstates are not mass eigenstates. The weak eigenstates are denoted ν_e, ν_μ, 
and ν_τ, while the mass eigenstates are denoted ν_1, ν_2, and ν_3. Oscillation data yield mass squared differences 
between the three mass eigenstates (Forero, Tortola, and Valle, 2012) 

$$ |\Delta m_{21}|^2 = (7.6 \pm 0.2) \times 10^{-5}\,\mathrm{eV}^2 \quad \text{solar neutrinos} , \tag{10.107a} $$
$$ |\Delta m_{31}|^2 = (2.4 \pm 0.1) \times 10^{-3}\,\mathrm{eV}^2 \quad \text{atmospheric neutrinos} . \tag{10.107b} $$


The data imply that at least two of the neutrino types have mass. The squared mass difference between m_1 
and m_2 implies that at least one of them must have a mass 

$$ m_{\nu_1} \ \text{or} \ m_{\nu_2} \geq \sqrt{7.6 \times 10^{-5}\,\mathrm{eV}^2} \approx 0.01\,\mathrm{eV} . \tag{10.108} $$

The squared mass difference between m_1 and m_3 implies that at least one of them must have a mass 

$$ m_{\nu_1} \ \text{or} \ m_{\nu_3} \geq \sqrt{2.4 \times 10^{-3}\,\mathrm{eV}^2} \approx 0.05\,\mathrm{eV} . \tag{10.109} $$


The ordering of masses is undetermined by the data. The natural ordering is m_1 < m_2 < m_3, but an inverted 
hierarchy m_3 < m_1 ≈ m_2 is possible. Constraints from the CMB impose an upper limit on the sum of the 
masses of the three neutrino types (Aghanim et al., 2018), 

$$ \sum m_\nu < 0.12\,\mathrm{eV} . \tag{10.110} $$


A direct measurement by the KATRIN experiment yields an upper limit of m_{ν_e} < 1.1 eV on the mass of the 
electron neutrino (Aker et al., 2019). 

The CNB temperature, equation (10.105), is T_ν = 1.945 K = 1.676 × 10^{-4} eV. The redshift z_ν at which a 
neutrino of mass m_ν becomes non-relativistic is then 

$$ 1 + z_\nu = \frac{m_\nu}{T_\nu} = \frac{m_\nu}{1.676 \times 10^{-4}\,\mathrm{eV}} . \tag{10.111} $$

Neutrinos of masses 0.01 eV and 0.05 eV would have become non-relativistic at z_ν ≈ 60 and 300 respectively. 
Only a neutrino of mass ≲ 10^{-4} eV would remain relativistic at the present time. 
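
The numbers quoted around equations (10.105) and (10.111) follow from a short calculation, sketched here in Python. The Boltzmann constant is a rounded standard value, and the function names are invented for the illustration.

```python
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K (rounded)
T_CMB_K = 2.725             # present-day photon temperature in K


def cnb_temperature_kelvin(T_photon_kelvin=T_CMB_K):
    """Neutrino temperature (4/11)^(1/3) T_gamma, equation (10.105)."""
    return (4.0 / 11.0) ** (1.0 / 3.0) * T_photon_kelvin


def one_plus_z_nonrelativistic(m_nu_eV):
    """1 + z at which a neutrino of mass m_nu (in eV) has T_nu = m_nu, equation (10.111)."""
    T_nu_eV = K_B_EV_PER_K * cnb_temperature_kelvin()
    return m_nu_eV / T_nu_eV
```

Masses of 0.01 eV and 0.05 eV give 1 + z of roughly 60 and 300, while a mass of 10^{-4} eV gives 1 + z below one, meaning that such a neutrino is still relativistic today.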


The masses from neutrino oscillation data suggest that at least two species of cosmological neutrinos are 
non-relativistic today. If so, then the neutrino density Ω_ν today is related to the sum Σ m_ν of neutrino 
masses by 

$$ \Omega_\nu = 5.4 \times 10^{-4} \left( \frac{\sum m_\nu}{0.05\,\mathrm{eV}} \right) h^{-2} . \tag{10.112} $$
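
A rough check of the size of Ω_ν is sketched below. It assumes that each neutrino type (neutrinos plus antineutrinos) has number density (3/4)(4/11) n_γ = (3/11) n_γ, as follows from equations (10.140) and (10.151) for instantaneous decoupling, and that the energy density of the massive neutrinos is simply number density times mass. The constants are rounded standard values and the function name is hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K
HBAR_C_EV_CM = 1.973270e-5   # hbar*c in eV cm
G_CGS = 6.6743e-8            # Newton's constant in cm^3 g^-1 s^-2
EV_PER_GRAM = 5.609589e32    # c^2 expressed as eV per gram
H100_PER_S = 3.240779e-18    # 100 km/s/Mpc in s^-1
ZETA3 = 1.2020569            # Riemann zeta(3)


def omega_nu_h2(sum_m_nu_eV, T0_kelvin=2.725):
    """Omega_nu h^2 for non-relativistic neutrinos with the given summed mass in eV."""
    x = K_B_EV_PER_K * T0_kelvin / HBAR_C_EV_CM
    n_gamma = 2.0 * ZETA3 * x**3 / math.pi**2          # photons per cm^3
    n_nu_per_type = (3.0 / 11.0) * n_gamma             # nu + anti-nu per type
    rho_crit_over_h2 = (3.0 * H100_PER_S**2 / (8.0 * math.pi * G_CGS)) * EV_PER_GRAM  # eV/cm^3
    return n_nu_per_type * sum_m_nu_eV / rho_crit_over_h2
```

For a summed mass of 0.05 eV this simple estimate gives Ω_ν h² ≈ 5.3 × 10^{-4}, within a few percent of the coefficient 5.4 × 10^{-4} appearing in equation (10.112); the small difference is presumably due to refinements not included in this sketch.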

The number and entropy densities of neutrinos today are unaffected by whether they are relativistic, so 
the effective number- and entropy-weighted numbers g_{n,0} and g_{s,0} are unaffected. On the other hand the 
energy density of neutrinos today does depend on whether or not they are relativistic. If just one neutrino 
type is relativistic and the other two are non-relativistic, then the effective energy-weighted number g_{ρ,0} of 
relativistic species today is 

$$ g_{\rho,0} = 2 + \left( \frac{4}{11} \right)^{4/3} \frac{7}{8} \, 2 = 2.45 . \tag{10.113} $$

The density Ω_r of relativistic particles today is Ω_r = (g_{ρ,0}/2) Ω_γ. 


10.25.1 The neutrino mass puzzle 


The experimental fact that neutrinos have mass is puzzling. The other salient experimental property of 
neutrinos is that they are left-handed (and anti-neutrinos are right-handed). A particle whose spin and 
momentum point in the same direction is called right-handed, while a particle whose spin and momentum 
point in opposite directions is called left-handed. The handedness of a particle is also called its chirality. For 
massless particles, chirality is Lorentz-invariant: a massless particle that is purely left-handed in one frame 
remains purely left-handed in any Lorentz-transformed frame. 

The problem is that a particle cannot be both massive and purely left- or right-handed. A massive particle 
that looks left-handed, spin anti-aligned with its momentum, in one frame, looks right-handed to an observer 
who overtakes the particle from behind. This does not immediately contradict the experimental fact that 
neutrinos are both massive and left-handed, since in all experiments neutrinos are highly relativistic, in which 
case the left-handed components are boosted exponentially compared to the right-handed components, as 
illustrated in Figure 10.17. But in principle an observer could overtake a left-handed neutrino, which the 
observer would then see as right-handed. But then where are the right-handed neutrinos? It is not enough to 
say that right-handed neutrinos are too weakly interacting to have been observed. A right-handed neutrino 
observed from behind would look like a left-handed neutrino and thereby become interacting, so right-handed 
neutrinos should make themselves felt in cosmology. 

Figure 10.17 According to the Standard Model of physics, a massive fermion acquires its mass by interacting with 
the Higgs field. The interaction flips the fermion between left- and right-handed chiralities as it propagates through 
spacetime, as illustrated schematically in this spacetime diagram. In the fermion’s rest frame, its wavefunction is a 
linear combination of left- and right-handed chiralities with equal amplitudes (in absolute value). Boosting the fermion 
in a direction opposite to its spin amplifies the left-handed component by a boost factor e^{θ/2} and deamplifies the 
right-handed component by e^{−θ/2}, so a fermion moving relativistically appears almost entirely left-handed. 

A leading idea to solve the problem of neutrino mass is that neutrinos are so-called Majorana fermions, 
which have the defining property that when observed from behind they not only switch from left- to right- 
handed, but also from particle to antiparticle. Thus a left-handed neutrino observed from behind looks like 
a right-handed antineutrino. Switching from particle to antiparticle would violate charge conservation, so 
other fermions, namely electrons and quarks, cannot be Majorana fermions because they possess conserved 
charges (electric charge and color charge). Left-handed neutrinos have weak isospin and weak hypercharge, 
but those charges are not strictly conserved at energies below the ~ 100 GeV scale at which the electroweak 
U_Y(1) × SU(2) symmetry breaks down to the U_em(1) electromagnetic symmetry. Thus at energies below the 
electroweak scale, neutrinos can be massive Majorana fermions without violating any strict conservation law. 


The problem of neutrino mass is resumed in §42.3.1. 


10.26 Occupation number, number density, and energy-momentum 


A careful treatment of the evolution of the number and energy-momentum densities of species in a FLRW 
universe requires consideration of their momentum distributions. 

In this section, including the Exercises, units c, ħ, and G are kept explicit, but the Boltzmann constant is 
set to unity, k = 1, which is equivalent to measuring temperature T in units of energy. 


10.26.1 Occupation number 


Choose a locally inertial frame attached to an observer. The distribution of a particle species in the observer’s 
frame is described by a dimensionless scalar occupation number f(t, x, p) that specifies the number dN of 
particles at the observer’s position x^m = {t, x} with momentum p^k = {E, p} in a dimensionless Lorentz- 
invariant 6-dimensional volume of phase space, 

$$ dN = f(t, \boldsymbol{x}, \boldsymbol{p}) \, \frac{g \, d^3r \, d^3p}{(2\pi\hbar)^3} , \tag{10.114} $$

with g being the number of spin states of the particle. Here d³r and d³p denote the proper spatial and 
momentum 3-volume elements in the observer’s locally inertial frame. The quantum mechanical normalization 
factor (2πħ)³ ensures that f counts the number of particles per free-particle quantum state. If the particle 
species has rest mass m, then its energy E is related to its momentum by E² − p²c² = m²c⁴, which explains 
why the occupation number is treated as a function only of momentum p. 

The phase-space volume element d³r d³p is a scalar, invariant under Lorentz transformations of the ob- 
server’s frame. In fact, as shown in §4.22.1, the phase-space volume element is invariant under any canonical 
transformation of coordinates and momenta, which includes not only Lorentz transformations but also a 
broad range of other transformations. For example, in place of d³r d³p it would be possible to use the 
phase-space volume element d³x d³π formed out of the spatial comoving coordinates x^α and their conjugate 
generalized momenta π_α. 

The Lorentz invariance of the phase-space volume element d³r d³p can be demonstrated more simplistically 
as follows. First, the 3-volume element d³r is related to the scalar 4-volume element dt d³r by 

$$ \frac{dt \, d^3r}{d\lambda} = E \, d^3r , \tag{10.115} $$

since dt/dλ = E. The left hand side of equation (10.115) is the derivative of the observer’s 4-volume dt d³r 
with respect to the affine parameter dλ = dτ/m, with τ the observer’s proper time. Since both the 4- 
volume and affine parameter are scalars, it follows that E d³r is a scalar (actually, dt d³r = dt dr¹ dr² dr³ is 
a pseudoscalar, not a scalar, as is E d³r = p⁰ dr¹ dr² dr³; see Chapter 15). Second, the momentum 3-volume 
element d³p is related to the scalar 4-volume element dE d³p by 

$$ \delta(E^2 - p^2 c^2 - m^2 c^4) \, dE \, d^3p = \frac{d^3p}{2E} , \tag{10.116} $$

where the Dirac delta-function enforces conservation of the particle rest mass m. The 4-volume d⁴p is a 
scalar, and the delta-function is a function of a scalar argument, hence d³p/E is likewise a scalar (again, 
dE d³p = −dp_0 dp_1 dp_2 dp_3 and d³p/E = dp_1 dp_2 dp_3/p⁰ are actually pseudoscalars, not scalars). Since E d³r 
and d³p/E are both Lorentz-invariant (pseudo-)scalars, so is their product, the phase space volume d³r d³p 
(which is a genuine scalar). 
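
The invariance of d³p/E can be illustrated numerically. The sketch below works in units c = 1 and considers a boost along the x-direction, for which the transverse momenta are unchanged, so the Jacobian determinant of the transformation p → p′ reduces to ∂p′_x/∂p_x. Invariance of d³p/E is then the statement that this Jacobian equals E′/E. The function names and the sample momentum are choices made for the example.

```python
import math


def boost_x(p, m, v):
    """Boost the momentum p = (px, py, pz) of a particle of mass m by velocity v along x (c = 1).

    Returns the boosted momentum and the boosted energy.
    """
    px, py, pz = p
    E = math.sqrt(px * px + py * py + pz * pz + m * m)
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (px - v * E), py, pz), gamma * (E - v * px)


def jacobian_of_boost(p, m, v, h=1e-6):
    """Central-difference estimate of d p'_x / d p_x at fixed py, pz.

    Because p'_y = p_y and p'_z = p_z, this equals the full Jacobian determinant.
    """
    px, py, pz = p
    upper, _ = boost_x((px + h, py, pz), m, v)
    lower, _ = boost_x((px - h, py, pz), m, v)
    return (upper[0] - lower[0]) / (2.0 * h)


def energy_ratio(p, m, v):
    """E'/E for the same boost."""
    E = math.sqrt(sum(c * c for c in p) + m * m)
    _, E_boosted = boost_x(p, m, v)
    return E_boosted / E
```

For p = (0.3, 0.4, 1.2), m = 1 and v = 0.6, both `jacobian_of_boost` and `energy_ratio` return approximately 1.113, so that d³p′/E′ = d³p/E.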


10.26.2 Occupation number in a FLRW universe 


The homogeneity and isotropy of a FLRW universe imply that, for a comoving observer, the occupation 
number f is independent of position and direction, 


f(t,x,p) = f(t, p). (10.117) 


10.26.3 Number density 


In the locally inertial frame of an observer, the number density and flux of a particle species form a 4-vector 
n^k, 

$$ n^k = \int p^k \, f(t, \boldsymbol{x}, \boldsymbol{p}) \, \frac{g \, d^3p}{E \, (2\pi\hbar)^3} . \tag{10.118} $$

In particular, the number density n⁰, with units number of particles per unit proper volume, is the time 
component of the number current, 

$$ n^0 = \int f(t, \boldsymbol{x}, \boldsymbol{p}) \, \frac{g \, d^3p}{(2\pi\hbar)^3} . \tag{10.119} $$

In a FLRW universe, the spatial components of the number flux vanish by isotropy, so the only non- 
vanishing component is the time component n⁰, which is just the proper number density n of the particle 
species, 

$$ n = n^0 = \int f(t, p) \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} . \tag{10.120} $$


10.26.4 Energy-momentum tensor 


In the locally inertial frame of an observer, the energy-momentum tensor T^{kl} of a particle species is 

$$ T^{kl} = \int p^k p^l \, f(t, \boldsymbol{x}, \boldsymbol{p}) \, \frac{g \, d^3p}{E \, (2\pi\hbar)^3} . \tag{10.121} $$

For a FLRW universe, homogeneity and isotropy imply that the energy-momentum tensor in the locally 
inertial frame of a comoving observer is diagonal, with time component T^{00} = ρ, and isotropic spatial 
components T^{ab} = p δ_{ab}. The proper energy density ρ of a particle species is 

$$ \rho = \int E \, f(t, p) \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} , \tag{10.122} $$

and the proper isotropic pressure p is (don’t confuse pressure p on the left hand side with momentum p on 
the right hand side) 

$$ p = \int \frac{p^2 c^2}{3E} \, f(t, p) \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} . \tag{10.123} $$

10.27 Occupation numbers in thermodynamic equilibrium 


Frequent collisions tend to drive a system towards thermodynamic equilibrium. Electron-photon scattering 
keeps photons in near equilibrium with electrons, while Coulomb scattering keeps electrons in near equi- 
librium with ions, primarily hydrogen ions (protons) and helium nuclei. Thus photons and baryons can be 
treated as having unperturbed distributions in mutual thermodynamic equilibrium. 

In thermodynamic equilibrium at temperature T, the occupation numbers of fermions, which obey an 
exclusion principle, and of bosons, which obey an anti-exclusion principle, are 

$$ f = \begin{cases} \dfrac{1}{e^{(E-\mu)/T} + 1} & \text{fermion} , \\[2ex] \dfrac{1}{e^{(E-\mu)/T} - 1} & \text{boson} , \end{cases} \tag{10.124} $$

where μ is the chemical potential of the species. In the limit of small occupation numbers, f ≪ 1, equivalent 
to large negative chemical potential, μ → −large, both fermion and boson distributions go over to the 
Boltzmann distribution 

$$ f = e^{(-E+\mu)/T} \quad \text{Boltzmann} . \tag{10.125} $$
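
The approach to the Boltzmann limit can be seen by evaluating the three distributions directly. The following is a minimal sketch in units k = 1; the function name and the string labels are invented for the example.

```python
import math


def occupation(E, T, mu, kind):
    """Equilibrium occupation number, equations (10.124) and (10.125), units k = 1."""
    x = (E - mu) / T
    if kind == "fermion":
        return 1.0 / (math.exp(x) + 1.0)
    if kind == "boson":
        return 1.0 / (math.exp(x) - 1.0)  # requires mu < E
    if kind == "boltzmann":
        return math.exp(-x)
    raise ValueError("kind must be 'fermion', 'boson', or 'boltzmann'")
```

At E = T = 1 and μ = −10 the occupation number is about 1.7 × 10^{-5}, and the fermion and boson values differ from the Boltzmann value only at that same fractional level, the fermion value lying slightly below and the boson value slightly above.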


Chemical potential is the thermodynamic potential associated with conservation of number. There is a 
distinct potential for each conserved species. For example, radiative recombination and photoionization of 
hydrogen, 

$$ p + e \leftrightarrow \mathrm{H} + \gamma , \tag{10.126} $$

separately preserves proton and electron number, hydrogen being composed of one proton and one electron. 
In thermodynamic equilibrium, the chemical potential μ_H of hydrogen is the sum of the chemical potentials 
μ_p and μ_e of protons and electrons, 

$$ \mu_p + \mu_e = \mu_{\mathrm{H}} . \tag{10.127} $$

Photon number is not conserved, so photons have zero chemical potential, 

$$ \mu_\gamma = 0 , \tag{10.128} $$
which is closely associated with the fact that photons are their own antiparticles. For photons, which are 
bosons, the thermodynamic distribution (10.124) becomes the Planck distribution, 

$$ f = \frac{1}{e^{E/T} - 1} . \tag{10.129} $$


Exercise 10.13. Distribution of non-interacting particles initially in thermodynamic equilibrium. 
The number dN of a particle species in an interval d³r d³p of phase space (proper positions r and 
proper momenta p, not to be confused with the same symbol p for pressure) for an ideal gas of free particles 
(non-relativistic, relativistic, or anything in between) in thermodynamic equilibrium at temperature T and 
chemical potential μ is 

$$ dN = f \, \frac{g \, d^3r \, d^3p}{(2\pi\hbar)^3} , \tag{10.130} $$

where the occupation number f is (units k = 1, where k is the Boltzmann constant) 

$$ f = \frac{1}{e^{(E-\mu)/T} \pm 1} , \tag{10.131} $$

with a + sign for fermions and a − sign for bosons. The energy E and momentum p of particles of mass m 
are related by E² = p²c² + m²c⁴. For bosons, the chemical potential is constrained to satisfy μ < E, but for 
fermions μ may take any positive or negative value, with μ ≫ E corresponding to a degenerate Fermi gas. 
As the Universe expands, proper distances increase as r ∝ a, while proper momenta decrease as p ∝ a^{-1}, so 
the phase space volume d³r d³p remains constant. 

1. Occupation number. Write down an expression for the occupation number f(p) of a distribution of 
particles that start in thermodynamic equilibrium and then remain non-interacting while the Universe 
expands by a factor a. 

2. Relativistic particles. Conclude that a distribution of non-interacting relativistic particles initially 
in thermodynamic equilibrium retains its thermodynamic equilibrium distribution in a FLRW universe 
as long as the particles remain relativistic. How do the temperature T and chemical potential μ of the 
relativistic distribution vary with cosmic scale factor a? 

3. Non-relativistic particles. Show similarly that a distribution of non-interacting non-relativistic par- 
ticles initially in thermodynamic equilibrium remains thermal. How do the temperature T and chemical 
potential μ − m of the non-relativistic distribution vary with cosmic scale factor a? 

4. Transition from relativistic to non-relativistic. What happens to a distribution of non-interacting 
particles that are relativistic in thermodynamic equilibrium, but redshift to being non-relativistic? 


Exercise 10.14. The first law of thermodynamics with non-conserved particle number. As seen 
in §10.9.2, the first law of thermodynamics in the form 

$$ T \, d(a^3 s) = d(a^3 \rho) + p \, d(a^3) = 0 \tag{10.132} $$

is built into Friedmann’s equations. But what happens when for example the temperature falls through the 
temperature T ~ 0.5 MeV at which electrons and positrons annihilate? Won’t there be entropy production 
associated with eē annihilation? Should not the first law of thermodynamics actually say 

$$ T \, d(a^3 s) = d(a^3 \rho) + p \, d(a^3) - \sum_X \mu_X \, d(a^3 n_X) , \tag{10.133} $$

with the last term taking into account the variation in the number a³n_X of various species X? 

Solution. Each distinct chemical potential μ_X is associated with a conserved number, so the additional 
terms contribute zero change to the entropy, 

$$ \sum_X \mu_X \, d(a^3 n_X) = 0 , \tag{10.134} $$

as long as the species are in mutual thermodynamic equilibrium. For example, positrons and electrons in 
thermodynamic equilibrium satisfy μ_ē = −μ_e, and 

$$ \mu_e \, d(a^3 n_e) + \mu_{\bar e} \, d(a^3 n_{\bar e}) = \mu_e \, d(a^3 n_e - a^3 n_{\bar e}) = 0 , \tag{10.135} $$

which vanishes because the difference a³n_e − a³n_ē between electron and positron number is conserved. Thus 
the entropy conservation equation (10.132) remains correct in a FLRW universe even when number changing 
processes are occurring. 


Exercise 10.15. Number, energy, pressure, and entropy of a relativistic ideal gas at zero chemical 
potential. The number density n, energy density ρ, and pressure p of an ideal gas of a single species 
of free particles are given by equations (10.120), (10.122), and (10.123), with occupation number (10.131). 
Show that for an ideal relativistic gas of g bosonic species in thermodynamic equilibrium at temperature T 
and zero chemical potential, μ = 0, the number density n, energy density ρ, and pressure p are (units k = 1; 
number density n in units 1/volume, energy density ρ and pressure p in units energy/volume) 

$$ n = g \, \frac{\zeta(3) \, T^3}{\pi^2 c^3 \hbar^3} , \qquad \rho = 3p = g \, \frac{\pi^2 T^4}{30 \, c^3 \hbar^3} , \tag{10.136} $$

where ζ(3) = 1.2020569 is a Riemann zeta function. The entropy density s of an ideal gas of free particles 
in thermodynamic equilibrium at zero chemical potential is 

$$ s = \frac{\rho + p}{T} . \tag{10.137} $$

Conclude that the entropy density s of an ideal relativistic gas of g bosonic species in thermodynamic 
equilibrium at temperature T and zero chemical potential is (units 1/volume) 

$$ s = g \, \frac{2\pi^2 T^3}{45 \, c^3 \hbar^3} . \tag{10.138} $$


Exercise 10.16. A relation between thermodynamic integrals. Prove that 

$$ \int_0^\infty \frac{x^{n-1} \, dx}{e^x + 1} = \left( 1 - 2^{1-n} \right) \int_0^\infty \frac{x^{n-1} \, dx}{e^x - 1} . \tag{10.139} $$

[Hint: Use the fact that (e^x + 1)(e^x − 1) = (e^{2x} − 1).] Hence argue that the ratios of number, energy, and 
entropy densities of relativistic fermionic (f) to relativistic bosonic (b) species in thermodynamic equilibrium 
at the same temperature are 

$$ \frac{n_f}{n_b} = \frac{3}{4} , \qquad \frac{\rho_f}{\rho_b} = \frac{s_f}{s_b} = \frac{7}{8} . \tag{10.140} $$

Conclude that if the number n, energy ρ, and entropy s of a mixture of bosonic and fermionic species in 
thermodynamic equilibrium at the same temperature T are written in the form of equations (10.136) and 
(10.138), then the effective number-, energy-, and entropy-weighted numbers g of particle species are, in 
terms of the number g_b of bosonic and g_f of fermionic species, 

$$ g_n = g_b + \frac{3}{4} \, g_f , \qquad g_\rho = g_s = g_b + \frac{7}{8} \, g_f . \tag{10.141} $$
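
The fermion-to-boson ratios 3/4 and 7/8 can be checked numerically before attempting the proof. The sketch below estimates the two integrals of equation (10.139) with a midpoint rule and compares their ratio with 1 − 2^{1−n}; the cutoff, step count, and function names are arbitrary choices for the illustration.

```python
import math


def thermal_integral(n, sign, x_max=60.0, n_steps=60000):
    """Midpoint-rule estimate of the integral of x^(n-1)/(e^x + sign) from 0 to infinity.

    sign = +1 gives the fermionic integral, sign = -1 the bosonic one.
    """
    dx = x_max / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx
        total += x ** (n - 1) / (math.exp(x) + sign)
    return total * dx


def fermi_to_bose_ratio(n):
    """Ratio of the fermionic to the bosonic integral for a given n."""
    return thermal_integral(n, +1.0) / thermal_integral(n, -1.0)
```

The case n = 3 corresponds to number density and gives 3/4, while n = 4 corresponds to energy density (and hence entropy density) and gives 7/8.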


Exercise 10.17. Relativistic particles in the early Universe had approximately zero chemical 
potential. Show that the small particle-antiparticle asymmetry of our Universe implies that to a good 
approximation relativistic particles in thermodynamic equilibrium in the early Universe had zero chemical 
potential. 

Solution. The chemical potentials of particles X and antiparticles X̄ in thermodynamic equilibrium are 
necessarily related by 

$$ \mu_{\bar X} = -\mu_X . \tag{10.142} $$

If the particle-antiparticle asymmetry is denoted η, defined for relativistic particles by 
nx —nx =x , (10.143) 


then μ_X/T ~ η. More accurately, to linear order in η, 


2 
[Lx T { 1 (bosons) (10.144) 


eos ¢(3) 2 (fermions) 


Exercise 10.18. Entropy per particle. The entropy of an ideal gas of free particles in thermodynamic 
equilibrium is 

$$ s = \frac{\rho + p - \mu n}{T} . \tag{10.145} $$

Argue that the entropy per particle s/n is a quantity of order unity, whether particles are relativistic or 
non-relativistic. 

Solution. For relativistic particles with zero chemical potential, equations (10.136) and (10.138), together 
with the fermion-to-boson ratios (10.140), imply that the entropy per particle is 

$$ \frac{s}{n} = \frac{2\pi^4}{45 \, \zeta(3)} \times \begin{cases} 1 = 3.6 & \text{(bosons)} , \\ \frac{7}{6} = 4.2 & \text{(fermions)} . \end{cases} \tag{10.146} $$

For a non-relativistic species, the number density n is related to the temperature T and chemical potential 
μ by 

$$ n = g \left( \frac{mT}{2\pi\hbar^2} \right)^{3/2} e^{(\mu - m)/T} . \tag{10.147} $$

Under cosmological conditions, the occupation number of non-relativistic species was small, e^{(μ−m)/T} ≪ 1. 
However, tiny occupation numbers correspond to values of (μ − m)/T that are only logarithmically large 
(negative). The entropy per particle of a non-relativistic species is 

$$ \frac{s}{n} = \frac{5}{2} + \ln \left[ \frac{g}{n} \left( \frac{mT}{2\pi\hbar^2} \right)^{3/2} \right] , \tag{10.148} $$

which remains modest even if the argument of the logarithm is huge. 

Exercise 10.19. Photon temperature at high redshift versus today. Use entropy conservation, a³s = 
constant, to argue that the ratio of the photon temperature T at redshift z in the early Universe to the photon 
temperature T_0 today is as given by equation (10.104). 


Exercise 10.20. Cosmic Neutrino Background. Neutrino oscillation data imply mass squared differences 
that indicate that at least 2 of the 3 neutrino types are massive today, equations (10.108) and (10.109). 
The oscillation data do not constrain the offset from zero mass. A neutrino of mass ≲ 10^{-4} eV would remain 
relativistic at the present time, equation (10.111), and would produce a Cosmic Neutrino Background. Neutrinos 
that are non-relativistic today would have clustered gravitationally, similar to collisionless non-baryonic 
dark matter, except that the fermionic character of neutrinos means that they could become degenerate 
(occupation number almost 1) in regions of high density, such as in the cores of galaxies. 

1. Temperature of the CNB. Weak interactions were fast enough to keep neutrinos in thermodynamic 
equilibrium with protons and neutrons, hence with photons, electrons, and positrons up to just before 
eē annihilation, but then neutrinos decoupled. When electrons and positrons annihilated, they dumped 
their entropy into that of photons, leaving the entropy of neutrinos unchanged. Argue that conservation 
of comoving entropy implies 

$$ a^3 T^3 \left( g_\gamma + \tfrac{7}{8} g_e \right) = a^3 T_\gamma^3 \, g_\gamma , \tag{10.149a} $$
$$ a^3 T^3 g_\nu = a^3 T_\nu^3 \, g_\nu , \tag{10.149b} $$

where the left hand sides refer to quantities before eē annihilation, which happened at T ~ 0.5 MeV, and 
the right hand sides to quantities after eē annihilation (including today). Deduce the ratio of neutrino 
to photon temperatures today, 

$$ \frac{T_\nu}{T_\gamma} . \tag{10.150} $$

Does the temperature ratio (10.150) depend on the number of neutrino types? What is the neutrino 
temperature today in K, if the photon temperature today is 2.725 K? 

2. Effective number of relativistic particle species. Because the temperatures of photons and neutrinos 
are different, the effective number g of relativistic species today is not given by equations (10.141). 
What are the effective number-, energy-, and entropy-weighted numbers g_{n,0}, g_{ρ,0}, and g_{s,0} of relativistic 
particle species today? What are their arithmetic values if the relativistic species consist of photons and 
three species of neutrino? How are these values altered if, as is likely, neutrinos today are non-relativistic? 

Solution. The ratio of neutrino to photon temperatures after eē annihilation is 

$$ \frac{T_\nu}{T_\gamma} = \left( \frac{g_\gamma}{g_\gamma + \frac{7}{8} g_e} \right)^{1/3} = \left( \frac{4}{11} \right)^{1/3} = 0.714 . \tag{10.151} $$

No, the temperature ratio does not depend on the number of neutrino types. The ratio depends on neutrinos 
having decoupled a short time before eē-annihilation. Equation (10.151) implies that the CNB temperature 
is given by equation (10.105). With 2 bosonic degrees of freedom from photons, and 6 fermionic degrees of 
freedom from 3 relativistic neutrino types, the effective number-, energy-, and entropy-weighted number of 
relativistic degrees of freedom is 

$$ g_{n,0} = g_\gamma + \left( \frac{T_\nu}{T_\gamma} \right)^3 \frac{3}{4} \, g_\nu = 2 + \frac{4}{11} \, \frac{3}{4} \, 6 = \frac{40}{11} = 3.64 , \tag{10.152a} $$
$$ g_{\rho,0} = g_\gamma + \left( \frac{T_\nu}{T_\gamma} \right)^4 \frac{7}{8} \, g_\nu = 2 + \left( \frac{4}{11} \right)^{4/3} \frac{7}{8} \, 6 = 3.36 , \tag{10.152b} $$
$$ g_{s,0} = g_\gamma + \left( \frac{T_\nu}{T_\gamma} \right)^3 \frac{7}{8} \, g_\nu = 2 + \frac{4}{11} \, \frac{7}{8} \, 6 = \frac{43}{11} = 3.91 . \tag{10.152c} $$

Neutrinos today interact too weakly to annihilate, so their number and entropy today are those of relativistic 
species even if they are non-relativistic today. However, their energy density today is not that of relativistic 
particles. 
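
The three effective numbers can be tabulated for any number of neutrino types that remain relativistic today. The sketch below follows the structure of equations (10.152), with the assumption, stated in the solution above, that the number and entropy of neutrinos are unaffected by whether they are relativistic, whereas only relativistic types contribute to the energy-weighted number. The function and argument names are made up for the illustration.

```python
def effective_numbers_today(n_relativistic_types=3, n_types=3):
    """Return (g_n0, g_rho0, g_s0) for photons plus neutrinos today.

    n_types counts all neutrino types; n_relativistic_types counts those still
    relativistic, which alone contribute to the energy-weighted number.
    """
    temperature_ratio_cubed = 4.0 / 11.0          # (T_nu / T_gamma)^3
    g_gamma = 2.0                                 # two photon spin states
    g_nu_all = 2.0 * n_types                      # neutrino plus antineutrino per type
    g_nu_relativistic = 2.0 * n_relativistic_types
    g_n0 = g_gamma + temperature_ratio_cubed * 0.75 * g_nu_all
    g_rho0 = g_gamma + temperature_ratio_cubed ** (4.0 / 3.0) * 0.875 * g_nu_relativistic
    g_s0 = g_gamma + temperature_ratio_cubed * 0.875 * g_nu_all
    return g_n0, g_rho0, g_s0
```

With all three types relativistic this returns approximately (3.64, 3.36, 3.91); with a single relativistic type the energy-weighted number drops to about 2.45, equation (10.113), while the other two are unchanged.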


Exercise 10.21. Abundance of electrons and positrons in thermodynamic equilibrium. Calculate 
and plot the comoving number densities a³n of photons, electrons and positrons in thermodynamic 
equilibrium as the temperature T cooled through the electron mass m_e. 
Solution. The results are shown in Figure 10.16. 

Start by considering the more general situation of an ideal gas of any species, either fermionic or bosonic, 
rest mass m, in thermodynamic equilibrium at temperature T and chemical potential μ in a volume V. The 
logarithm of the grand partition function Z_G of such an ideal gas is (units c = k = 1) 

$$ \ln Z_G = \pm V \int \ln \left[ 1 \pm e^{(-E+\mu)/T} \right] \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} , \tag{10.153} $$

where the ± signs are + for fermions, − for bosons. The laws of thermodynamics state that energy density ρ, 
number density n, and pressure p (not to be confused with the same symbol for momentum p) in thermodynamic 
equilibrium are given by partial derivatives of ln Z_G with respect to −1/T, μ/T, and V, 

$$ \rho = \frac{1}{V} \frac{\partial \ln Z_G}{\partial(-1/T)} , \qquad n = \frac{1}{V} \frac{\partial \ln Z_G}{\partial(\mu/T)} , \qquad p = T \, \frac{\partial \ln Z_G}{\partial V} . \tag{10.154} $$

For an ideal gas, ln Z_G is proportional to volume V, and 

$$ \frac{\ln Z_G}{V} = \frac{p}{T} = s - \frac{\rho}{T} + \frac{\mu n}{T} , \tag{10.155} $$


where s is the entropy density. 
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At the present time, the observed small baryon-to-photon ratio n/n. of the Universe implies a similarly 
small electron-to-photon ratio n/n, from equations (10.103) and (31.8), 


$$\frac{n_e}{n_\gamma} = \frac{n_e}{n_b} \, \frac{n_b}{n_\gamma} = 5.4 \times 10^{-10} . \tag{10.156}$$


The small electron-to-photon ratio today implies a small electron-positron asymmetry before electron-positron annihilation, implying $\mu_e/T \ll 1$ before electron-positron annihilation.

As long as the particle-antiparticle asymmetry is small when relativistic, an approximation to the grand partition function that holds asymptotically at both high and low temperatures, and is accurate to better than 5% at intermediate temperatures, is

$$\ln Z_G \approx \frac{g V T^3}{2 \pi^2 \hbar^3} \, e^{(\mu - m)/T} \, c_0 \left( 1 + c_1 \frac{m}{T} \right)^{3/2} , \tag{10.157}$$


where the constants $c_0$ and $c_1$ for respectively fermions and bosons are

$$c_0 = \left\{ \frac{7}{8} , 1 \right\} \frac{\pi^4}{45} \approx \{ 1.894 , 2.165 \} , \quad c_1 = \left( \frac{\pi}{2 c_0^2} \right)^{1/3} = \{ 0.759 , 0.695 \} . \tag{10.158}$$
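
The numerical values in equation (10.158) are easy to confirm. The following minimal sketch (function and variable names are illustrative, not from the text) evaluates $c_0$ and $c_1$, and also checks that $c_0 c_1^{3/2} = (\pi/2)^{1/2}$, which is the combination needed for the approximation (10.157) to reduce to the usual non-relativistic ideal gas at $T \ll m$.

```python
import math

def c0_c1(weight):
    """Return (c0, c1): c0 = weight * pi^4/45 and c1 = (pi / (2 c0^2))^(1/3)."""
    c0 = weight * math.pi ** 4 / 45
    c1 = (math.pi / (2 * c0 ** 2)) ** (1 / 3)
    return c0, c1

# Fermions carry weight 7/8 relative to bosons in the high-temperature limit.
for label, weight in (("fermions", 7 / 8), ("bosons", 1.0)):
    c0, c1 = c0_c1(weight)
    low_temperature_product = c0 * c1 ** 1.5   # equals sqrt(pi/2) by construction
    print(label, round(c0, 3), round(c1, 3), round(low_temperature_product, 6))
```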


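The partial derivatives (10.154) of the approximate logarithmic grand partition function (10.157) yield the number density $n$, energy density $\rho$, and pressure $p$,

$$n \approx \frac{g T^3}{2 \pi^2 \hbar^3} \, e^{(\mu - m)/T} \, c_0 \left( 1 + c_1 \frac{m}{T} \right)^{3/2} , \tag{10.159a}$$

$$\rho \approx n ( m + q T ) , \tag{10.159b}$$

$$p \approx n T , \tag{10.159c}$$

where the factor $q$ is

$$q \equiv \frac{3 + \frac{3}{2} c_1 m/T}{1 + c_1 m/T} , \tag{10.160}$$

which varies from $q = 3$ at $T \gg m$ to $q = \frac{3}{2}$ at $T \ll m$. The entropy density $s$ is

$$s = \frac{\rho + p - \mu n}{T} \approx n \left( 1 + q + \frac{m - \mu}{T} \right) . \tag{10.161}$$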


The entropy in photons, which have $q = 3$ and $m = \mu = 0$, is $s_\gamma = 4 n_\gamma$. The total entropy in all particles can be written

$$s = 2 g_s n_\gamma , \tag{10.162}$$

which defines the effective entropy-weighted number $g_s$ of relativistic species. The total comoving entropy $a^3 s$ is conserved. Conservation of comoving entropy implies that the cube of the product of scale factor $a$ and temperature $T$ is inversely proportional to the effective entropy-weighted number $g_s$ of relativistic species,

$$\left( \frac{a T}{a_0 T_0} \right)^3 = \frac{g_{s,0}}{g_s} . \tag{10.163}$$

In the problem being considered, when electrons and positrons annihilate, they dump their entropy into photons, conserving the total comoving entropy of photons, electrons, and positrons as the Universe expands. The effective entropy-weighted number $g_s$ of photons $\gamma$, electrons $e$, and positrons $\bar e$ through electron-positron annihilation is approximately

$$g_s = \frac{s}{2 n_\gamma} \approx \frac{1}{2 n_\gamma} \left[ 4 n_\gamma + \left( 1 + q_e + \frac{m_e - \mu_e}{T} \right) n_e + \left( 1 + q_e + \frac{m_e + \mu_e}{T} \right) n_{\bar e} \right]$$

$$\approx 2 + \frac{7}{8} \left[ \left( 1 + q_e + \frac{m_e}{T} \right) \cosh(\mu_e/T) - \frac{\mu_e}{T} \sinh(\mu_e/T) \right] e^{-m_e/T} \left( 1 + c_1 \frac{m_e}{T} \right)^{3/2} . \tag{10.164}$$


For the purposes of calculating how the cosmic scale factor $a$ changes with temperature $T$ during electron-positron annihilation, it suffices to approximate the electron chemical potential as zero, $\mu_e \approx 0$, since before annihilation, when electrons and positrons are relativistic, the chemical potential is much less than the temperature, $\mu_e/T \ll 1$, and after annihilation electrons (and positrons) contribute little to the entropy, and the value of the chemical potential ceases to make much difference. Thus the effective entropy-weighted number $g_s$ of photons, electrons, and positrons is approximately

$$g_s \approx 2 + \frac{7}{8} \left( 1 + q_e + \frac{m_e}{T} \right) e^{-m_e/T} \left( 1 + c_1 \frac{m_e}{T} \right)^{3/2} . \tag{10.165}$$


Inserting the expression (10.165) into equation (10.163) yields the cosmic scale factor a in terms of temper- 
ature T through electron-positron annihilation. 
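
As an illustration of this step, the sketch below evaluates $g_s$ from equation (10.165) as a function of $x = m_e/T$ and then the product $aT$ from equation (10.163). It is a minimal sketch, not the code used to produce Figure 10.16: the function names are invented here, the fermion constants are those of equation (10.158), and the present-day value $g_{s,0} = 2$ (photons only) is assumed.

```python
import math

C0 = 7 / 8 * math.pi ** 4 / 45               # fermion c_0, equation (10.158)
C1 = (math.pi / (2 * C0 ** 2)) ** (1 / 3)    # fermion c_1, equation (10.158)

def q_factor(x):
    """Factor q of equation (10.160), with x = m_e/T."""
    return (3 + 1.5 * C1 * x) / (1 + C1 * x)

def g_s(x):
    """Entropy-weighted g_s of equation (10.165), with x = m_e/T and mu_e = 0."""
    return 2 + 7 / 8 * (1 + q_factor(x) + x) * math.exp(-x) * (1 + C1 * x) ** 1.5

def a_times_T(x, g_s0=2.0):
    """aT / (a_0 T_0) from equation (10.163), assuming present-day g_s0 = 2."""
    return (g_s0 / g_s(x)) ** (1 / 3)

for x in (0.0, 0.3, 1.0, 3.0, 10.0, 30.0):
    print(f"m_e/T = {x:5.1f}   g_s = {g_s(x):.4f}   aT/(a0 T0) = {a_times_T(x):.4f}")
```

At $x = 0$ the sketch returns $g_s = 11/2$ and $aT/(a_0 T_0) = (4/11)^{1/3}$, while at large $x$ it returns $g_s \to 2$, as expected once electrons and positrons have annihilated.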

An expression for the chemical potential $\mu_e$ is needed to calculate the number densities of electrons and positrons through electron-positron annihilation. The chemical potential can be deduced from conservation of the comoving difference $a^3 (n_e - n_{\bar e})$ in the number densities of electrons and positrons.

The approximation (10.159a), coupled with the thermodynamic equilibrium condition $\bar\mu = -\mu$, implies that the difference $n - \bar n$ between the number densities of particles and antiparticles in thermodynamic equilibrium approximates

$$n - \bar n \approx \frac{g T^3}{\pi^2 \hbar^3} \sinh\left( \frac{\mu}{T} \right) e^{-m/T} \, c_0' \left( 1 + c_1' \frac{m}{T} \right)^{3/2} . \tag{10.166}$$

In the approximation (10.159a), the constants $c_0'$ and $c_1'$ in equation (10.166) are the same as the constants $c_0$ and $c_1$ given by equations (10.158); but a more accurate approximation for the difference $n - \bar n$, equation (10.166), uses instead the constants $c_0'$ and $c_1'$ defined by, for respectively fermions and bosons,

$$c_0' = \left\{ \frac{3}{4} , 1 \right\} 2 \zeta(3) \approx \{ 1.803 , 2.404 \} , \quad c_1' = \left( \frac{\pi}{2 c_0'^2} \right)^{1/3} = \{ 0.785 , 0.648 \} , \tag{10.167}$$
0 


with $\zeta(3) \approx 1.202$ the Riemann zeta function. The approximation (10.166) with constants given by equations (10.167) is asymptotically correct at both high and low temperatures, and is accurate to better than 5% at intermediate temperatures. Putting together equations (10.166), (10.163), and (10.165) yields

$$\left. \frac{n_e}{n_\gamma} \right|_0 = \frac{g_{s,0} (n_e - n_{\bar e})}{g_s n_\gamma} = \frac{g_{s,0}}{g_s} \, 2 \sinh\left( \frac{\mu_e}{T} \right) e^{-m_e/T} \, 0.833 \left( 1 + 0.785 \frac{m_e}{T} \right)^{3/2} , \tag{10.168}$$

where $0.833 = \frac{3}{4} \, 2\zeta(3) / (\pi^4/45)$ and $0.785$ are relevant constants from equations (10.158) and (10.167). Equation (10.168) can be solved for $\mu_e/T$ in terms of temperature $T$, given the present-day value of the electron-to-photon ratio $n_e/n_\gamma|_0$ from equation (10.156), the effective number of degrees of freedom $g_s$ from equation (10.165), and its present-day value $g_{s,0} = 2$.

With $\mu_e/T$ determined from equation (10.168), comoving number densities $a^3 n$ in terms of temperature $T$ follow from equation (10.159a), along with equations (10.163) and (10.165).
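
The procedure just described can be sketched numerically as follows. This is an illustrative outline under stated assumptions, not the author's code: the function names are invented, densities are expressed per photon using the approximation (10.159a) for photons as well, and equation (10.168) is taken with the factor $2 \sinh(\mu_e/T)$ that follows from equation (10.166).

```python
import math

ZETA3 = 1.2020569031595943                    # Riemann zeta(3)
C0_BOSON = math.pi ** 4 / 45                  # boson c_0, equation (10.158)
C0 = 7 / 8 * C0_BOSON                         # fermion c_0, equation (10.158)
C1 = (math.pi / (2 * C0 ** 2)) ** (1 / 3)     # fermion c_1, equation (10.158)
C0P = 3 / 4 * 2 * ZETA3                       # fermion c_0', equation (10.167)
C1P = (math.pi / (2 * C0P ** 2)) ** (1 / 3)   # fermion c_1', equation (10.167)
RATIO_TODAY = 5.4e-10                         # n_e/n_gamma today, equation (10.156)
G_S0 = 2.0                                    # present-day g_s (photons only)

def g_s(x):
    """Equation (10.165), with x = m_e/T and mu_e = 0."""
    q = (3 + 1.5 * C1 * x) / (1 + C1 * x)
    return 2 + 7 / 8 * (1 + q + x) * math.exp(-x) * (1 + C1 * x) ** 1.5

def mu_over_T(x):
    """Solve equation (10.168) for mu_e/T at x = m_e/T."""
    shape = 2 * (C0P / C0_BOSON) * math.exp(-x) * (1 + C1P * x) ** 1.5
    return math.asinh(RATIO_TODAY * g_s(x) / (G_S0 * shape))

def densities_per_photon(x):
    """Return (n_e/n_gamma, n_ebar/n_gamma) from equation (10.159a)."""
    y = mu_over_T(x)
    common = (C0 / C0_BOSON) * (1 + C1 * x) ** 1.5
    return common * math.exp(y - x), common * math.exp(-y - x)

def comoving_photons(x):
    """a^3 n_gamma relative to its present-day value, from (10.162) and (10.163)."""
    return G_S0 / g_s(x)

for x in (0.1, 1.0, 10.0, 20.0, 40.0):
    n_e, n_ebar = densities_per_photon(x)
    scale = comoving_photons(x)
    print(f"m_e/T = {x:5.1f}  mu_e/T = {mu_over_T(x):10.3e}  "
          f"a^3 n_e = {n_e * scale:10.3e}  a^3 n_ebar = {n_ebar * scale:10.3e}")
```

The comoving densities printed here are normalized to the present-day comoving photon density, so the electron value tends to the ratio (10.156) at low temperature while the positron value falls towards zero.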


10.28 Maximally symmetric spaces 


By construction, the FLRW metric is spatially homogeneous and isotropic, which means it has maximal 
spatial symmetry. A special subclass of FLRW metrics is in addition stationary, satisfying time translation 
invariance. As you will show in Exercise 10.22, stationary FLRW metrics may have curvature and a cos- 
mological constant, but no other source. You will also show that a coordinate transformation brings such 
FLRW metrics to the explicitly stationary form 

$$ds^2 = - \left( 1 - \tfrac{1}{3} \Lambda r_s^2 \right) dt_s^2 + \frac{dr_s^2}{1 - \tfrac{1}{3} \Lambda r_s^2} + r_s^2 \, do^2 , \tag{10.169}$$

where the time $t_s$ and radius $r_s$ are subscripted s for stationary to distinguish them from FLRW time $t$ and radius $r$.

Spacetimes that are homogeneous, isotropic, and stationary, and are therefore described by the metric (10.169), are called maximally symmetric. A maximally symmetric space with a positive cosmological constant, $\Lambda > 0$, is called de Sitter (dS) space, while that with a negative cosmological constant, $\Lambda < 0$, is called anti de Sitter (AdS) space. The maximally symmetric space with zero cosmological constant is just Minkowski space. Thanks to their high degree of symmetry, de Sitter and anti de Sitter spaces play a prominent role in theoretical studies of quantum gravity.

De Sitter space has a horizon at radius $r_H = \sqrt{3/\Lambda}$. Whereas inside the horizon the time coordinate $t_s$ is timelike and the radial coordinate $r_s$ is spacelike, outside the horizon the time coordinate $t_s$ is spacelike and the radial coordinate $r_s$ is timelike.

The Riemann tensor, Ricci tensor, Ricci scalar, and Einstein tensor of maximally symmetric spaces are

$$R_{\kappa\lambda\mu\nu} = \tfrac{1}{3} \Lambda \left( g_{\kappa\mu} g_{\lambda\nu} - g_{\kappa\nu} g_{\lambda\mu} \right) , \quad R_{\kappa\mu} = \Lambda g_{\kappa\mu} , \quad R = 4 \Lambda , \quad G_{\kappa\mu} = - \Lambda g_{\kappa\mu} . \tag{10.170}$$
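
The last three expressions follow from the first by contraction. As a short worked check, assuming the Ricci tensor is the contraction of the Riemann tensor on its second and fourth indices (for this highly symmetric form the other natural contraction gives the same result), and using $g^{\lambda\nu} g_{\lambda\nu} = 4$ in four dimensions:

```latex
\begin{align*}
R_{\kappa\mu} &= g^{\lambda\nu} R_{\kappa\lambda\mu\nu}
  = \tfrac{1}{3} \Lambda \left( 4 \, g_{\kappa\mu} - g_{\kappa\mu} \right)
  = \Lambda \, g_{\kappa\mu} , \\
R &= g^{\kappa\mu} R_{\kappa\mu} = 4 \Lambda , \\
G_{\kappa\mu} &= R_{\kappa\mu} - \tfrac{1}{2} g_{\kappa\mu} R
  = \Lambda \, g_{\kappa\mu} - 2 \Lambda \, g_{\kappa\mu}
  = - \Lambda \, g_{\kappa\mu} .
\end{align*}
```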


10.28.1 de Sitter spacetime as a closed FLRW spacetime 


Just as it was possible to conceive the spatial part of the FLRW geometry as a 3D hypersphere embedded in 
4D Euclidean space, §10.6, so also it is possible to conceive a maximally symmetric space as a 4D hyperboloid 
embedded in 5D space. 

For de Sitter space, the parent 5D space is a Minkowski space with metric $ds^2 = - du^2 + dx^2 + dy^2 + dz^2 + dw^2$, and the embedded 4D hyperboloid is a set of points

$$- u^2 + x^2 + y^2 + z^2 + w^2 = r_H^2 = \text{constant} , \tag{10.171}$$

with $r_H$ the horizon radius

$$r_H \equiv \sqrt{\frac{3}{\Lambda}} . \tag{10.172}$$

Figure 10.18 Embedding spacetime diagram of de Sitter space, shown on the left in 3D, on the right in a 2D projection on to the $u$-$w$ plane. Objects are confined to the surface of the embedded hyperboloid. The vertical direction is timelike, while the horizontal directions are spacelike. The position of a non-accelerating observer defines a “north pole” at $r = 0$ and $w > 0$, traced by the (black) line at the right edge of each diagram. Antipodal to the north pole is a “south pole” at $r = 0$ and $w < 0$, traced by the (black) line at the left edge of each diagram. The (reddish) skewed circles on the 3D diagram, which project to straight lines in the 2D diagram, are lines of constant stationary time $t_s$, labelled with their value in units of the horizon radius $r_H$, as measured by the observer at rest at the north pole $r = 0$. Lines of constant stationary time $t_s$ transform into each other under a Lorentz boost in the $u$-$w$ plane. The 45° dashed lines are null lines constituting the past and future horizons of the north pole observer (or of the south pole observer). The 2D diagram on the right shows in addition a sample of (bluish) timelike geodesics that pass through $u = w = 0$ (these lines are omitted from the 3D diagram).

The de Sitter hyperboloid is illustrated in Figure 10.18. Let $r \equiv (x^2 + y^2 + z^2)^{1/2}$, and introduce the boost angle $\psi$ and rotation angle $\chi$ defined by

$$u = r_H \sinh\psi , \tag{10.173a}$$

$$r = r_H \cosh\psi \, \sin\chi , \tag{10.173b}$$

$$w = r_H \cosh\psi \, \cos\chi . \tag{10.173c}$$


The radius $r$ defined by equation (10.173b) is the same as the radius $r_s$ in the stationary metric (10.169). In terms of the angles $\psi$ and $\chi$, the metric on the de Sitter 4D hyperboloid is

$$ds^2 = r_H^2 \left[ - d\psi^2 + \cosh^2\!\psi \left( d\chi^2 + \sin^2\!\chi \, do^2 \right) \right] . \tag{10.174}$$

The metric (10.174) is of FLRW form (10.28) with $t = r_H \psi$ and $a(t) = r_H \cosh\psi$ and a closed spatial geometry. De Sitter space thus describes a spatially closed FLRW universe that contracts, reaches a minimum size at $t = 0$, then reexpands. Comoving observers, those with $\chi$ = constant and fixed angular position, move vertically upward on the embedded hyperboloid in Figure 10.18.

The spatial position at $r = 0$ and $w > 0$ defines a “north pole” of de Sitter space. Antipodal to the north pole is a “south pole” at $r = 0$ and $w < 0$. The surface $u = w$ is a future horizon for an observer at the north pole, and a past horizon for an observer at the south pole. Similarly the surface $u = -w$ is a past horizon for an observer at the north pole, and a future horizon for an observer at the south pole. The causal diamond of any observer is the region of spacetime bounded by the observer’s past and future horizons. The north polar observer’s causal diamond is the region $w > |u|$, while the south polar observer’s causal diamond is the region $w < -|u|$.

The radial coordinate $r$ is spacelike within the causal diamonds of either the north or south polar observers, where $|w| > |u|$, but timelike outside those causal diamonds, where $|w| < |u|$.

The de Sitter hyperboloid possesses a symmetry under Lorentz boosts in the $u$-$w$ plane. The time $t_s$ in the stationary metric (10.169) is, modulo a factor of $r_H$, the boost angle of this Lorentz boost, which is

$$t_s = \begin{cases} r_H \, \mathrm{atanh}(u/w) & |w| > |u| , \\ r_H \, \mathrm{atanh}(w/u) & |w| < |u| . \end{cases} \tag{10.175}$$

The stationary time coordinate $t_s$ is timelike inside the causal diamonds of either the north or south pole observers, $|w| > |u|$, but spacelike outside those causal diamonds, $|w| < |u|$.
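
The closed coordinatization can be checked numerically. The minimal sketch below (function names and the sample point are illustrative assumptions) maps $(\psi, \chi)$ to the embedding coordinates via equations (10.173), confirms that the point lies on the hyperboloid (10.171), and then verifies that inside the causal diamond the stationary coordinates $t_s$ and $r_s = r$ recover $u$ and $w$ through a boost of angle $t_s/r_H$.

```python
import math

R_H = 1.0   # horizon radius r_H, in arbitrary units (assumed value)

def closed_embedding(psi, chi, r_h=R_H):
    """Equations (10.173): closed FLRW angles (psi, chi) -> (u, r, w)."""
    u = r_h * math.sinh(psi)
    r = r_h * math.cosh(psi) * math.sin(chi)
    w = r_h * math.cosh(psi) * math.cos(chi)
    return u, r, w

def stationary_time(u, w, r_h=R_H):
    """Equation (10.175); undefined on the horizons |w| = |u|."""
    if abs(w) > abs(u):
        return r_h * math.atanh(u / w)
    return r_h * math.atanh(w / u)

u, r, w = closed_embedding(0.3, 0.4)      # a point inside the north causal diamond
print(-u ** 2 + r ** 2 + w ** 2)          # should be close to r_H^2

t_s = stationary_time(u, w)
rho = math.sqrt(R_H ** 2 - r ** 2)        # since w^2 - u^2 = r_H^2 - r^2
print(u - rho * math.sinh(t_s / R_H), w - rho * math.cosh(t_s / R_H))  # both near 0
```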


10.28.2 de Sitter spacetime as an open FLRW spacetime 


An alternative coordinatization of the same embedded hyperboloid (10.171) for de Sitter space yields a metric in FLRW form but with an open spatial geometry. Let $r \equiv (x^2 + y^2 + z^2)^{1/2}$ as before, and define $\psi$ and $\chi$ by

$$u = r_H \sinh\psi \, \cosh\chi , \tag{10.176a}$$

$$r = r_H \sinh\psi \, \sinh\chi , \tag{10.176b}$$

$$w = r_H \cosh\psi . \tag{10.176c}$$

The $r$ defined by equation (10.176b) is not the same as the $r_s$ in the stationary metric (10.169); rather, it is $w$ that equals $r_s$. In terms of the angles $\psi$ and $\chi$ defined by equations (10.176), the metric on the de Sitter 4D hyperboloid is

$$ds^2 = r_H^2 \left[ - d\psi^2 + \sinh^2\!\psi \left( d\chi^2 + \sinh^2\!\chi \, do^2 \right) \right] . \tag{10.177}$$


The metric (10.177) is in FLRW form (10.28) with $t = r_H \psi$ and $a(t) = r_H \sinh\psi$ and an open spatial geometry. Whereas the coordinates $\{\psi, \chi\}$, equation (10.173), for de Sitter with closed spatial geometry cover the entire embedded hyperboloid shown in Figure 10.18, the coordinates $\{\psi, \chi\}$, equation (10.176), for de Sitter with open spatial geometry cover only the region of the hyperboloid with $|u| > |r|$ and $w > r_H$. The region of positive cosmic scale factor, $\psi > 0$, corresponds to $u > 0$. Conceptually, for de Sitter with open spatial geometry, there is a Big Bang at $\{u, r, w\} = \{0, 0, 1\} r_H$, comoving observers from which fill the region $u > |r|$ and $w > r_H$. Comoving observers, those with $\chi$ = constant, follow straight lines in the $u$-$r$ plane, bounded by the null cone at $u = |r|$.

Figure 10.19 Penrose diagram of de Sitter space. The left and right edges are identified. The topology is that of a 3-sphere in the horizontal (spatial) direction times the real line in the vertical (time) direction. The thick (pink) null lines are past and future horizons for observers who follow (vertical) geodesics at the “north” and “south” poles at $r = 0$, marked N and S. The approximately horizontal and vertical contours are contours of constant stationary time $t_s$ and radius $r_s$ in the stationary form (10.169) of the de Sitter metric. The contours are uniformly spaced by 0.4 in $t_s/r_H$ and the tortoise coordinate $r_s^*/r_H$, equation (10.180). The stationary coordinates $t_s$ and $r_s$ are respectively timelike (vertical) and spacelike (horizontal) inside the causal diamonds of the north and south pole observers, but switch to being respectively spacelike and timelike outside the causal diamonds, in the lower and upper wedges. The lower and upper wedges correspond to the open FLRW version (10.177) of the de Sitter metric. In the lower wedges, comoving observers collapse to a Big Crunch where their future horizons converge, while in the upper wedges, comoving observers expand away from a Big Bang from which their past horizons diverge.

In the open FLRW metric (10.177) for de Sitter space, the coordinates $t_s$ and $r_s$ of the stationary metric (10.169) are respectively spacelike and timelike. Lines of constant stationary time $t_s$, equation (10.175), coincide with geodesics of comoving observers, at constant $\chi$, while lines of constant stationary radius $r_s = w$, equation (10.176c), coincide with lines of constant FLRW time $\psi$,

$$t_s / r_H = \chi , \tag{10.178a}$$

$$r_s / r_H = w / r_H = \cosh\psi . \tag{10.178b}$$
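
The open coordinatization can be explored the same way. The sketch below (names and sample values are illustrative) evaluates equations (10.176), checks the hyperboloid constraint (10.171), confirms that the covered region has $u > |r|$ and $w > r_H$ for $\psi > 0$, and shows that along a comoving worldline of fixed $\chi$ the ratio $r/u = \tanh\chi$ is constant, which is the statement that comoving observers follow straight lines in the $u$-$r$ plane.

```python
import math

def open_embedding(psi, chi, r_h=1.0):
    """Equations (10.176): open FLRW angles (psi, chi) -> (u, r, w)."""
    u = r_h * math.sinh(psi) * math.cosh(chi)
    r = r_h * math.sinh(psi) * math.sinh(chi)
    w = r_h * math.cosh(psi)
    return u, r, w

chi = 0.7                                 # one comoving observer (assumed sample value)
samples = []
for psi in (0.5, 1.0, 2.0):               # increasing FLRW time t = r_H psi
    u, r, w = open_embedding(psi, chi)
    constraint = -u ** 2 + r ** 2 + w ** 2    # should equal r_H^2 = 1
    samples.append((u, r, w, constraint))
    print(f"psi = {psi:.1f}  constraint = {constraint:.12f}  "
          f"r/u = {r / u:.6f}  w/r_H = {w:.6f}")
print("tanh(chi) =", round(math.tanh(chi), 6))
```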


10.28.3 Penrose diagram of de Sitter space 


Figure 10.19 shows a Penrose diagram of de Sitter space. A natural choice of Penrose coordinates comes from requiring that vertical lines on the embedded de Sitter hyperboloid 10.18 become vertical lines on the Penrose diagram. These vertical lines are geodesics for comoving observers, lines of constant $\chi$, in the closed FLRW form (10.174) of the de Sitter metric. The corresponding Penrose time coordinate $t_P$ follows from solving for the radial null geodesics of the metric (10.174), whence $t_P = \int d\psi / \cosh\psi$. The resulting Penrose coordinates for de Sitter space are

$$t_P = \mathrm{atan}(\sinh\psi) = \mathrm{atan}(u / r_H) , \tag{10.179a}$$

$$r_P = \chi = \mathrm{atan}(r / w) . \tag{10.179b}$$


The radial coordinate r in both the closed and open FLRW forms (10.174) and (10.177) of the de Sitter 
metric was chosen so that a comoving observer at the origin was at r = 0, at either the north or the south 
pole. The Penrose diagram 10.19 depicts both closed and open FLRW geometries, but the open geometry is 
shifted by 90° to the equator, so that it appears to interleave with the closed geometry instead of overlapping 
it. The thick (pink) null lines at 45° outline the causal diamonds of north and south polar observers in the 
closed FLRW geometry. The null lines also outline the causal wedges of equatorial observers in the open 
FLRW geometry. The lower wedges correspond to collapsing spacetimes that terminate in a Big Crunch 
where the null lines cross. The upper wedges correspond to expanding spacetimes that begin in a Big Bang 
where the null lines cross. Note that the causal diamonds of any non-accelerating observer are spherically 
symmetric about the observer. Thus the causal diamonds of the closed and open observers touch only along 
one-dimensional lines, not along three-dimensional hypersurfaces as the Penrose diagram might suggest. The 
causal diamonds of observers in de Sitter and anti de Sitter spacetimes are different for different observers, 
and there is no reason to expect that the spacetime could be tiled fully by the causal diamonds of some set 
of observers. 

There is no physical singularity, no divergence of the Riemann tensor, at the Big Crunch and Big Bang 
points of the collapsing and expanding open FLRW forms of the de Sitter geometry. Does that mean that the 
collapsing de Sitter spacetime evolves smoothly into an expanding spacetime? As long as the spacetime is 
pure vacuum, there is no way to tell whether spacetime is expanding or collapsing. Only when the spacetime 
contains matter of some kind, as our Universe does, can a preferred set of comoving coordinates be defined. 
When matter is present, Big Crunches and Big Bangs are, setting aside quantum gravity, genuine singularities 
that cannot be removed by a coordinate transformation. 

The horizontal and vertical contours in the Penrose diagram 10.19 are contours of constant stationary time $t_s$ and radius $r_s$. Translation in $t_s$ is a symmetry of de Sitter spacetime, and to exhibit this symmetry, the contours of $t_s$ in the Penrose diagram are chosen to be uniformly spaced. A similarly symmetric appearance for the radial coordinate is achieved by choosing contours of $r_s$ to be uniformly spaced in the tortoise coordinate $r_s^*$,

$$r_s^* \equiv \int \frac{dr_s}{1 - r_s^2 / r_H^2} = r_H \, \mathrm{atanh}(r_s / r_H) . \tag{10.180}$$

The contours in the Penrose diagram 10.19 are uniformly spaced by 0.4 in $t_s / r_H$ and $r_s^* / r_H$. In terms of the time and tortoise coordinates $t_s$ and $r_s^*$, the Penrose time and radial coordinates are

$$t_P \pm r_P = \mathrm{atan}\left[ \sinh\left( \frac{t_s \pm r_s^*}{r_H} \right) \right] . \tag{10.181}$$

Figure 10.20 Embedding spacetime diagram of anti de Sitter space, shown on the left in 3D, on the right in a 2D projection on to the $v$-$r$ plane. The vertical direction winding around the hyperboloid is timelike, while the horizontal direction is spacelike. The position of a non-accelerating observer defines a spatial “pole” at $r = 0$. In the 3D diagram, the (red) horizontal line is an example line of constant stationary time $t_s$ for the observer at the pole. Lines of constant stationary time $t_s$ transform into each other under a rotation in the $u$-$v$ plane. The (bluish) lines at less than 45° from vertical are a sample of geodesics that pass through the pole at $r = 0$ at time $\psi = 0$. In anti de Sitter space, all timelike geodesics that pass through a spatial point boomerang back to the spatial point in a proper time $\pi r_H$. The 2D diagram on the right shows in addition (reddish) lines of constant stationary time $t_s$ for observers on the various geodesics.
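
The relation (10.181) between the de Sitter Penrose coordinates and the stationary coordinates can be confirmed numerically against the definitions (10.179). The following is a minimal sketch in units $r_H = 1$, valid only inside a causal diamond where $r_s < r_H$; the function names and the sample point are illustrative assumptions.

```python
import math

def penrose_from_closed(psi, chi):
    """Equations (10.179): Penrose (t_P, r_P) from closed FLRW angles."""
    return math.atan(math.sinh(psi)), chi

def penrose_from_stationary(t_s, r_s, r_h=1.0):
    """Equations (10.180) and (10.181), inside the causal diamond r_s < r_H."""
    r_star = r_h * math.atanh(r_s / r_h)                      # tortoise coordinate
    plus = math.atan(math.sinh((t_s + r_star) / r_h))         # t_P + r_P
    minus = math.atan(math.sinh((t_s - r_star) / r_h))        # t_P - r_P
    return 0.5 * (plus + minus), 0.5 * (plus - minus)

psi, chi = 0.3, 0.4
# Embedding coordinates (10.173) with r_H = 1, then stationary time (10.175).
u = math.sinh(psi)
r = math.cosh(psi) * math.sin(chi)
w = math.cosh(psi) * math.cos(chi)
t_s = math.atanh(u / w)

direct = penrose_from_closed(psi, chi)
via_stationary = penrose_from_stationary(t_s, r)
print(direct)
print(via_stationary)
```

The two printed pairs should agree to rounding error, reflecting the identity $\sin[\mathrm{atan}(\sinh x)] = \tanh x$ that underlies equation (10.181).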


10.28.4 Anti de Sitter space 


For anti de Sitter space, the parent 5D space is a Minkowski space with signature $--+++$, metric $ds^2 = - du^2 - dv^2 + dx^2 + dy^2 + dz^2$, and the embedded 4D hyperboloid is a set of points

$$- u^2 - v^2 + x^2 + y^2 + z^2 = - r_H^2 = \text{constant} , \tag{10.182}$$

with $r_H \equiv \sqrt{-3/\Lambda}$. The anti de Sitter hyperboloid is illustrated in Figure 10.20. Let $r \equiv (x^2 + y^2 + z^2)^{1/2}$, and introduce the boost angle $\chi$ and rotation angle $\psi$ defined by

$$u = r_H \cosh\chi \, \cos\psi , \tag{10.183a}$$

$$v = r_H \cosh\chi \, \sin\psi , \tag{10.183b}$$

$$r = r_H \sinh\chi . \tag{10.183c}$$

The time coordinate $\psi$ defined by equations (10.183) appears to be periodic, with period $2\pi$, but this is an artefact of the embedding. In a causal spacetime, the time coordinate would not loop back on itself. Rather, the coordinate $\psi$ can be taken to increase monotonically as it loops around the hyperboloid, extending from $-\infty$ to $\infty$. In terms of the angles $\psi$ and $\chi$, the metric on the anti de Sitter 4D hyperboloid is

$$ds^2 = r_H^2 \left( - \cosh^2\!\chi \, d\psi^2 + d\chi^2 + \sinh^2\!\chi \, do^2 \right) . \tag{10.184}$$

The metric (10.184) is of stationary form (10.169) with $t_s = r_H \psi$ and $r_s = r_H \sinh\chi$.


10.28.5 Anti de Sitter spacetime as an open FLRW spacetime 


An alternative coordinatization of the same embedded hyperboloid 10.20 for anti de Sitter space,

$$u = r_H \cos\psi , \tag{10.185a}$$

$$v = r_H \sin\psi \, \cosh\chi , \tag{10.185b}$$

$$r = r_H \sin\psi \, \sinh\chi , \tag{10.185c}$$

yields a metric in FLRW form with an open spatial geometry,

$$ds^2 = r_H^2 \left[ - d\psi^2 + \sin^2\!\psi \left( d\chi^2 + \sinh^2\!\chi \, do^2 \right) \right] . \tag{10.186}$$

Whereas the coordinates (10.183) cover all of the anti de Sitter hyperboloid 10.20, the open coordinates (10.185) cover only the regions with $|u| < r_H$. These are the upper and lower diamonds bounded by the (pink) dashed null lines in the hyperboloid 10.20. In each diamond, the open spacetime undergoes a Big Bang at the earliest vertex of the diamond, expands to a maximum size, turns around, and collapses to a Big Crunch at the latest vertex of the diamond.
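
Both anti de Sitter coordinatizations can be checked against the embedding constraint. The sketch below (function names and sample values are illustrative assumptions) evaluates the stationary coordinates (10.183) and the open coordinates (10.185), and confirms that each satisfies $-u^2 - v^2 + r^2 = -r_H^2$, the value implied by the parametrizations themselves. It also confirms that the open coordinates reach only $|u| \le r_H$.

```python
import math

def ads_stationary(psi, chi, r_h=1.0):
    """Equations (10.183): returns (u, v, r)."""
    return (r_h * math.cosh(chi) * math.cos(psi),
            r_h * math.cosh(chi) * math.sin(psi),
            r_h * math.sinh(chi))

def ads_open(psi, chi, r_h=1.0):
    """Equations (10.185): returns (u, v, r)."""
    return (r_h * math.cos(psi),
            r_h * math.sin(psi) * math.cosh(chi),
            r_h * math.sin(psi) * math.sinh(chi))

def constraint(u, v, r):
    """Left-hand side of the hyperboloid equation (10.182), with r^2 = x^2 + y^2 + z^2."""
    return -u * u - v * v + r * r

r_h = 2.0
for name, embed in (("stationary", ads_stationary), ("open", ads_open)):
    u, v, r = embed(1.1, 0.8, r_h)
    print(name, constraint(u, v, r))       # both should be close to -r_H^2 = -4

u_open = ads_open(1.1, 0.8, r_h)[0]
print(abs(u_open) <= r_h)                  # open coordinates stay within |u| <= r_H
```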


10.28.6 Anti de Sitter spacetime as a Rindler space 


Anti de Sitter spacetime possesses symmetry under Lorentz boosts in any time-space plane, such as the $v$-$x$ plane. In the open FLRW form (10.186) of anti de Sitter geometry, such boosts transform geodesics of comoving observers into each other. Outside the open causal diamonds on the other hand, these boosts generate the worldlines of a certain set of “Rindler” observers who accelerate with constant acceleration in the $v$-$x$ plane. Rindler time and space coordinates $\{\chi, \psi\}$ are defined by

$$u = r_H \cosh\psi , \tag{10.187a}$$

$$v = r_H \sinh\psi \, \sinh\chi , \tag{10.187b}$$

$$x = r_H \sinh\psi \, \cosh\chi , \tag{10.187c}$$

yielding the AdS Rindler metric

$$ds^2 = r_H^2 \left( - \sinh^2\!\psi \, d\chi^2 + d\psi^2 \right) + dy^2 + dz^2 . \tag{10.188}$$


10.28.7 Penrose diagram of anti de Sitter space 


Figure 10.21 shows a Penrose diagram of anti de Sitter space. A natural choice of Penrose coordinates comes from requiring that horizontal lines on the embedded anti de Sitter hyperboloid 10.20 become horizontal lines on the Penrose diagram. These horizontal lines are lines of constant stationary time $t_s = r_H \psi$ in the stationary form (10.184) of the anti de Sitter metric. The corresponding Penrose radial coordinate $r_P$ follows from solving for the radial null geodesics of the metric (10.184), whence $r_P = \int d\chi / \cosh\chi$. The resulting Penrose coordinates for anti de Sitter space are

$$t_P = \psi = t_s / r_H , \tag{10.189a}$$

$$r_P = \mathrm{atan}(\sinh\chi) = r_s^* / r_H , \tag{10.189b}$$

where $r_s^*$ is the tortoise radial coordinate

$$r_s^* \equiv \int \frac{dr_s}{1 + r_s^2 / r_H^2} = r_H \, \mathrm{atan}(r_s / r_H) . \tag{10.190}$$

Thus, for anti de Sitter, lines of constant time $t_s$ and radius $r_s$ in the stationary metric (10.169) correspond also to lines of constant Penrose time and radius $t_P$ and $r_P$.

Figure 10.21 Penrose diagram of anti de Sitter space. The diagram repeats vertically indefinitely. The topology is that of Euclidean 3-space in the horizontal (spatial) direction times the real line in the vertical (time) direction. The thick (pink) null lines outline the causal diamonds of observers in the open FLRW form (10.186) of the anti de Sitter spacetime. The spacetime of the open FLRW geometry expands from a Big Bang at a crossing point of the null lines, and collapses to a Big Crunch at the next crossing point. The thick null lines also outline the causal wedges of Rindler observers, equation (10.188), at the left and right edges of the diagram. The approximately horizontal and vertical contours are lines of constant $\psi$ and $\chi$, uniformly spaced by 0.4 in $\chi$ and $\psi^*$, equation (10.191), both in the open FLRW diamonds and in the left and right Rindler wedges, equations (10.186) and (10.188). The coordinates $\psi$ and $\chi$ are respectively timelike (vertical) and spacelike (horizontal) in the open diamonds, and respectively spacelike and timelike in the Rindler wedges.




The thick (pink) null lines in the Penrose diagram 10.21 outline the causal diamonds of comoving observers 
in the open FLRW (10.186) form of the anti de Sitter metric. The null lines also outline the causal wedges 
of Rindler observers in the Rindler (10.188) form of the anti de Sitter metric. 

The approximately horizontal and vertical contours in the Penrose diagram 10.21 are lines of constant $\psi$ and $\chi$ in both the open FLRW (10.186) and Rindler (10.188) forms of the anti de Sitter metric. In the open FLRW causal diamonds, the horizontal contours are lines of constant cosmic time $\psi$, while the vertical contours are geodesics, lines of constant $\chi$. In the Rindler causal wedges, the horizontal contours are lines of constant boost angle $\chi$, while the vertical contours are worldlines of Rindler observers, lines of constant $\psi$.

Anti de Sitter space is symmetric under boosts in the $v$-$x$ plane, corresponding to translations of the coordinate $\chi$ in either of the open FLRW (10.186) or Rindler (10.188) forms of the anti de Sitter metric. The contours in the Penrose diagram 10.21 are uniformly spaced in $\chi$ by 0.4 so as to manifest this symmetry. A similarly symmetric appearance for the $\psi$ coordinate is achieved by choosing contours to be uniformly spaced by 0.4 in the tortoise coordinate $\psi^*$,

$$\psi^* \equiv \begin{cases} \displaystyle \int \frac{d\psi}{\sin\psi} = \ln\tan(\psi/2) & \text{open} , \\[2ex] \displaystyle \int \frac{d\psi}{\sinh\psi} = \ln\tanh(\psi/2) & \text{Rindler} . \end{cases} \tag{10.191}$$


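Exercise 10.22. Maximally symmetric spaces.
1. Argue that in a stationary spacetime, every scalar quantity must be independent of time. In particular, the Riemann scalar $R$, and the contracted Ricci product $R^{\mu\nu} R_{\mu\nu}$, must be independent of time. Conclude that the density $\rho$ and pressure $p$ of a stationary FLRW spacetime must be constant.
2. Conclude that a stationary FLRW spacetime may have curvature and a cosmological constant, but no other source. Show that the FLRW metric then takes the form (10.28) with cosmic scale factor

$$a(t) = \begin{cases}
H_0 t & \Omega_k = 1 , \ \Omega_\Lambda = 0 , \\
\exp(H_0 t) & \Omega_k = 0 , \ \Omega_\Lambda = 1 , \\
\sqrt{-\Omega_k/\Omega_\Lambda} \, \cosh\left( \sqrt{\Omega_\Lambda} H_0 t \right) & \Omega_k < 0 , \ \Omega_\Lambda > 0 , \\
\sqrt{\Omega_k/\Omega_\Lambda} \, \sinh\left( \sqrt{\Omega_\Lambda} H_0 t \right) & \Omega_k > 0 , \ \Omega_\Lambda > 0 , \\
\sqrt{-\Omega_k/\Omega_\Lambda} \, \sin\left( \sqrt{-\Omega_\Lambda} H_0 t \right) & \Omega_k > 0 , \ \Omega_\Lambda < 0 ,
\end{cases} \tag{10.192}$$

with

$$\Omega_\Lambda H_0^2 = \tfrac{1}{3} \Lambda , \quad \Omega_k H_0^2 = - \kappa . \tag{10.193}$$

As elsewhere in this chapter, $H = H_0$ at $a = 1$, and the $\Omega$’s sum to unity, $\Omega_k + \Omega_\Lambda = 1$.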
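3. Show that the FLRW metric transforms into the explicitly stationary form (10.169) under a coordinate transformation to proper radius $r_s = a(t) x$ and stationary time $t_s$ given by

$$t_s = \begin{cases}
\sqrt{1 - \kappa x^2} \; t & \Omega_k = 1 , \ \Omega_\Lambda = 0 , \\[1ex]
t - \dfrac{1}{2 H_0} \ln\left( 1 - H_0^2 r_s^2 \right) & \Omega_k = 0 , \ \Omega_\Lambda = 1 , \\[1ex]
\dfrac{1}{\sqrt{\Omega_\Lambda} H_0} \, \mathrm{acoth}\left[ \sqrt{1 - \kappa x^2} \, \coth\left( \sqrt{\Omega_\Lambda} H_0 t \right) \right] & \Omega_k < 0 , \ \Omega_\Lambda > 0 , \\[1ex]
\dfrac{1}{\sqrt{\Omega_\Lambda} H_0} \, \mathrm{atanh}\left[ \sqrt{1 - \kappa x^2} \, \tanh\left( \sqrt{\Omega_\Lambda} H_0 t \right) \right] & \Omega_k > 0 , \ \Omega_\Lambda > 0 , \\[1ex]
\dfrac{1}{\sqrt{-\Omega_\Lambda} H_0} \, \mathrm{atan}\left[ \sqrt{1 - \kappa x^2} \, \tan\left( \sqrt{-\Omega_\Lambda} H_0 t \right) \right] & \Omega_k > 0 , \ \Omega_\Lambda < 0 .
\end{cases} \tag{10.194}$$

Note that in all cases $t_s = t$ at the origin $r_s = 0$.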


Concept question 10.23. Milne Universe. In Exercise 10.22 you found that the FLRW metric for an open universe with zero energy-momentum content ($\Omega_k = 1$, $\Omega_\Lambda = 0$), also known as the Milne metric, is equivalent to flat Minkowski space. How can an open universe be equivalent to flat space? Draw a spacetime diagram of Minkowski space showing (a) worldlines of observers at constant comoving FLRW position $x$, and (b) hypersurfaces of constant FLRW time $t$.


Concept question 10.24. Stationary FLRW metrics with different curvature constants describe the same spacetime. How can it be that stationary FLRW metrics with different curvature constants $\kappa$ (but the same cosmological constant $\Lambda$) describe the same spacetime?


PART TWO 


TETRAD APPROACH TO GENERAL RELATIVITY 


Concept Questions


1. The vierbein has 16 degrees of freedom instead of the 10 degrees of freedom of the metric. What do the extra 6 degrees of freedom correspond to?

2. Tetrad transformations are defined to be Lorentz transformations. Don’t general coordinate transformations already include Lorentz transformations as a particular case, so aren’t tetrad transformations redundant?

3. What does coordinate gauge-invariant mean? What does tetrad gauge-invariant mean?

4. Is the coordinate metric $g_{\mu\nu}$ tetrad gauge-invariant?

5. What does a directed derivative $\partial_m$ mean physically?

6. Is the directed derivative $\partial_m$ coordinate gauge-invariant?

7. Is the tetrad metric $\gamma_{mn}$ coordinate gauge-invariant? Is it tetrad gauge-invariant?

8. What is the tetrad-frame 4-velocity $u^m$ of a person at rest in an orthonormal tetrad frame?

9. If the tetrad frame is accelerating (not in free-fall), which of the following is true/false?

   a. Does the tetrad-frame 4-velocity $u^m$ of a person continuously at rest in the tetrad frame change with time? $\partial_0 u^m = 0$? $D_0 u^m = 0$?

   b. Do the tetrad axes $\boldsymbol{\gamma}_m$ change with time? $\partial_0 \boldsymbol{\gamma}_m = 0$? $D_0 \boldsymbol{\gamma}_m = 0$?

   c. Does the tetrad metric $\gamma_{mn}$ change with time? $\partial_0 \gamma_{mn} = 0$? $D_0 \gamma_{mn} = 0$?

   d. Do the covariant components $u_m$ of the 4-velocity of a person continuously at rest in the tetrad frame change with time? $\partial_0 u_m = 0$? $D_0 u_m = 0$?

10. Suppose that $\boldsymbol{p} = \boldsymbol{\gamma}_m p^m$ is a 4-vector. Is the proper rate of change of the proper components $p^m$ measured by an observer equal to the directed time derivative $\partial_0 p^m$ or to the covariant time derivative $D_0 p^m$? What about the covariant components $p_m$ of the 4-vector? [Hint: The proper contravariant components of the 4-vector measured by an observer are $p^m = \boldsymbol{\gamma}^m \cdot \boldsymbol{p}$ where $\boldsymbol{\gamma}^m$ are the contravariant locally inertial rest axes of the observer. Similarly the proper covariant components are $p_m = \boldsymbol{\gamma}_m \cdot \boldsymbol{p}$.]

11. A person with two eyes separated by proper distance $\delta\xi^n$ observes an object. The observer observes the photon 4-vector from the object to be $p^m$. The observer uses the difference $\delta p^m$ in the two 4-vectors detected by the two eyes to infer the binocular distance to the object. Is the difference $\delta p^m$ in photon 4-vectors detected by the two eyes equal to the directed derivative $\delta\xi^n \partial_n p^m$ or to the covariant derivative $\delta\xi^n D_n p^m$?

12. Suppose that $p^m$ is a tetrad 4-vector. Parallel-transport the 4-vector by an infinitesimal proper distance $\delta\xi^n$. Is the change in $p^m$ measured by an ensemble of observers at rest in the tetrad frame equal to the directed derivative $\delta\xi^n \partial_n p^m$ or to the covariant derivative $\delta\xi^n D_n p^m$? [Hint: What if “rest” means that the observer at each point is separately at rest in the tetrad frame at that point? What if “rest” means that the observers are mutually at rest relative to each other in the rest frame of the tetrad at one particular point?]

13. What is the physical significance of the fact that directed derivatives fail to commute?

14. Physically, what do the tetrad connection coefficients $\Gamma_{kmn}$ mean?

15. What is the physical significance of the fact that $\Gamma_{kmn}$ is antisymmetric in its first two indices (if the tetrad metric $\gamma_{mn}$ is constant)?

16. Are the tetrad connections $\Gamma_{kmn}$ coordinate gauge-invariant?


What’s important? 


This chapter describes the tetrad formalism of general relativity. 


1. Why tetrads? Because physics is clearer in a locally inertial frame than in a coordinate frame.

2. The primitive object in the tetrad formalism is the vierbein $e^m{}_\mu$, in place of the metric in the coordinate formalism.

3. Written suitably, for example as equation (11.9), a metric $ds^2$ encodes not only the metric coefficients $g_{\mu\nu}$, but a full vierbein $e^m{}_\mu$, through $ds^2 = \gamma_{mn} \, e^m{}_\mu dx^\mu \, e^n{}_\nu dx^\nu$.

4. The tetrad road from vierbein to energy-momentum is similar to the coordinate road from metric to energy-momentum, albeit a little more complicated.

5. In the tetrad formalism, the directed derivative $\partial_m$ is the analogue of the coordinate partial derivative $\partial/\partial x^\mu$ of the coordinate formalism. Directed derivatives $\partial_m$ do not commute, whereas coordinate derivatives $\partial/\partial x^\mu$ do commute.


11


The tetrad formalism 


11.1 Tetrad 


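A tetrad (Greek foursome) $\boldsymbol{\gamma}_m(x)$ is a set of axes

$$\boldsymbol{\gamma}_m \equiv \{ \boldsymbol{\gamma}_0 , \boldsymbol{\gamma}_1 , \boldsymbol{\gamma}_2 , \boldsymbol{\gamma}_3 \} \tag{11.1}$$

attached to each point $x^\mu$ of spacetime. The common case, illustrated in Figure 11.1, is that of an orthonormal tetrad, where the axes form a locally inertial frame at each point, so that the dot products of the axes constitute the Minkowski metric $\eta_{mn}$

$$\boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n = \eta_{mn} . \tag{11.2}$$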


However, other tetrads prove useful in appropriate circumstances. There are spin tetrads, null tetrads (notably the Newman-Penrose double null tetrad), and others (indeed, the basis of coordinate tangent vectors $\boldsymbol{e}_\mu$ is a tetrad). In general, the tetrad metric is some symmetric matrix $\gamma_{mn}$

$$\boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n \equiv \gamma_{mn} . \tag{11.3}$$

The convention in this book is that latin (black) indices label tetrad frames, while greek (brown) indices label coordinate frames.

Figure 11.1 Tetrad vectors $\boldsymbol{\gamma}_m$ form a basis of vectors at each point. A common choice, depicted here, is for the basis vectors $\boldsymbol{\gamma}_m$ to form an orthonormal set, meaning that their dot products constitute the Minkowski metric, $\boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n = \eta_{mn}$, at each point. The orthonormal frames at neighbouring points need not be aligned with each other by parallel transport, and indeed in curved spacetime it is impossible to choose orthonormal frames that are everywhere aligned.

Why introduce tetrads? 

1. The physics is more transparent when expressed in a locally inertial frame (or some other frame adapted
to the physics), as opposed to the coordinate frame, where Salvador Dali rules.

2. If you want to consider spin-$\tfrac{1}{2}$ particles and quantum physics, you better work with tetrads.

3. For good reason, much of the general relativistic literature works with tetrads, so it’s useful to understand 
them. 


11.2 Vierbein 


The vierbein (German four-legs, or colloquially, critter) $e^m{}_\mu$ is defined to be the matrix that transforms
between the tetrad frame and the coordinate frame (note the placement of indices: the tetrad index $m$ comes
first, then the coordinate index $\mu$)

$$ \boldsymbol{e}_\mu \equiv e^m{}_\mu \, \boldsymbol{\gamma}_m \ . \tag{11.4} $$

The letter e stems from the German word einheit for unity. The vierbein is a 4×4 matrix, with 16 independent
components. The inverse vierbein $e_m{}^\mu$ is defined to be the matrix inverse of the vierbein $e^m{}_\mu$, so that

$$ e_m{}^\mu \, e^m{}_\nu = \delta^\mu_\nu \ , \quad e_m{}^\mu \, e^n{}_\mu = \delta_m^n \ . \tag{11.5} $$

Thus equation (11.4) inverts to

$$ \boldsymbol{\gamma}_m = e_m{}^\mu \, \boldsymbol{e}_\mu \ . \tag{11.6} $$


11.3 The line-element encodes the vierbein 


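The scalar spacetime distance is

$$ ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu = \boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu \, dx^\mu dx^\nu = \gamma_{mn} \, e^m{}_\mu e^n{}_\nu \, dx^\mu dx^\nu \tag{11.7} $$

from which it follows that the coordinate metric $g_{\mu\nu}$ is

$$ g_{\mu\nu} = \gamma_{mn} \, e^m{}_\mu e^n{}_\nu \ . \tag{11.8} $$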


The shorthand way in which line-elements are commonly written encodes not only a metric but also a 
vierbein, hence a tetrad. For example, the Schwarzschild line-element 


$$ ds^2 = - \left( 1 - \frac{2M}{r} \right) dt^2 + \left( 1 - \frac{2M}{r} \right)^{-1} dr^2 + r^2 d\theta^2 + r^2 \sin^2\!\theta \, d\phi^2 \tag{11.9} $$

takes the form (11.7) with an orthonormal (Minkowski) tetrad metric $\gamma_{mn} = \eta_{mn}$, and a vierbein encoded
in the differentials (one-forms, §15.6)

$$ e^0{}_\mu \, dx^\mu = \left( 1 - \frac{2M}{r} \right)^{1/2} dt \ , \tag{11.10a} $$
$$ e^1{}_\mu \, dx^\mu = \left( 1 - \frac{2M}{r} \right)^{-1/2} dr \ , \tag{11.10b} $$
$$ e^2{}_\mu \, dx^\mu = r \, d\theta \ , \tag{11.10c} $$
$$ e^3{}_\mu \, dx^\mu = r \sin\theta \, d\phi \ . \tag{11.10d} $$

Explicitly, the vierbein of the Schwarzschild line-element is the diagonal matrix

$$ e^m{}_\mu = \begin{pmatrix} (1 - 2M/r)^{1/2} & 0 & 0 & 0 \\ 0 & (1 - 2M/r)^{-1/2} & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & r \sin\theta \end{pmatrix} \ , \tag{11.11} $$

and the corresponding inverse vierbein is (note that, because the tetrad index is always in the first place and
the coordinate index is always in the second place, the matrices as written are actually inverse transposes of
each other, not just inverses)

$$ e_m{}^\mu = \begin{pmatrix} (1 - 2M/r)^{-1/2} & 0 & 0 & 0 \\ 0 & (1 - 2M/r)^{1/2} & 0 & 0 \\ 0 & 0 & 1/r & 0 \\ 0 & 0 & 0 & 1/(r \sin\theta) \end{pmatrix} \ . \tag{11.12} $$
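
As a numerical illustration of equations (11.8), (11.11) and (11.12), the following sketch rebuilds the Schwarzschild metric coefficients from the diagonal vierbein at one sample point outside the horizon. The function names, the sample values of $M$, $r$, $\theta$, and the storage of diagonal matrices as plain lists are assumptions made for this illustration only.

```python
import math

# Diagonal of the Minkowski tetrad metric eta_mn, signature -+++.
ETA = [-1.0, 1.0, 1.0, 1.0]


def schwarzschild_vierbein(M, r, theta):
    """Diagonal entries of e^m_mu, equation (11.11), in coordinates (t, r, theta, phi)."""
    f = 1.0 - 2.0 * M / r  # positive only outside the horizon, r > 2M
    return [math.sqrt(f), 1.0 / math.sqrt(f), r, r * math.sin(theta)]


def metric_from_vierbein(e_diag):
    """g_mu_nu = eta_mn e^m_mu e^n_nu, equation (11.8), for a diagonal vierbein."""
    return [ETA[m] * e_diag[m] ** 2 for m in range(4)]


def inverse_vierbein(e_diag):
    """Diagonal entries of the inverse vierbein e_m^mu, equation (11.12)."""
    return [1.0 / e for e in e_diag]


M, r, theta = 1.0, 10.0, math.pi / 3  # hypothetical sample point
e = schwarzschild_vierbein(M, r, theta)
g = metric_from_vierbein(e)   # diagonal of g_mu_nu
e_inv = inverse_vierbein(e)
```

The list `g` should agree with the coefficients of $dt^2$, $dr^2$, $d\theta^2$, $d\phi^2$ read off from the line-element (11.9), and the elementwise product of `e` and `e_inv` should be unity, which is equation (11.5) for a diagonal vierbein.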


Concept question 11.1. Schwarzschild vierbein. The components $e^0{}_t$ and $e^1{}_r$ of the Schwarzschild
vierbein (11.11) are imaginary inside the horizon. What does this mean? Is the vierbein still valid inside the
horizon?


11.4 Tetrad transformations 


Tetrad transformations are transformations that preserve the fundamental property of interest, for example 
the orthonormality, of the tetrad. For most tetrads considered in this book, which includes not only orthonor- 
mal tetrads, but also spin tetrads and null tetrads (but not coordinate-based tetrads), tetrad transformations 
are Lorentz transformations. The Lorentz transformation may be, and usually is, a different transformation 
at each point. Tetrad transformations rotate the tetrad axes $\boldsymbol{\gamma}_k$ at each point by a Lorentz transformation
$L_k{}^m$, while keeping the background coordinates $x^\mu$ unchanged:

$$ \boldsymbol{\gamma}_k \rightarrow \boldsymbol{\gamma}'_k = L_k{}^m \, \boldsymbol{\gamma}_m \ . \tag{11.13} $$

In the case that the tetrad axes $\boldsymbol{\gamma}_k$ are orthonormal, with a Minkowski metric, the Lorentz transformation
matrices $L_k{}^m$ in equation (11.13) take the familiar special relativistic form, but the linear matrices $L_k{}^m$ in
equation (11.13) signify a Lorentz transformation in any case.

For orthonormal, spin, and null tetrads, the tetrad metric $\gamma_{mn}$ is constant. Lorentz transformations are
precisely those transformations that leave the tetrad metric unchanged

$$ \gamma'_{kl} = \boldsymbol{\gamma}'_k \cdot \boldsymbol{\gamma}'_l = L_k{}^m L_l{}^n \, \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n = L_k{}^m L_l{}^n \, \gamma_{mn} = \gamma_{kl} \ . \tag{11.14} $$


Exercise 11.2. Generators of Lorentz transformations are antisymmetric. From the condition that 
the tetrad metric $\gamma_{kl}$ is unchanged by a Lorentz transformation, show that the generator of an infinitesimal
Lorentz transformation is an antisymmetric matrix. Is this true only for an orthonormal tetrad, or is it true
more generally?

Solution. An infinitesimal Lorentz transformation is the sum of the unit matrix and an infinitesimal piece
$\Delta L_k{}^m$, the generator of the infinitesimal Lorentz transformation,

$$ L_k{}^m = \delta_k^m + \Delta L_k{}^m \ . \tag{11.15} $$

Under such an infinitesimal Lorentz transformation, the tetrad metric transforms to

$$ \gamma'_{kl} = ( \delta_k^m + \Delta L_k{}^m ) ( \delta_l^n + \Delta L_l{}^n ) \, \gamma_{mn} \approx \gamma_{kl} + \Delta L_{kl} + \Delta L_{lk} \ , \tag{11.16} $$

which by proposition equals the original tetrad metric $\gamma_{kl}$, equation (11.14). It follows that

$$ \Delta L_{kl} + \Delta L_{lk} = 0 \ , \tag{11.17} $$

that is, the generator $\Delta L_{kl}$ is antisymmetric, as claimed. The result is true whenever the tetrad metric is
invariant under Lorentz transformations.
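
The argument of Exercise 11.2 can be checked numerically for an orthonormal tetrad. The sketch below assumes a boost along the 1-axis with a chosen rapidity (both are illustrative choices, as are the function names), verifies the invariance (11.14) of the Minkowski metric, and forms the lowered generator $\Delta L_{kl} = \Delta L_k{}^m \gamma_{ml}$ of a small boost.

```python
import math

# Minkowski tetrad metric eta_mn, signature -+++.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost_x(rapidity):
    """Matrix L_k^m (row k, column m) of a Lorentz boost along the 1-axis."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transformed_metric(L, gamma):
    """gamma'_kl = L_k^m L_l^n gamma_mn, equation (11.14)."""
    return [[sum(L[k][m] * L[l][n] * gamma[m][n]
                 for m in range(4) for n in range(4))
             for l in range(4)] for k in range(4)]


def lowered_generator(L, gamma):
    """Delta L_kl = (L_k^m - delta_k^m) gamma_ml, the generator with both indices lowered."""
    return [[sum((L[k][m] - (1.0 if k == m else 0.0)) * gamma[m][l]
                 for m in range(4))
             for l in range(4)] for k in range(4)]


gamma_prime = transformed_metric(boost_x(0.7), ETA)  # finite boost
dL = lowered_generator(boost_x(1e-4), ETA)           # small boost
```

For the small boost, the off-diagonal entries `dL[0][1]` and `dL[1][0]` are equal and opposite, while the diagonal entries are of second order in the rapidity, which is the content of equation (11.17) to first order.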


11.5 Tetrad vectors and tensors 


Just as coordinate vectors (and tensors) were defined in §2.8 as objects that transformed like (tensor products 
of) coordinate intervals under coordinate transformations, so also tetrad vectors (and tensors) are defined 
as objects that transform like (tensor products of) tetrad vectors under tetrad (Lorentz) transformations. 


11.5.1 Covariant tetrad 4-vector 


A tetrad (Lorentz) transformation transforms the tetrad axes $\boldsymbol{\gamma}_k$ in accordance with equation (11.13). A
covariant tetrad 4-vector is defined to be a quantity $A_k = \{ A_0, A_1, A_2, A_3 \}$ that transforms under a tetrad
transformation like the tetrad axes,

$$ A_k \rightarrow A'_k = L_k{}^m A_m \ . \tag{11.18} $$
11.5.2 Lowering and raising tetrad indices


Just as the indices on a coordinate vector or tensor were lowered and raised with the coordinate metric $g_{\mu\nu}$
and its inverse $g^{\mu\nu}$, §2.8.3, so also indices on a tetrad vector or tensor are lowered and raised with the tetrad
metric $\gamma_{mn}$ and its inverse $\gamma^{mn}$, defined to satisfy

$$ \gamma_{lm} \gamma^{mn} = \delta_l^n \ . \tag{11.19} $$

In the tetrads considered in this book (Minkowski, spin, or Newman-Penrose tetrad), the components of the
tetrad metric and its inverse are numerically equal, $\gamma_{mn} = \gamma^{mn}$, but this need not be the case in general.

The contravariant (raised index) components $A^m$ and covariant (lowered index) components $A_m$ of a tetrad
vector are related by

$$ A^m = \gamma^{mn} A_n \ , \quad A_m = \gamma_{mn} A^n \ . \tag{11.20} $$

The dual tetrad basis vectors $\boldsymbol{\gamma}^m$ are defined by

$$ \boldsymbol{\gamma}^m \equiv \gamma^{mn} \boldsymbol{\gamma}_n \ . \tag{11.21} $$

By construction, dot products of the dual and tetrad basis vectors equal the unit matrix,

$$ \boldsymbol{\gamma}^m \cdot \boldsymbol{\gamma}_n = \delta^m_n \ , \tag{11.22} $$

while dot products of the dual basis vectors with each other equal the inverse tetrad metric,

$$ \boldsymbol{\gamma}^m \cdot \boldsymbol{\gamma}^n = \gamma^{mn} \ . \tag{11.23} $$


11.5.3 Contravariant tetrad vector 


A contravariant tetrad 4-vector $A^k$ transforms under a tetrad transformation as, analogously to
equation (11.18),

$$ A^k \rightarrow A'^k = L^k{}_m A^m \ , \tag{11.24} $$

where $L^k{}_m$ is the Lorentz transformation inverse to $L_k{}^m$. Equation (11.14) implies that Lorentz transformation
matrices with indices variously lowered and raised satisfy

$$ L_k{}^m L^k{}_n = \delta^m_n \ , \quad L_k{}^m L^l{}_m = \delta_k^l \ . \tag{11.25} $$


11.5.4 Abstract vector 


A 4-vector can be written in a coordinate- and tetrad-independent fashion as an abstract 4-vector $\boldsymbol{A}$,

$$ \boldsymbol{A} = \boldsymbol{\gamma}_m A^m = \boldsymbol{e}_\mu A^\mu \ . \tag{11.26} $$

Although $\boldsymbol{A}$ is a 4-vector, it is by construction unchanged by either a coordinate transformation or a tetrad
transformation, and is therefore, according to the naming convention adopted in this book, §11.6, both a
coordinate scalar and a tetrad scalar. The coordinate and tetrad components of the 4-vector $\boldsymbol{A}$ are related
by the vierbein,

$$ A_\mu = e^m{}_\mu A_m \ , \quad A_m = e_m{}^\mu A_\mu \ . \tag{11.27} $$


11.5.5 Scalar product 


The scalar product of two 4-vectors $\boldsymbol{A}$ and $\boldsymbol{B}$ may be written variously

$$ \boldsymbol{A} \cdot \boldsymbol{B} = A_m B^m = A_\mu B^\mu \ . \tag{11.28} $$


The scalar product is a scalar, unchanged by either a coordinate or tetrad transformation. 


11.5.6 Tetrad tensor 


In general, a tetrad-frame tensor $A^{kl...}_{mn...}$ is an object that transforms under tetrad (Lorentz)
transformations (11.13) as

$$ A'^{kl...}_{mn...} = L^k{}_a L^l{}_b \, ... \, L_m{}^c L_n{}^d \, ... \, A^{ab...}_{cd...} \ . \tag{11.29} $$


11.6 Index and naming conventions for vectors and tensors 


In the tetrad formalism tensors can be coordinate tensors, or tetrad tensors, or mixed coordinate-tetrad
tensors. For example, the vierbein $e^m{}_\mu$ is itself a mixed coordinate-tetrad tensor.

The convention in this book is to distinguish the various kinds of vector and tensor with an adjective, and
by its index:

1. A coordinate vector $A^\mu$, with a brown greek index, is one that changes in a prescribed way under
coordinate transformations. A coordinate transformation is one that changes the coordinates $x^\mu$ of the
spacetime without actually changing the spacetime or whatever lies in it. A coordinate vector $A^\mu$ does
not change under a tetrad transformation, and is therefore a tetrad scalar.

2. A tetrad vector $A^m$, with a black latin index, is one that changes in a prescribed way under tetrad
transformations. A tetrad transformation Lorentz transforms the tetrad axes $\boldsymbol{\gamma}_m$ at each point of the
spacetime without actually changing the spacetime or whatever lies in it. A tetrad vector $A^m$ does not
change under a coordinate transformation, and is therefore a coordinate scalar.

3. An abstract vector A, identified by boldface, is the thing itself, and is unchanged by either the choice 
of coordinates or the choice of tetrad. Since the abstract vector is unchanged by either a coordinate 
transformation or a tetrad transformation, it is a coordinate and tetrad scalar, and has no indices. 

All the types of vector have the properties of linearity (additivity, multiplication by scalars) that identify 
them mathematically as belonging to vector spaces. The important distinction between the types of vector 
is how they behave under transformations. 




Just because something has a coordinate or tetrad index does not make it a coordinate or tetrad tensor. If 
however an object is a coordinate and/or tetrad tensor, then its indices are lowered and raised as follows: 
1. Lower and raise coordinate indices with the coordinate metric $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$;

2. Lower and raise tetrad indices with the tetrad metric $\gamma_{mn}$ and its inverse $\gamma^{mn}$;

3. Switch between coordinate and tetrad frames with the vierbein $e^m{}_\mu$ and its inverse $e_m{}^\mu$.


11.7 Gauge transformations 


Gauge transformations are transformations of the coordinates or tetrad. Such transformations do not 
change the underlying spacetime. 

Quantities that are unchanged by a coordinate transformation are coordinate gauge-invariant (coor- 
dinate scalars). Quantities that are unchanged under a tetrad transformation are tetrad gauge-invariant 
(tetrad scalars). For example, tetrad tensors are coordinate gauge-invariant, while coordinate tensors are 
tetrad gauge-invariant. 

Tetrad transformations have the 6 degrees of freedom of Lorentz transformations, with 3 degrees of freedom 
in spatial rotations, and 3 more in Lorentz boosts. General coordinate transformations have 4 degrees of 
freedom. Thus there are 10 degrees of freedom in the choice of tetrad and coordinate system. The 16 degrees 
of freedom of the vierbein, minus the 10 degrees of freedom from the transformations of the tetrad and 
coordinates, leave 6 physical degrees of freedom in spacetime, the same as in the coordinate approach to 
general relativity, which is as it should be. 


11.8 Directed derivatives 


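Directed derivatives $\partial_m$ are defined to be the directional derivatives along the axes $\boldsymbol{\gamma}_m$

$$ \partial_m \equiv \boldsymbol{\gamma}_m \cdot \boldsymbol{\partial} = \boldsymbol{\gamma}_m \cdot \boldsymbol{e}^\mu \frac{\partial}{\partial x^\mu} = e_m{}^\mu \frac{\partial}{\partial x^\mu} \quad \text{a tetrad 4-vector} . \tag{11.30} $$

The directed derivative $\partial_m$ is independent of the choice of coordinates, as signalled by the fact that it has
only a tetrad index, no coordinate index.

Unlike coordinate derivatives $\partial/\partial x^\mu$, directed derivatives $\partial_m$ do not commute. Their commutator is

$$ \begin{aligned} [ \partial_m , \partial_n ] &= \left[ e_m{}^\mu \frac{\partial}{\partial x^\mu} , \ e_n{}^\nu \frac{\partial}{\partial x^\nu} \right] \\ &= e_m{}^\mu \frac{\partial e_n{}^\nu}{\partial x^\mu} \frac{\partial}{\partial x^\nu} - e_n{}^\nu \frac{\partial e_m{}^\mu}{\partial x^\nu} \frac{\partial}{\partial x^\mu} \\ &= ( - d^k{}_{nm} + d^k{}_{mn} ) \, \partial_k \quad \text{not a tetrad tensor} \end{aligned} \tag{11.31} $$

where $d_{lmn} \equiv \gamma_{lk} \, d^k{}_{mn}$ is the vierbein derivative, defined in terms of the inverse vierbein by

$$ d_{lmn} \equiv - \gamma_{lk} \, e^k{}_\kappa \, e_n{}^\nu \frac{\partial e_m{}^\kappa}{\partial x^\nu} \quad \text{not a tetrad tensor} . \tag{11.32} $$

Since the vierbein and inverse vierbein are inverse to each other, an equivalent definition of $d_{lmn}$ in terms of
the vierbein is

$$ d_{lmn} \equiv \gamma_{lk} \, e_m{}^\mu \, e_n{}^\nu \frac{\partial e^k{}_\mu}{\partial x^\nu} \quad \text{not a tetrad tensor} . \tag{11.33} $$

The vierbein derivatives $d_{lmn}$ are also known as Ricci rotation coefficients (or, in the context of Newman-Penrose
tetrads, spin coefficients).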
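
The failure of directed derivatives to commute can be seen in the simplest possible setting. The sketch below is not taken from the text: it assumes the flat Euclidean plane in polar coordinates $(r, \theta)$ with the orthonormal frame $\partial_1 = \partial/\partial r$, $\partial_2 = (1/r) \, \partial/\partial\theta$, for which the only non-vanishing vierbein derivative in (11.33) is $d^2{}_{21} = 1/r$, so that the commutator (11.31) gives $[\partial_1, \partial_2] = -(1/r) \, \partial_2$. Derivatives are approximated by central differences with a hypothetical step `H`, and the test function `f` is an arbitrary choice.

```python
import math

H = 1e-4  # finite-difference step (an assumption of this sketch)


def d1(f):
    """Directed derivative along gamma_1, that is d/dr."""
    return lambda r, th: (f(r + H, th) - f(r - H, th)) / (2.0 * H)


def d2(f):
    """Directed derivative along gamma_2, that is (1/r) d/dtheta."""
    return lambda r, th: (f(r, th + H) - f(r, th - H)) / (2.0 * H * r)


def commutator(f):
    """[d1, d2] f = d1(d2 f) - d2(d1 f)."""
    a, b = d1(d2(f)), d2(d1(f))
    return lambda r, th: a(r, th) - b(r, th)


def f(r, th):
    """Arbitrary smooth test function."""
    return r ** 2 * math.sin(th)


r0, th0 = 2.0, 0.6
lhs = commutator(f)(r0, th0)   # numerical [d1, d2] f
rhs = -d2(f)(r0, th0) / r0     # -(1/r) d2 f, from (11.31) with d^2_21 = 1/r
```

For this test function the commutator equals $-\cos\theta$ analytically, which is non-zero even though the underlying coordinate derivatives $\partial/\partial r$ and $\partial/\partial\theta$ commute.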


11.9 Tetrad covariant derivative 


The derivation of tetrad covariant derivatives $D_m$ follows precisely the analogous derivation of coordinate
covariant derivatives $D_\mu$. The tetrad-frame formulae look entirely similar to the coordinate-frame formulae,
with the replacement of coordinate partial derivatives by directed derivatives, $\partial/\partial x^\mu \rightarrow \partial_m$, and the
replacement of coordinate-frame connections by tetrad-frame connections, $\Gamma^\kappa{}_{\mu\nu} \rightarrow \Gamma^k{}_{mn}$. There are two things
to be careful about: first, unlike coordinate partial derivatives, directed derivatives $\partial_m$ do not commute;
and second, neither tetrad-frame nor coordinate-frame connections are tensors, and therefore it should be
no surprise that the tetrad-frame connections $\Gamma_{lmn}$ are not related to the coordinate-frame connections
$\Gamma_{\lambda\mu\nu}$ by the ‘usual’ vierbein transformations. Rather, the tetrad and coordinate connections are related by
equation (11.44).

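If $\Phi$ is a scalar, then $\partial_m \Phi$ is a tetrad 4-vector. The tetrad covariant derivative of a scalar is just the directed
derivative

$$ D_m \Phi = \partial_m \Phi \quad \text{a tetrad 4-vector} . \tag{11.34} $$

If $A^m$ is a tetrad 4-vector, then $\partial_n A^m$ is not a tetrad tensor, and $\partial_n A_m$ is not a tetrad tensor. But the
abstract 4-vector $\boldsymbol{A} = \boldsymbol{\gamma}_m A^m$, being by construction invariant under both tetrad and coordinate
transformations, is a scalar, and its directed derivative is therefore a 4-vector,

$$ \begin{aligned} \partial_n \boldsymbol{A} &= \partial_n ( \boldsymbol{\gamma}_m A^m ) \quad \text{a tetrad 4-vector} \\ &= \boldsymbol{\gamma}_m \, \partial_n A^m + ( \partial_n \boldsymbol{\gamma}_m ) A^m \ . \end{aligned} \tag{11.35} $$

For equation (11.35) to make sense, the derivatives $\partial_n \boldsymbol{\gamma}_m$ must be defined, something that is made possible,
as in the coordinate approach in §2.9.2, by the postulate of the existence of locally inertial frames. The
coordinate partial derivatives of $\boldsymbol{\gamma}_m$ are defined in the usual way by

$$ \frac{\partial \boldsymbol{\gamma}_m}{\partial x^\mu} \equiv \lim_{\delta x^\mu \rightarrow 0} \frac{ \boldsymbol{\gamma}_m ( x^0 , ..., x^\mu + \delta x^\mu , ..., x^3 ) - \boldsymbol{\gamma}_m ( x^0 , ..., x^\mu , ..., x^3 ) }{ \delta x^\mu } \ . \tag{11.36} $$

The right hand side of equation (11.36) involves the difference between $\boldsymbol{\gamma}_m$ at two different points $x$ and $x + \delta x$.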
The difference is to be interpreted as $\boldsymbol{\gamma}_m(x + \delta x)$ at the shifted point, minus the value of $\boldsymbol{\gamma}_m(x)$
parallel-transported from position $x$ to the shifted point $x + \delta x$ along the small distance $\delta x$ between them, as illustrated
in Figure 11.2. Parallel transport means, go to a locally inertial frame, then move along the prescribed
direction without boosting or precessing. With the coordinate partial derivatives of the tetrad basis vectors
so defined, the directed derivatives follow as $\partial_n \boldsymbol{\gamma}_m = e_n{}^\nu \, \partial \boldsymbol{\gamma}_m / \partial x^\nu$.

Figure 11.2 The change $\delta \boldsymbol{\gamma}_0$ in the tetrad vector $\boldsymbol{\gamma}_0$ over a small coordinate interval $\delta x^1$ of spacetime is defined to be
the difference between the tetrad vector $\boldsymbol{\gamma}_0(x^1 + \delta x^1)$ at the shifted position $x^1 + \delta x^1$ and the tetrad vector $\boldsymbol{\gamma}_0(x^1)$
at the original position $x^1$, parallel-transported to the shifted position. The parallel-transported vector is shown as a
dashed arrowed line. The parallel transport is defined with respect to a locally inertial frame, shown as a background
square grid aligned with the tetrad at the unshifted position.

The directed derivatives of the tetrad basis vectors define the tetrad-frame connection coefficients,
$\Gamma^k{}_{mn}$,

$$ \partial_n \boldsymbol{\gamma}_m \equiv \Gamma^k{}_{mn} \, \boldsymbol{\gamma}_k \quad \text{not a tetrad tensor} . \tag{11.37} $$

In the usual case where the tetrad metric is Lorentz invariant and the tetrad connections $\Gamma_{kmn}$ are therefore
generators of Lorentz transformations, antisymmetric in their first two indices, Exercise 11.2, I like to call the
tetrad connection coefficients Lorentz connections. With equation (11.37), equation (11.35) then shows
that

$$ \partial_n \boldsymbol{A} = \boldsymbol{\gamma}_k ( D_n A^k ) \quad \text{a tetrad tensor} , \tag{11.38} $$

where $D_n A^k$ is the covariant derivative of the contravariant 4-vector $A^k$

$$ D_n A^k \equiv \partial_n A^k + \Gamma^k{}_{mn} A^m \quad \text{a tetrad tensor} . \tag{11.39} $$

The covariant derivative of a covariant tetrad 4-vector $A_k$ follows similarly from

$$ \partial_n \boldsymbol{A} = \boldsymbol{\gamma}^k ( D_n A_k ) \quad \text{a tetrad tensor} , \tag{11.40} $$

where $D_n A_k$ is the covariant derivative of the covariant 4-vector $A_k$

$$ D_n A_k \equiv \partial_n A_k - \Gamma^m{}_{kn} A_m \quad \text{a tetrad tensor} . \tag{11.41} $$

In general, the covariant derivative of a tetrad-frame tensor is

$$ D_p A^{kl...}_{mn...} = \partial_p A^{kl...}_{mn...} + \Gamma^k{}_{qp} A^{ql...}_{mn...} + \Gamma^l{}_{qp} A^{kq...}_{mn...} + ... - \Gamma^q{}_{mp} A^{kl...}_{qn...} - \Gamma^q{}_{np} A^{kl...}_{mq...} - ... \tag{11.42} $$

with a positive $\Gamma$ term for each contravariant index, and a negative $\Gamma$ term for each covariant index.


11.10 Relation between tetrad and coordinate connections 


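The relation between the tetrad connections $\Gamma^k{}_{mn}$ and their coordinate counterparts $\Gamma^\kappa{}_{\mu\nu}$ follows from

$$ \begin{aligned} \frac{\partial \boldsymbol{e}_\mu}{\partial x^\nu} &= \Gamma^\kappa{}_{\mu\nu} \, \boldsymbol{e}_\kappa \quad \text{not a tetrad tensor} \\ &= \frac{\partial ( e^m{}_\mu \boldsymbol{\gamma}_m )}{\partial x^\nu} = \frac{\partial e^m{}_\mu}{\partial x^\nu} \, \boldsymbol{\gamma}_m + e^m{}_\mu \frac{\partial \boldsymbol{\gamma}_m}{\partial x^\nu} \\ &= e^m{}_\mu \, e^n{}_\nu \, ( d^k{}_{mn} + \Gamma^k{}_{mn} ) \, \boldsymbol{\gamma}_k \ . \end{aligned} \tag{11.43} $$

Thus the relation is

$$ d_{lmn} + \Gamma_{lmn} = e_l{}^\lambda \, e_m{}^\mu \, e_n{}^\nu \, \Gamma_{\lambda\mu\nu} \quad \text{not a tetrad tensor} \tag{11.44} $$

where

$$ \Gamma_{lmn} \equiv \gamma_{lk} \, \Gamma^k{}_{mn} \ . \tag{11.45} $$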


11.11 Antisymmetry of the tetrad connections 


The directed derivative of the tetrad metric is 


$$ \begin{aligned} \partial_n \gamma_{lm} &= \partial_n ( \boldsymbol{\gamma}_l \cdot \boldsymbol{\gamma}_m ) \\ &= \boldsymbol{\gamma}_l \cdot \partial_n \boldsymbol{\gamma}_m + \boldsymbol{\gamma}_m \cdot \partial_n \boldsymbol{\gamma}_l \\ &= \Gamma_{lmn} + \Gamma_{mln} \ . \end{aligned} \tag{11.46} $$

In most cases of interest, including orthonormal, spin, and null tetrads, the tetrad metric is chosen to be a
constant. For example, if the tetrad is orthonormal, then the tetrad metric is the Minkowski metric, which
is constant, the same everywhere. If the tetrad metric is constant, then all derivatives of the tetrad metric
vanish, and then equation (11.46) shows that the tetrad connections are antisymmetric in their first two
indices

$$ \Gamma_{lmn} = - \Gamma_{mln} \ . \tag{11.47} $$

This antisymmetry reflects the fact that $\Gamma_{lmn}$ is the generator of a Lorentz transformation for each $n$,
Exercise 11.2.
11.12 Torsion tensor


The torsion tensor $S^m{}_{kl}$, which general relativity assumes to vanish, is defined in the usual way,
equation (2.57), by the commutator of the covariant derivative acting on a scalar $\Phi$

$$ [ D_k , D_l ] \, \Phi \equiv S^m{}_{kl} \, \partial_m \Phi \quad \text{a tetrad tensor} . \tag{11.48} $$

The expression (11.41) for the covariant derivatives coupled with the commutator (11.31) of directed
derivatives shows that the torsion tensor is

$$ S^m{}_{kl} = d^m{}_{kl} + \Gamma^m{}_{kl} - d^m{}_{lk} - \Gamma^m{}_{lk} \quad \text{a tetrad tensor} , \tag{11.49} $$

which is equivalent to the coordinate expression (2.58) for the torsion in view of the relation (11.44) between
tetrad and coordinate connections. The torsion tensor $S^m{}_{kl}$ is antisymmetric in $k \leftrightarrow l$, as is evident from its
definition (11.48).


11.13 No-torsion condition 


General relativity assumes vanishing torsion 


$$ S^m{}_{kl} = 0 \ . \tag{11.50} $$

For vanishing torsion, equation (11.49) implies

$$ d_{mkl} + \Gamma_{mkl} = d_{mlk} + \Gamma_{mlk} \quad \text{not a tetrad tensor} , \tag{11.51} $$

which is equivalent to the usual symmetry condition $\Gamma^\kappa{}_{\mu\nu} = \Gamma^\kappa{}_{\nu\mu}$ on the coordinate frame connections in
view of the relation (11.44) between tetrad and coordinate connections.


11.14 Tetrad connections in terms of the vierbein 


In the general case of non-constant tetrad metric, and non-vanishing torsion, the following manipulation,
from equations (11.46) and (11.49), analogous to the corresponding manipulation (2.61) in the coordinate
frame,

$$ \begin{aligned} \partial_n \gamma_{lm} + \partial_m \gamma_{ln} - \partial_l \gamma_{mn} &= \Gamma_{lmn} + \Gamma_{mln} + \Gamma_{lnm} + \Gamma_{nlm} - \Gamma_{mnl} - \Gamma_{nml} \\ &= 2 \Gamma_{lmn} + S_{lnm} + S_{mln} + S_{nlm} - d_{lnm} + d_{lmn} - d_{mln} + d_{mnl} - d_{nlm} + d_{nml} \end{aligned} \tag{11.52} $$

implies that the tetrad connections $\Gamma_{lmn}$ are given in terms of the derivatives $\partial_n \gamma_{lm}$ of the tetrad metric,
the torsion $S_{lmn}$, and the vierbein derivatives $d_{lmn}$ by

$$ \begin{aligned} \Gamma_{lmn} = \tfrac{1}{2} \, ( & \partial_n \gamma_{lm} + \partial_m \gamma_{ln} - \partial_l \gamma_{mn} + S_{lmn} + S_{mnl} + S_{nml} \\ & + d_{lnm} - d_{lmn} + d_{mln} - d_{mnl} + d_{nlm} - d_{nml} ) \quad \text{not a tetrad tensor} . \end{aligned} \tag{11.53} $$

If torsion vanishes, as general relativity assumes, and if furthermore the tetrad metric is constant, then
equation (11.53) simplifies to the following expression for the tetrad connections in terms of the vierbein
derivatives $d_{lmn}$ defined by (11.33), analogous to the expression (2.63) for coordinate-frame connections in
terms of coordinate derivatives of the metric,

$$ \Gamma_{lmn} = \tfrac{1}{2} \, ( d_{lnm} - d_{lmn} + d_{mln} - d_{mnl} + d_{nlm} - d_{nml} ) \quad \text{not a tetrad tensor} . \tag{11.54} $$


This is the formula that allows tetrad connections to be calculated from the vierbein. 
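
As a hedged illustration of this recipe, the sketch below evaluates the vierbein derivatives (11.33) by central differences and assembles the connections from (11.54), for flat (Minkowski) space in spherical polar coordinates $\{t, r, \theta, \phi\}$ with the diagonal vierbein $\{1, 1, r, r \sin\theta\}$. The step size `H`, the sample point, and all function names are assumptions of this illustration. The results can be compared with the values $\Gamma_{212} = \Gamma_{313} = 1/r$ and $\Gamma_{323} = \cot\theta / r$ quoted in Concept question 11.8.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the orthonormal tetrad metric
H = 1e-6                     # finite-difference step (an assumption)


def vierbein(x):
    """e^k_mu as a matrix [k][mu] for flat space in coordinates (t, r, theta, phi)."""
    _, r, th, _ = x
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, r, 0.0],
            [0.0, 0.0, 0.0, r * math.sin(th)]]


def inverse_vierbein(x):
    """e_m^mu as a matrix [m][mu]; the vierbein is diagonal, so entries are reciprocals."""
    e = vierbein(x)
    return [[1.0 / e[m][mu] if m == mu else 0.0 for mu in range(4)] for m in range(4)]


def vierbein_derivatives(x):
    """d_lmn = gamma_lk e_m^mu e_n^nu (d e^k_mu / d x^nu), equation (11.33)."""
    einv = inverse_vierbein(x)
    de = []  # de[nu][k][mu] approximates d e^k_mu / d x^nu
    for nu in range(4):
        xp, xm = list(x), list(x)
        xp[nu] += H
        xm[nu] -= H
        ep, em = vierbein(xp), vierbein(xm)
        de.append([[(ep[k][mu] - em[k][mu]) / (2.0 * H) for mu in range(4)]
                   for k in range(4)])
    # The tetrad metric is diagonal, so gamma_lk picks out k = l.
    return [[[ETA[l] * sum(einv[m][mu] * einv[n][nu] * de[nu][l][mu]
                           for mu in range(4) for nu in range(4))
              for n in range(4)] for m in range(4)] for l in range(4)]


def tetrad_connections(x):
    """Gamma_lmn from equation (11.54): vanishing torsion, constant tetrad metric."""
    d = vierbein_derivatives(x)
    return [[[0.5 * (d[l][n][m] - d[l][m][n] + d[m][l][n]
                     - d[m][n][l] + d[n][l][m] - d[n][m][l])
              for n in range(4)] for m in range(4)] for l in range(4)]


x0 = (0.0, 2.0, 0.7, 0.3)       # hypothetical sample point (t, r, theta, phi)
Gamma = tetrad_connections(x0)  # Gamma[l][m][n] approximates Gamma_lmn
```

The array `Gamma` is antisymmetric in its first two indices by construction of (11.54), in agreement with equation (11.47).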


11.15 Torsion-free covariant derivative 


As in §2.12, the torsion-free part of the covariant derivative is a covariant derivative even when torsion is
present. When torsion is present and it is desirable to make the torsion part explicit, it is convenient to
distinguish torsion-free quantities with a ring (°) overscript. The torsion-full tetrad connection $\Gamma_{lmn}$ is a sum of
the torsion-free (Levi-Civita) connection $\mathring{\Gamma}_{lmn}$ and the contortion tensor $K_{lmn}$,

$$ \Gamma_{lmn} = \mathring{\Gamma}_{lmn} + K_{lmn} \ , \tag{11.55} $$

where from equation (11.53) the contortion tensor $K_{lmn}$ and the torsion tensor $S_{lmn}$ are related by

$$ K_{lmn} = \tfrac{1}{2} ( S_{lmn} - S_{mln} + S_{nml} ) = - S_{nlm} + \tfrac{3}{2} S_{[lmn]} \quad \text{a tetrad tensor} , \tag{11.56a} $$
$$ S_{lmn} = K_{lmn} - K_{lnm} = - K_{mnl} + 3 K_{[lmn]} \quad \text{a tetrad tensor} . \tag{11.56b} $$

Like the tetrad connection $\Gamma_{lmn}$, the contortion $K_{lmn}$ is antisymmetric in its first two indices. The torsion-full
covariant derivative $D_n$ differs from the torsion-free covariant derivative $\mathring{D}_n$ by the contortion,

$$ D_n A^k \equiv \mathring{D}_n A^k + K^k{}_{mn} A^m \quad \text{a tetrad tensor} . \tag{11.57} $$

In this book the symbol $D_n$ by default denotes the torsion-full covariant derivative. In some places however,
such as in the theory of differential forms, the symbol $D_n$ is used for brevity to denote the torsion-free
covariant derivative, even in the presence of torsion. When $D_n$ denotes the torsion-free covariant derivative,
it will be stated so explicitly.


11.16 Riemann curvature tensor 


The Riemann curvature tensor $R_{klmn}$ is defined in the usual way, equation (2.110), by the commutator
of the covariant derivative acting on a 4-vector. In the presence of torsion,

$$ [ D_k , D_l ] A_m \equiv S^n{}_{kl} D_n A_m + R_{klmn} A^n \quad \text{a tetrad tensor} . \tag{11.58} $$

If torsion vanishes, as general relativity assumes, then the definition (11.58) reduces to

$$ [ D_k , D_l ] A_m \equiv R_{klmn} A^n \quad \text{a tetrad tensor} . \tag{11.59} $$

The expression (11.41) for the covariant derivative coupled with the torsion equation (11.48) yields the
following formula for the tetrad-frame Riemann tensor in terms of tetrad connections, for the general case of
non-vanishing torsion:

$$ R_{klmn} = \partial_k \Gamma_{mnl} - \partial_l \Gamma_{mnk} + \Gamma^p{}_{ml} \Gamma_{pnk} - \Gamma^p{}_{mk} \Gamma_{pnl} + ( \Gamma^p{}_{kl} - \Gamma^p{}_{lk} - S^p{}_{kl} ) \Gamma_{mnp} \quad \text{a tetrad tensor} . \tag{11.60} $$

The formula has extra terms $( \Gamma^p{}_{kl} - \Gamma^p{}_{lk} - S^p{}_{kl} ) \Gamma_{mnp}$ compared to the formula (2.112) for the coordinate-frame
Riemann tensor $R_{\kappa\lambda\mu\nu}$. If torsion vanishes, as general relativity assumes, then

$$ R_{klmn} = \partial_k \Gamma_{mnl} - \partial_l \Gamma_{mnk} + \Gamma^p{}_{ml} \Gamma_{pnk} - \Gamma^p{}_{mk} \Gamma_{pnl} + ( \Gamma^p{}_{kl} - \Gamma^p{}_{lk} ) \Gamma_{mnp} \quad \text{a tetrad tensor} . \tag{11.61} $$

The symmetries of the tetrad-frame Riemann tensor are the same as those of the coordinate-frame Riemann
tensor. For vanishing torsion, these are

$$ R_{klmn} = R_{([kl][mn])} \ , \tag{11.62a} $$
$$ R_{k[lmn]} = 0 \ . \tag{11.62b} $$


Exercise 11.3. Riemann tensor. From the definition (11.58), derive the expression (11.60) for the Riemann
tensor. [Hint: Start by expanding out the definition (11.58) using the definition (11.42) of the covariant
derivative. You will find it easier to derive an expression for the Riemann tensor with one index raised, such
as $R_{klm}{}^n$, but you should resist the temptation to leave it there, because the symmetries of the Riemann
tensor are obscured when one index is raised. To switch to all lowered indices, you will need to convert terms
such as $\partial_k \Gamma^n{}_{ml}$ by

$$ \partial_k \Gamma^n{}_{ml} = \partial_k ( \gamma^{np} \Gamma_{pml} ) = \gamma^{np} \, \partial_k \Gamma_{pml} + \Gamma_{pml} \, \partial_k \gamma^{np} \ . \tag{11.63} $$

You should show that the directed derivative $\partial_k \gamma^{np}$ in this expression is related to tetrad connections through
a formula similar to equation (11.46),

$$ \partial_k \gamma^{np} = - \Gamma^{np}{}_k - \Gamma^{pn}{}_k \ , \tag{11.64} $$

which you should recognize as equivalent to $D_k \gamma^{np} = 0$. To complete the derivation, show that

$$ \partial_k ( \Gamma_{mnl} + \Gamma_{nml} ) - \partial_l ( \Gamma_{mnk} + \Gamma_{nmk} ) = [ \partial_k , \partial_l ] \, \gamma_{mn} = ( \Gamma^p{}_{lk} - \Gamma^p{}_{kl} + S^p{}_{kl} ) ( \Gamma_{mnp} + \Gamma_{nmp} ) \ . \tag{11.65} $$

Equation (11.65) implies the antisymmetry of $R_{klmn}$ in $mn$.]


Exercise 11.4. Antisymmetry of the Riemann tensor. Argue that the antisymmetry of $R_{klmn}$ in $mn$,
with or without torsion, can be deduced from

$$ 0 = [ D_k , D_l ] \, \gamma_{mn} = S^p{}_{kl} D_p \gamma_{mn} + R_{klmp} \, \delta^p_n + R_{klnp} \, \delta^p_m = R_{klmn} + R_{klnm} \ . \tag{11.66} $$


Exercise 11.5. Cyclic symmetry of the Riemann tensor. Show that the cyclic symmetry (11.62b) is
a consequence of the assumption of vanishing torsion.

Solution. Use the Jacobi identity applied to a scalar, $[ D_{[k} , [ D_l , D_{m]} ] ] \, \Phi = 0$. Show that if $\Phi$ is a scalar, then

$$ \begin{aligned} 2 D_{[k} D_l D_{m]} \Phi &= [ D_{[k} , D_l ] D_{m]} \Phi = ( R_{[klm]}{}^n - S^p{}_{[kl} S^n{}_{m]p} ) D_n \Phi + S^n{}_{[kl} D_{m]} D_n \Phi \\ &= D_{[k} [ D_l , D_{m]} ] \Phi = ( D_{[k} S^n{}_{lm]} ) D_n \Phi + S^n{}_{[kl} D_{m]} D_n \Phi \ . \end{aligned} \tag{11.67} $$

Consequently

$$ R_{[klm]}{}^n = D_{[k} S^n{}_{lm]} + S^p{}_{[kl} S^n{}_{m]p} \ . \tag{11.68} $$

An equivalent expression in terms of the torsion-free covariant derivative $\mathring{D}_k$ and the contortion $K_{mnl}$ is

$$ R_{[klm]}{}^n = \mathring{D}_{[k} S^n{}_{lm]} + K^n{}_{p[k} S^p{}_{lm]} \ . \tag{11.69} $$


Exercise 11.6. Symmetry of the Riemann tensor. Show that the cyclic symmetry (11.62b) implies the
symmetry $kl \leftrightarrow mn$, given the antisymmetries $k \leftrightarrow l$ and $m \leftrightarrow n$. Given Exercise 11.5, this shows that the
symmetry $kl \leftrightarrow mn$ is, like the cyclic symmetry, a consequence of vanishing torsion.

Solution. Show that

$$ 2 ( R_{klmn} - R_{mnkl} ) = 3 ( R_{k[lmn]} - R_{l[kmn]} - R_{m[nkl]} + R_{n[mkl]} ) \ , \tag{11.70} $$

or alternatively,

$$ 2 ( R_{klmn} - R_{mnkl} ) = 3 ( R_{[klm]n} - R_{[kln]m} - R_{[mnk]l} + R_{[mnl]k} ) \ . \tag{11.71} $$


Exercise 11.7. Number of components of the Riemann tensor. How many independent components 
does the Riemann tensor have, in 4-dimensional spacetime? 

Solution. If torsion vanishes, 20. If torsion does not vanish, 36. The extra 16 components come from $R_{[klm]n}$,
which is related to torsion by equation (11.68), and which has 4 × 4 = 16 components if torsion does not
vanish.
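
The counting can be reproduced for any number $n$ of spacetime dimensions. The sketch below is an illustration rather than part of the solution; the function names are hypothetical. It counts antisymmetric index pairs for the torsion-full Riemann tensor, subtracts the number of components of $R_{[klm]n}$ removed by the cyclic symmetry (11.62b), and compares the result with the standard closed form $n^2 (n^2 - 1) / 12$ for the torsion-free case.

```python
from math import comb


def riemann_components_with_torsion(n):
    """Antisymmetric in kl and in mn only: one component per pair of index pairs."""
    pairs = comb(n, 2)  # number of antisymmetric index pairs
    return pairs * pairs


def cyclic_constraints(n):
    """Components of R_[klm]n: totally antisymmetric in klm, with n free."""
    return comb(n, 3) * n


def riemann_components_torsion_free(n):
    """Components left after imposing the cyclic symmetry R_k[lmn] = 0."""
    return riemann_components_with_torsion(n) - cyclic_constraints(n)


counts = {n: (riemann_components_torsion_free(n),
              riemann_components_with_torsion(n))
          for n in (2, 3, 4)}
```

For $n = 4$ this gives 20 components without torsion and 36 with torsion, as in the solution; for $n = 2$ it gives a single component in either case.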


Concept question 11.8. Must connections vanish if Riemann vanishes? Must the tetrad connections
$\Gamma_{lmn}$ vanish if the Riemann tensor vanishes identically, $R_{klmn} = 0$? Answer. No. For a counterexample, take
flat (Minkowski) space expressed in spherical polar coordinates $\{t, r, \theta, \phi\}$. The non-vanishing tetrad-frame
connections are $\Gamma_{212} = \Gamma_{313} = 1/r$ and $\Gamma_{323} = \cot\theta / r$ (compare equations (20.23)).


11.16.1 Riemann tensor in a mixed coordinate-tetrad frame 


In Chapter 16, Einstein’s equations will be obtained from an action principle, as first done by Hilbert (1915). 
The Hilbert Lagrangian takes a particularly insightful form if the Riemann tensor is expressed in a mixed 
coordinate-tetrad basis. 

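The coordinate-frame covariant derivative $D_\kappa$ of a tetrad-frame vector $a_n$ is

$$ D_\kappa a_n \equiv e^k{}_\kappa D_k a_n = \frac{\partial a_n}{\partial x^\kappa} - \Gamma^m{}_{n\kappa} a_m \quad \text{a coordinate-tetrad tensor} , \tag{11.72} $$

where $\Gamma^m{}_{n\kappa}$ is the tetrad-frame connection with its last index converted into the coordinate frame with the
vierbein,

$$ \Gamma^m{}_{n\kappa} \equiv e^k{}_\kappa \, \Gamma^m{}_{nk} \quad \text{a coordinate vector, but not a tetrad tensor} . \tag{11.73} $$

As usual, the connection with all indices lowered is defined by $\Gamma_{ln\kappa} \equiv \gamma_{lm} \Gamma^m{}_{n\kappa}$. The connections $\Gamma_{mn\kappa}$ should
not be confused with the coordinate-frame connections (Christoffel symbols) $\Gamma_{\mu\nu\kappa}$. The relation between the
two is, from equation (11.44),

$$ \Gamma_{mn\kappa} = - e^k{}_\kappa \, d_{mnk} + e_m{}^\mu \, e_n{}^\nu \, \Gamma_{\mu\nu\kappa} \ . \tag{11.74} $$

In 4 dimensions there are 6 × 4 = 24 distinct connections $\Gamma_{mn\kappa}$ (with or without torsion), whereas there are
4 × 10 = 40 distinct coordinate-frame connections $\Gamma_{\mu\nu\kappa}$ (without torsion, or 4 × 4 × 4 = 64 with torsion).

The last term on the right hand side of equation (11.60) for the Riemann tensor can be written, in view
of equations (11.49) and (11.32),

$$ ( \Gamma^p{}_{kl} - \Gamma^p{}_{lk} - S^p{}_{kl} ) \Gamma_{mnp} = ( \partial_l e_k{}^\kappa - \partial_k e_l{}^\kappa ) \Gamma_{mn\kappa} \ . \tag{11.75} $$

The Riemann tensor $R_{\kappa\lambda mn}$ in the mixed coordinate-tetrad basis is then

$$ R_{\kappa\lambda mn} = \frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma_{mn\kappa}}{\partial x^\lambda} + \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} - \Gamma^p{}_{m\kappa} \Gamma_{pn\lambda} \quad \text{a coordinate-tetrad tensor} , \tag{11.76} $$

which is valid with or without torsion. Equation (11.76) resembles superficially the coordinate-frame
expression (2.112) for the Riemann tensor, but it is more economical in that there are only 24 connections $\Gamma_{mn\kappa}$
instead of the 40 (or 64, with torsion) coordinate-frame connections $\Gamma_{\mu\nu\kappa}$.

The torsion $S^m{}_{\kappa\lambda}$ in the mixed coordinate-tetrad basis is

$$ S^m{}_{\kappa\lambda} = \frac{\partial e^m{}_\kappa}{\partial x^\lambda} - \frac{\partial e^m{}_\lambda}{\partial x^\kappa} + \Gamma^m{}_{k\lambda} \, e^k{}_\kappa - \Gamma^m{}_{k\kappa} \, e^k{}_\lambda \quad \text{a coordinate-tetrad tensor} . \tag{11.77} $$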


Equations (11.76) and (11.77) constitute Cartan’s equations of structure (Cartan, 1904) (see §16.14.2). 


11.17 Ricci, Einstein, Bianchi 


The usual suite of formulae leading to Einstein’s equations apply. Since all the quantities are tensors, and 
all the equations are tensor equations, their form follows immediately from their coordinate counterparts. 
Ricci tensor: 


$$ R_{km} \equiv \gamma^{ln} R_{klmn} \ . \tag{11.78} $$


Ricci scalar:


$$ R \equiv \gamma^{km} R_{km} \ . \tag{11.79} $$


Einstein tensor:


$$ G_{km} \equiv R_{km} - \tfrac{1}{2} \gamma_{km} R \ . \tag{11.80} $$


Einstein’s equations:


$$ G_{km} = 8 \pi G \, T_{km} \ . \tag{11.81} $$


The trace of the Einstein equations implies that $R = - 8 \pi G \, T$, so the Einstein equations (11.81) can equally
well be written with the trace terms transferred from the left to the right hand side,

$$ R_{km} = 8 \pi G \left( T_{km} - \tfrac{1}{2} \gamma_{km} T \right) \ . \tag{11.82} $$

Bianchi identities in the absence of torsion:

$$ D_k R_{lmnp} + D_l R_{mknp} + D_m R_{klnp} = 0 \ , \tag{11.83} $$

which most importantly imply covariant conservation of the Einstein tensor, hence conservation of
energy-momentum

$$ D^k T_{km} = 0 \ . \tag{11.84} $$


11.18 Expressions with torsion 


If torsion does not vanish, then the Riemann tensor, and consequently also the Ricci and Einstein tensors,
can be split into torsion-free (distinguished by a ring (°) overscript) and torsion parts (e.g. Hehl, Heyde, and Kerlick
1976). A similar split occurs in the ADM formalism where a certain gauge choice (fixing the time component
$\boldsymbol{\gamma}_0$ of the tetrad to be orthogonal to hypersurfaces of constant time) splits the tetrad connection into a tensor
part, the extrinsic curvature, and a remainder, equation (17.27).

The contortion tensor $K_{lmn}$ was defined previously as the torsion part of the connection $\Gamma_{lmn}$,
equation (11.55). The unique non-vanishing contraction of the contortion tensor defines the contortion vector
$K_m$,

$$ K_m \equiv K^n{}_{mn} = S^n{}_{mn} \ . \tag{11.85} $$

The torsion-full Riemann tensor $R_{klmn}$ is a sum of the torsion-free Riemann tensor $\mathring{R}_{klmn}$ and a torsion
part (note that $K_{pkl} - K_{plk} - S_{pkl} = 0$, so the “extra” term in $R_{klmn}$, equation (11.60), vanishes when the
connection is the contortion $K_{pkl}$),

$$ R_{klmn} = \mathring{R}_{klmn} + \mathring{D}_k K_{mnl} - \mathring{D}_l K_{mnk} + K^p{}_{ml} K_{pnk} - K^p{}_{mk} K_{pnl} \ . \tag{11.86} $$

The Ricci tensor is

$$ R_{km} = \mathring{R}_{km} - \mathring{D}_k K_m - \mathring{D}^n K_{mnk} + K_{mpn} K^{np}{}_k - K_{mpk} K^p \ , \tag{11.87} $$

and the Ricci scalar is

$$ R = \mathring{R} - 2 \mathring{D}_m K^m + K_{mnp} K^{pnm} - K_m K^m \ . \tag{11.88} $$

The antisymmetric part of the Einstein tensor is, from contracting equation (11.68),

$$ G_{[km]} = R_{[km]} = \tfrac{3}{2} R_{[klm]}{}^l = \tfrac{3}{2} \left( D_{[k} S^l{}_{lm]} + S^p{}_{[kl} S^l{}_{m]p} \right) \ , \tag{11.89} $$

which vanishes for vanishing torsion.

The Jacobi identity (2.126) implies, in addition to the 16 conditions (11.68), the 24 Bianchi identities

$$ D_{[k} R_{lm]np} + S^q{}_{[kl} R_{m]qnp} = 0 \ . \tag{11.90} $$

The doubly-contracted Bianchi identities are

$$ - \tfrac{3}{2} \gamma^{kn} \gamma^{lp} \left( D_{[k} R_{lm]np} + S^q{}_{[kl} R_{m]qnp} \right) = D^k G_{mk} - \tfrac{1}{2} S^q{}_{kl} R_{mq}{}^{kl} - S^k{}_{lm} R_k{}^l = 0 \ . \tag{11.91} $$


11.19 General relativity in 2 spacetime dimensions 


General relativity in 2 spacetime dimensions is weird. There are zero Bianchi identities (2.128) in 2 spacetime 
dimensions, so the Bianchi identities do not identify any covariantly conserved tensor. The Einstein tensor 
itself vanishes identically in 2 spacetime dimensions. 

There are consistent extensions of general relativity in 2 spacetime dimensions, such as string-inspired 
dilaton gravity (Grumiller, Kummer, and Vassilevich, 2002). However, those will not be considered here. 

Historically, the main application of 2-dimensional relativity has been to explore quantum field theory 
in curved spacetime, since in 2 spacetime dimensions the quantum energy-momentum tensor induced by 
any prescribed geometry can be calculated exactly (even though the classical energy-momentum tensor is 
indeterminate). 

The closest thing to a consistent realisation of classical general relativity in 2 spacetime dimensions is as 
follows. 

In 2 spacetime dimensions, the Riemann tensor has just one distinct component, $R_{0101}$, and that component is determined entirely by the Ricci scalar $R$. The tetrad-frame Riemann and Ricci tensors are related to the Ricci scalar $R$ by

$$ R_{klmn} = \tfrac12 \left( \gamma_{km} \gamma_{ln} - \gamma_{kn} \gamma_{lm} \right) R , \quad R_{km} = \tfrac12 \gamma_{km} R . \qquad (11.92) $$


In an arbitrary number of $N$ spacetime dimensions, contracting the Einstein equations implies that the Ricci scalar $R$ is proportional to the trace $T$ of the energy-momentum tensor,

$$ \left( 1 - \tfrac12 N \right) R = \kappa_N T , \qquad (11.93) $$

where $\kappa_N$ is Newton’s gravitational constant in $N$ spacetime dimensions, suitably normalized. For $N = 2$, the factor on the left of equation (11.93) vanishes; but one can imagine absorbing the zero factor into a redefinition of the gravitational constant $\kappa_N$, so that

$$ R = \kappa_2 T \qquad (11.94) $$

for some $\kappa_2$. Now impose that the energy-momentum tensor $T_{km}$ is covariantly conserved,

$$ D^k T_{km} = 0 . \qquad (11.95) $$


In N = 2 spacetime dimensions, the trace relation (11.94) together with the covariant conservation condi- 
tion (11.95) imply almost uniquely the form of the energy-momentum tensor. 

The conserved energy-momentum tensor $T_{km}$ takes its simplest expression when the metric is expressed in conformally flat form. The metric in $N = 2$ spacetime dimensions is a symmetric $2 \times 2$ matrix. By a suitable coordinate transformation of the 2 coordinates, the metric can be brought to the conformally flat form

$$ ds^2 = e^{2\xi} \left( - dt^2 + dx^2 \right) = - e^{2\xi} \, dv \, du , \qquad (11.96) $$

where $v \equiv t + x$ and $u \equiv t - x$ are null coordinates, and $\xi$ is a function of the two coordinates. The Newman-Penrose tetrad-frame components of the conserved energy-momentum tensor $T_{km}$ are then

$$ - \frac{R}{2} = - 4 e^{-2\xi} \frac{\partial^2 \xi}{\partial v \, \partial u} = \kappa_2 T_{vu} , \qquad (11.97a) $$

$$ 4 e^{-2\xi} \left[ \frac{\partial^2 \xi}{\partial v^2} - \left( \frac{\partial \xi}{\partial v} \right)^2 + f^+(v) \right] = \kappa_2 T_{vv} , \qquad (11.97b) $$

$$ 4 e^{-2\xi} \left[ \frac{\partial^2 \xi}{\partial u^2} - \left( \frac{\partial \xi}{\partial u} \right)^2 + f^-(u) \right] = \kappa_2 T_{uu} , \qquad (11.97c) $$

where $f^+(v)$ and $f^-(u)$ are arbitrary functions of respectively $v$ and $u$. There is a residual gauge freedom $v \to V(v)$ and $u \to U(u)$ in the choice of null coordinates that allows the conformal function to be adjusted $\xi \to \xi + \xi^+(v) + \xi^-(u)$ by arbitrary additive functions of $v$ and $u$. This residual gauge freedom allows the functions $f^+(v)$ and $f^-(u)$ in equations (11.97b) and (11.97c) to be adjusted arbitrarily. If desired, $f^+(v)$ and $f^-(u)$ can be set to zero.
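
As a rough check on the form of equations (11.97), the following sketch works in coordinate-frame (not tetrad-frame) components for the metric $ds^2 = - e^{2\xi} \, dv \, du$, writing $\xi$ for the conformal function and $\kappa_2$ for the rescaled gravitational constant. It assumes the standard result $R = 8 e^{-2\xi} \partial_v \partial_u \xi$ for this metric, for which the only non-vanishing Christoffel symbols are $\Gamma^v_{vv} = 2 \partial_v \xi$ and $\Gamma^u_{uu} = 2 \partial_u \xi$. The tetrad normalization $\gamma_v = \sqrt{2} \, e^{-\xi} \partial_v$, $\gamma_u = \sqrt{2} \, e^{-\xi} \partial_u$ is likewise an assumption made here, chosen so that $\gamma_v \cdot \gamma_u = -1$.

```latex
% Coordinate-frame components throughout; T_{vv}, T_{vu} are NOT yet tetrad components.
\begin{align*}
  g_{vu} &= -\tfrac12 e^{2\xi} , \qquad g^{vu} = -2 e^{-2\xi} , \\
  T &= 2 g^{vu} T_{vu} = -4 e^{-2\xi} T_{vu}
     \quad\Rightarrow\quad
     \kappa_2 T_{vu} = -\tfrac14 e^{2\xi} R = -2 \, \partial_v \partial_u \xi , \\
  0 &= \nabla^\mu T_{\mu v}
     = g^{vu} \left[ \partial_u T_{vv} + \partial_v T_{vu} - 2 (\partial_v \xi) \, T_{vu} \right] , \\
  \kappa_2 \, \partial_u T_{vv}
    &= 2 \, \partial_u \partial_v^2 \xi - 4 (\partial_v \xi) \, \partial_u \partial_v \xi
     = \partial_u \left[ 2 \, \partial_v^2 \xi - 2 (\partial_v \xi)^2 \right] , \\
  \kappa_2 T_{vv} &= 2 \left[ \partial_v^2 \xi - (\partial_v \xi)^2 + f^+(v) \right] .
\end{align*}
```

Multiplying the coordinate components by $2 e^{-2\xi}$ converts them to the assumed tetrad frame, which gives $\kappa_2 T_{vu} = -4 e^{-2\xi} \partial_v \partial_u \xi$ and $\kappa_2 T_{vv} = 4 e^{-2\xi} [ \partial_v^2 \xi - (\partial_v \xi)^2 + f^+(v) ]$. The same steps with $v$ and $u$ exchanged give $T_{uu}$.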

The classical 2-dimensional general relativity described by equations (11.97) is not very interesting; for 
example there is no 2-dimensional analogue of the Schwarzschild black hole, Exercise 11.9. 

Where equations (11.97) prove more interesting is that they also describe the expectation value $\langle T_{km} \rangle$ of the renormalized quantum energy-momentum induced by a given geometry in 2 spacetime dimensions (Birrell and Davies, 1982). That is, the expectation value $\langle T \rangle$ of the quantum trace in 2 spacetime dimensions is proportional to the Ricci scalar (Birrell and Davies, 1982, eq. (6.121)), and the quantum energy-momentum tensor $\langle T_{km} \rangle$ is covariantly conserved, therefore equations (11.97) are satisfied by $\langle T_{km} \rangle$. In 4 spacetime dimensions the quantum energy-momentum tensor $\langle T_{km} \rangle$ is extremely difficult to calculate in a general spacetime, so clues from 2 spacetime dimensions can be illuminating.


Exercise 11.9. Black holes in 2 spacetime dimensions? Does the analogue of a Schwarzschild black 
hole exist in 2 spacetime dimensions? 

Solution. No. Require the spacetime to be empty outside some radius. The vanishing of the Ricci scalar (11.97a) implies that

$$ \xi = \xi^+(v) + \xi^-(u) \qquad (11.98) $$

for some functions $\xi^+$ and $\xi^-$ of the null coordinates $v$ and $u$. But then the coordinate transformations of the null coordinates

$$ dV = e^{2\xi^+} dv , \quad dU = e^{2\xi^-} du \qquad (11.99) $$

bring the line-element to

$$ ds^2 = - dV \, dU , \qquad (11.100) $$

which is just flat (Minkowski) space in $N = 2$ dimensions.


Exercise 11.10. Tidal forces falling into a Schwarzschild black hole. In the Schwarzschild or Gullstrand-Painlevé orthonormal tetrad, or indeed in any orthonormal tetrad of the Schwarzschild geometry where $t$ and $r$ represent the time and radial directions and $\theta$ and $\phi$ represent the transverse (angular) directions, the non-zero components of the tetrad-frame Riemann tensor are

$$ \tfrac12 R_{trtr} = - R_{t\theta t\theta} = - R_{t\phi t\phi} = R_{r\theta r\theta} = R_{r\phi r\phi} = - \tfrac12 R_{\theta\phi\theta\phi} = C , \qquad (11.101) $$

where

$$ C = - M / r^3 \qquad (11.102) $$

is the Weyl scalar (the spin-0 component of the Weyl tensor).
1. Tidal forces. A person at rest in the tetrad has, by definition, tetrad-frame 4-velocity $u^m = \{1, 0, 0, 0\}$. From the equation of geodesic deviation, equation (11.103),

$$ \frac{D^2 \delta\xi_m}{D\tau^2} + R_{klmn} \, \delta\xi^k u^l u^n = 0 , \qquad (11.103) $$

deduce the tidal acceleration on the person in the radial and transverse directions. Does the tidal acceleration stretch or compress? [Hint: The equation of geodesic deviation, §3.3, gives the proper acceleration between two points a small distance $\delta\xi^m$ apart, where $\xi^m$ are the locally inertial coordinates of the tetrad frame. Notice that this problem is much easier to solve with tetrads than with the traditional coordinate approach. Note also that since the Weyl tensor takes the same form (11.101) independent of the radial boost, the tidal acceleration is the same regardless of the radial velocity of the infaller.]
2. Choose a black hole to fall into. What is the mass of the black hole for which the tidal acceleration $M/r^3$ is 1 gee per metre at the horizon? If you wanted to fall through the horizon of a black hole without first being torn apart, what mass of black hole would you choose? [Hint: 1 gee is the gravitational acceleration at the surface of the Earth.]


3. Time to die. In a previous problem you showed that the proper time to free-fall radially from radius $r$ to the singularity of a Schwarzschild black hole, for a faller who starts at zero velocity at infinity (so $E = 1$), is

$$ \tau = \frac{2}{3} \sqrt{\frac{r^3}{2M}} . \qquad (11.104) $$

How long, in seconds, does it take to fall to the singularity from the place where the tidal acceleration is 1 gee per metre? Comment?
4. Tear-apart radius. At what radius $r$, in km, do you start to get torn apart, if that happens when the tidal acceleration is 1 gee per metre? Express your answer in terms of the black hole mass $M$ in units of a solar mass $M_\odot$, that is, in the form $r = \, ? \, (M/M_\odot)^{?}$.
5. Spaghettified? In Exercise 7.6 you showed that the infall velocity of a person who free-falls radially from zero velocity at infinity (so $E = 1$) is

$$ \frac{dr}{d\tau} = - \sqrt{\frac{2M}{r}} . \qquad (11.105) $$

Show that the radial component ($\delta\xi^r$) of the equation of geodesic deviation (11.103) for such a person solves to

$$ \delta\xi^r = \frac{A}{r^{1/2}} + B r^2 , \qquad (11.106) $$

where $A$ and $B$ are constants. If a person tears apart when the tidal acceleration is 1 gee per metre, and the parts of the person free-fall thereafter, is the person actually spaghettified? [Hint: If the frame is in free-fall, then the covariant derivatives $D/D\tau$ in the equation of geodesic deviation may be replaced by ordinary derivatives $d/d\tau$ in that frame. The last part of the question (Is the person actually spaghettified?) is a concept question: given the solution (11.106), can you interpret what it means?]


Exercise 11.11. Totally antisymmetric tensor.
1. In an orthonormal tetrad $\gamma_m$ where $\gamma_0$ points to the future and $\gamma_1$, $\gamma_2$, $\gamma_3$ are right-handed, the contravariant totally antisymmetric tensor $\varepsilon^{klmn}$ is defined by (this is the opposite sign from the Misner, Thorne, and Wheeler (1973) notation)

$$ \varepsilon^{klmn} \equiv [klmn] , \qquad (11.107) $$

where $[klmn]$ is the totally antisymmetric symbol

$$ [klmn] \equiv \begin{cases} +1 & \text{if $klmn$ is an even permutation of $0123$} , \\ -1 & \text{if $klmn$ is an odd permutation of $0123$} , \\ 0 & \text{if $klmn$ are not all different} . \end{cases} \qquad (11.108) $$

The choice of $+$ sign in the definition (11.107) of $\varepsilon^{klmn}$ is determined by the definition (13.19) of the pseudoscalar $I_N$ of the geometric algebra in $N$ dimensions as a product of all $N$ basis vectors, equation (15.74). The corresponding covariant totally antisymmetric tensor $\varepsilon_{klmn}$ is

$$ \varepsilon_{klmn} = - [klmn] , \qquad (11.109) $$

in which the $-$ sign is the determinant of the tetrad (Minkowski) metric. Argue that in a general basis $e_\mu$, the contravariant totally antisymmetric tensor $\varepsilon^{\kappa\lambda\mu\nu}$ is

$$ \varepsilon^{\kappa\lambda\mu\nu} = e_k{}^\kappa e_l{}^\lambda e_m{}^\mu e_n{}^\nu \, \varepsilon^{klmn} = e^{-1} [\kappa\lambda\mu\nu] , \qquad (11.110) $$

while its covariant counterpart is

$$ \varepsilon_{\kappa\lambda\mu\nu} = - e \, [\kappa\lambda\mu\nu] , \qquad (11.111) $$

where $e \equiv | e^m{}_\mu |$ is the determinant of the vierbein.
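2. Show that in 4 dimensions

$$ \varepsilon^{klmn} \varepsilon_{\kappa\lambda\mu\nu} = - 4! \, e^{[k}{}_\kappa e^{l}{}_\lambda e^{m}{}_\mu e^{n]}{}_\nu . \qquad (11.112) $$

Conclude that

$$ e_k{}^\kappa \, \varepsilon^{klmn} \varepsilon_{\kappa\lambda\mu\nu} = - 6 \, e^{[l}{}_\lambda e^{m}{}_\mu e^{n]}{}_\nu , \qquad (11.113a) $$

$$ e_k{}^\kappa e_l{}^\lambda \, \varepsilon^{klmn} \varepsilon_{\kappa\lambda\mu\nu} = - 4 \, e^{[m}{}_\mu e^{n]}{}_\nu , \qquad (11.113b) $$

$$ e_k{}^\kappa e_l{}^\lambda e_m{}^\mu \, \varepsilon^{klmn} \varepsilon_{\kappa\lambda\mu\nu} = - 6 \, e^{n}{}_\nu , \qquad (11.113c) $$

$$ e_k{}^\kappa e_l{}^\lambda e_m{}^\mu e_n{}^\nu \, \varepsilon^{klmn} \varepsilon_{\kappa\lambda\mu\nu} = - 24 . \qquad (11.113d) $$

The coefficient of the $p$’th contraction is $-p!(4-p)!$.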
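
The contraction coefficients $-p!(4-p)!$ can be checked by brute force. The sketch below works in the orthonormal tetrad frame only, which is an assumption made here (the vierbein is taken to be the identity, so $e = 1$), and the helper names `eps_up`, `eps_down`, `contracted`, and `expected` are illustrative rather than anything defined in the text.

```python
from itertools import permutations, product
from math import factorial

ETA = (-1, 1, 1, 1)  # diagonal of the Minkowski tetrad metric, signature -+++


def perm_sign(p):
    """Sign of a permutation p of 0..len(p)-1, from its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def eps_up(idx):
    """eps^{klmn} = [klmn]: +1 or -1 on permutations of 0123, otherwise 0."""
    return perm_sign(idx) if sorted(idx) == [0, 1, 2, 3] else 0


def eps_down(idx):
    """eps_{klmn}: all four indices lowered with the diagonal metric."""
    sign = 1
    for i in idx:
        sign *= ETA[i]
    return sign * eps_up(idx)


def contracted(p, upper, lower):
    """Sum of eps^{K upper} eps_{K lower} over the p contracted indices K."""
    return sum(eps_up(K + upper) * eps_down(K + lower)
               for K in product(range(4), repeat=p))


def expected(p, upper, lower):
    """-p!(4-p)! times the antisymmetrized product of Kronecker deltas."""
    q = 4 - p
    # The 1/q! of the antisymmetrization cancels the (4-p)! in the coefficient.
    total = sum(perm_sign(s) * all(upper[i] == lower[s[i]] for i in range(q))
                for s in permutations(range(q)))
    return -factorial(p) * total


# Check every component of every contraction, p = 1..4.
for p in range(1, 5):
    q = 4 - p
    for upper in product(range(4), repeat=q):
        for lower in product(range(4), repeat=q):
            assert contracted(p, upper, lower) == expected(p, upper, lower)

print("full contraction:", contracted(4, (), ()))
```

The full contraction counts the 24 permutations of 0123, each contributing $[klmn] \times (-[klmn]) = -1$.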


Spin and Newman-Penrose tetrads


THIS CHAPTER NEEDS REWRITING. 

This Chapter discusses spin tetrads (§??) and Newman-Penrose tetrads (§12.2). The Chapter goes on 
to show how the fields that describe electromagnetic (§??) and gravitational (§12.3) waves have a natural 
and insightful complex structure that is brought out in a Newman-Penrose tetrad. The Newman-Penrose 
formalism provides a natural context for the Petrov classification of the Weyl tensor (§12.4). 


12.1 Spin tetrad formalism 


In quantum mechanics, fundamental particles have spin. The 3 generations of leptons (electrons, muons, tauons, and their respective neutrino partners) and quarks (up, charm, top, and their down, strange, and bottom partners) have spin $\tfrac12$ (in units $\hbar = 1$). The carrier particles of the electromagnetic force (photons), the weak force (the $W^\pm$ and $Z$ bosons), and the colour force (the 8 gluons), have spin 1. The carrier of the gravitational force, the graviton, is expected to have spin 2, though as of 2010 no gravitational wave, let alone its quantum, the graviton, has been detected.

General relativity is a classical, not quantum, theory. Nevertheless the spin properties of classical waves, 
such as electromagnetic or gravitational waves, are already apparent classically. 


12.1.1 Spin tetrad 


A systematic way to project objects into spin components is to work in a spin tetrad. As will become apparent below, equation (12.5), spin describes how an object transforms under rotation about some preferred axis. In the case of an electromagnetic or gravitational wave, the natural preferred axis is the direction of propagation of the wave. With respect to the direction of propagation, electromagnetic waves prove to have two possible spins, or helicities, $\pm 1$, while gravitational waves have two possible spins, or helicities, $\pm 2$. A preferred axis might also be set by an experimenter who chooses to measure spin along some particular direction. The following treatment takes the preferred direction to lie along the z-axis $\gamma_z$, but there is no loss of generality in making this choice.

Start with an orthonormal tetrad $\{\gamma_t, \gamma_x, \gamma_y, \gamma_z\}$. If the preferred tetrad axis is the z-axis $\gamma_z$, then the spin tetrad axes $\{\gamma_+, \gamma_-\}$ are defined to be complex combinations of the transverse axes $\{\gamma_x, \gamma_y\}$,

$$ \gamma_+ \equiv \tfrac{1}{\sqrt{2}} ( \gamma_x + i \gamma_y ) , \qquad (12.1a) $$

$$ \gamma_- \equiv \tfrac{1}{\sqrt{2}} ( \gamma_x - i \gamma_y ) . \qquad (12.1b) $$

The tetrad metric of the spin tetrad $\{\gamma_t, \gamma_z, \gamma_+, \gamma_-\}$ is

$$ \gamma_{mn} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} . \qquad (12.2) $$

Notice that the spin axes $\{\gamma_+, \gamma_-\}$ are themselves null, $\gamma_+ \cdot \gamma_+ = \gamma_- \cdot \gamma_- = 0$, whereas their scalar product with each other is non-zero, $\gamma_+ \cdot \gamma_- = 1$. The null character of the spin axes is what makes spin especially well-suited to describing fields, such as electromagnetism and gravity, that propagate at the speed of light. An even better trick in dealing with fields that propagate at the speed of light is to work in a Newman-Penrose tetrad, §12.2, in which all 4 tetrad axes are taken to be null.


12.1.2 Transformation of spin under rotation about the preferred axis 


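Under a right-handed rotation by angle $\chi$ about the preferred axis $\gamma_z$, the transverse axes $\gamma_x$ and $\gamma_y$ transform as

$$ \gamma_x \to \cos\chi \, \gamma_x + \sin\chi \, \gamma_y , \quad \gamma_y \to - \sin\chi \, \gamma_x + \cos\chi \, \gamma_y . \qquad (12.3) $$

It follows that the spin axes $\gamma_+$ and $\gamma_-$ transform under a right-handed rotation by angle $\chi$ about $\gamma_z$ as

$$ \gamma_\pm \to e^{\mp i \chi} \gamma_\pm . \qquad (12.4) $$

The transformation (12.4) identifies the spin axes $\gamma_+$ and $\gamma_-$ as having spin $+1$ and $-1$ respectively.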
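
The step from the rotation of the transverse axes to the phase picked up by the spin axes can be checked numerically. The sketch below assumes that the rotation acts as $\gamma_x \to \cos\chi \, \gamma_x + \sin\chi \, \gamma_y$, $\gamma_y \to -\sin\chi \, \gamma_x + \cos\chi \, \gamma_y$, and that the spin axes are $(\gamma_x \pm i \gamma_y)/\sqrt{2}$. The function name `rotate_xy` and the storage of a vector as its pair of $(\gamma_x, \gamma_y)$ coefficients are illustrative choices.

```python
import cmath
import math


def rotate_xy(vec, chi):
    """Rotate a*gamma_x + b*gamma_y, given as the pair (a, b), by angle chi.

    Each basis vector is replaced by its rotated image:
    gamma_x -> cos(chi) gamma_x + sin(chi) gamma_y,
    gamma_y -> -sin(chi) gamma_x + cos(chi) gamma_y.
    """
    a, b = vec
    c, s = math.cos(chi), math.sin(chi)
    return (a * c - b * s, a * s + b * c)


chi = 0.7  # arbitrary test angle in radians
r = 1 / math.sqrt(2)
gamma_plus = (r, 1j * r)    # (gamma_x + i gamma_y) / sqrt(2)
gamma_minus = (r, -1j * r)  # (gamma_x - i gamma_y) / sqrt(2)

for name, axis, phase in (("+", gamma_plus, cmath.exp(-1j * chi)),
                          ("-", gamma_minus, cmath.exp(+1j * chi))):
    rotated = rotate_xy(axis, chi)
    expected = tuple(phase * comp for comp in axis)
    # The rotated axis is the original axis times a pure phase.
    assert all(abs(u - v) < 1e-12 for u, v in zip(rotated, expected))
    print(f"gamma_{name} is multiplied by the phase {phase:.4f}")
```

The two axes pick up opposite phases, $e^{-i\chi}$ and $e^{+i\chi}$, which is what assigns them spin $+1$ and $-1$.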


12.1.3 Spin


More generally, an object can be defined as having spin $s$ if it varies by

$$ e^{-s i \chi} \qquad (12.5) $$

under a right-handed rotation by angle $\chi$ about the preferred axis $\gamma_z$. Thus an object of spin $s$ is unchanged by a rotation of $2\pi/s$ about the preferred axis. A spin-0 object is symmetric about the $\gamma_z$ axis, unchanged by a rotation of any angle about the axis. The $\gamma_z$ axis itself is spin-0, as is the time axis $\gamma_t$.

The components of a tensor in a spin tetrad inherit spin properties from that of the spin basis. The general rule is that the spin $s$ of any tensor component is equal to the number of $+$ covariant indices minus the number of $-$ covariant indices:

$$ \text{spin } s = \text{number of $+$ minus $-$ covariant indices} . \qquad (12.6) $$


12.1.4 Spin flip 


Under a reflection through the y-axis, the spin axes swap:

$$ \gamma_+ \leftrightarrow \gamma_- , \qquad (12.7) $$

which may also be accomplished by complex conjugation. Reflection through the y-axis, or equivalently complex conjugation, changes the sign of all spin indices of a tensor component

$$ T^{*}_{\cdots + \cdots - \cdots} = T_{\cdots - \cdots + \cdots} . \qquad (12.8) $$


In short, complex conjugation flips spin, a pretty feature of the spin formalism. 


12.1.5 Spin versus spherical harmonics 


In physical problems, such as in cosmological perturbations, or in perturbations of spherical black holes, or 
in the hydrogen atom, spin often appears in conjunction with an expansion in spherical harmonics. Spin 
should not be confused with spherical harmonics. 

Spin and spherical harmonics appear together whenever the problem at hand has a symmetry under the 3D special orthogonal group SO(3) of spatial rotations (special means of unit determinant; the full orthogonal group O(3) contains in addition the discrete transformation corresponding to reflection of one of the axes, which flips the sign of the determinant). Rotations in SO(3) are described by 3 Euler angles $\{\theta, \phi, \chi\}$. Spin is associated with the Euler angle $\chi$. The usual spherical harmonics $Y_{\ell m}(\theta, \phi)$ are the spin-0 eigenfunctions of SO(3). The eigenfunctions of the full SO(3) group are the spin harmonics SIGN?

$$ {}_s Y_{\ell m}(\theta, \phi, \chi) = \Theta_{\ell m s}(\theta) \, e^{i m \phi} e^{i s \chi} . \qquad (12.9) $$


12.1.6 Spin components of the Einstein tensor 


With respect to a spin tetrad, the components of the Einstein tensor $G_{mn}$ are

$$ G_{mn} = \begin{pmatrix} G_{tt} & G_{tz} & G_{t+} & G_{t-} \\ G_{tz} & G_{zz} & G_{z+} & G_{z-} \\ G_{t+} & G_{z+} & G_{++} & G_{+-} \\ G_{t-} & G_{z-} & G_{+-} & G_{--} \end{pmatrix} . \qquad (12.10) $$

From this it is apparent that the 10 components of the Einstein tensor decompose into 4 spin-0 components, 4 spin-$\pm 1$ components, and 2 spin-$\pm 2$ components:

$$ \begin{array}{rl} -2 : & G_{--} , \\ -1 : & G_{t-} , \ G_{z-} , \\ 0 : & G_{tt} , \ G_{tz} , \ G_{zz} , \ G_{+-} , \\ +1 : & G_{t+} , \ G_{z+} , \\ +2 : & G_{++} . \end{array} \qquad (12.11) $$

The 4 spin-0 components are all real; in particular $G_{+-}$ is real since $G_{+-}^* = G_{-+} = G_{+-}$. The 4 spin-$\pm 1$ and 2 spin-$\pm 2$ components comprise 3 complex components

$$ G_{t+}^* = G_{t-} , \quad G_{z+}^* = G_{z-} , \quad G_{++}^* = G_{--} . \qquad (12.12) $$

In some contexts, for example in cosmological perturbation theory, REALLY? the various spin components are commonly referred to as scalar (spin-0), vector (spin-$\pm 1$), and tensor (spin-$\pm 2$).
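
The spin decomposition of a symmetric rank-2 tensor follows mechanically from the rule that the spin of a component is the number of $+$ indices minus the number of $-$ indices. A minimal sketch, in which the index labels `t`, `z`, `+`, `-` and the printed names `G_..` are purely illustrative:

```python
from collections import defaultdict
from itertools import combinations_with_replacement

# Spin carried by each covariant index of the spin tetrad.
SPIN_OF_INDEX = {"t": 0, "z": 0, "+": +1, "-": -1}

# Group the 10 independent components of a symmetric tensor by total spin.
by_spin = defaultdict(list)
for k, m in combinations_with_replacement("tz+-", 2):
    by_spin[SPIN_OF_INDEX[k] + SPIN_OF_INDEX[m]].append(f"G_{k}{m}")

for s in sorted(by_spin):
    print(f"{s:+d}: {', '.join(by_spin[s])}")
```

The same loop with `repeat`-style index tuples of length 4, restricted by the symmetries of the Weyl tensor, would sort its components by spin in the same way.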


12.2 Newman-Penrose tetrad formalism 


The Newman-Penrose formalism (Newman and Penrose, 1962; Newman and Penrose, 2009) provides a particularly powerful way to deal with fields that propagate at the speed of light. The Newman-Penrose formalism adopts a tetrad in which the two axes $\gamma_v$ (outgoing) and $\gamma_u$ (ingoing) along the direction of propagation are chosen to be lightlike, while the two axes $\gamma_+$ and $\gamma_-$ transverse to the direction of propagation are chosen to be spin axes.

Sadly, the literature on the Newman-Penrose formalism is characterized by an arcane and random notation 
whose principal purpose seems to be to perpetuate exclusivity for an old-boys club of people who understand 
it. This is unfortunate given the intrinsic power of the formalism. Held (1974) comments that the Newman-Penrose formalism presents “a formidable notational barrier to the uninitiate.” For example, the tetrad connections $\Gamma_{kmn}$ are called “spin coefficients,” and assigned individual greek letters that obscure their transformation properties. Do not be fooled: all the standard tetrad formalism presented in Chapter 11 carries through unaltered. One ill-born child of the notation that persists in widespread use is $\psi_{2-s}$ for the spin $s$ component of the Weyl tensor, equations (12.30).

Gravitational waves are commonly characterized by the Newman-Penrose (NP) components of the Weyl 
tensor. The NP components of the Weyl tensor are sometimes referred to as the NP scalars. The designation 
as NP scalars is potentially misleading, because the NP components of the Weyl tensor form a tetrad-frame 
tensor, not a set of scalars (though of course the tetrad-frame Weyl tensor is, like any tetrad-frame quantity, a
coordinate scalar). The NP components do become proper quantities, and in that sense scalars, when referred 
to the frame of a particular observer, such as a gravitational wave telescope, observing along a particular 
direction. However, the use of the word scalar to describe the components of a tensor is unfortunate. 
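

12.2.1 Newman-Penrose tetrad


A Newman-Penrose tetrad $\{\gamma_v, \gamma_u, \gamma_+, \gamma_-\}$ is defined in terms of an orthonormal tetrad $\{\gamma_t, \gamma_x, \gamma_y, \gamma_z\}$ by

$$ \gamma_v \equiv \tfrac{1}{\sqrt{2}} ( \gamma_t + \gamma_z ) , \qquad (12.13a) $$

$$ \gamma_u \equiv \tfrac{1}{\sqrt{2}} ( \gamma_t - \gamma_z ) , \qquad (12.13b) $$

$$ \gamma_+ \equiv \tfrac{1}{\sqrt{2}} ( \gamma_x + i \gamma_y ) , \qquad (12.13c) $$

$$ \gamma_- \equiv \tfrac{1}{\sqrt{2}} ( \gamma_x - i \gamma_y ) , \qquad (12.13d) $$

or in matrix form

$$ \begin{pmatrix} \gamma_v \\ \gamma_u \\ \gamma_+ \\ \gamma_- \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & i & 0 \\ 0 & 1 & -i & 0 \end{pmatrix} \begin{pmatrix} \gamma_t \\ \gamma_x \\ \gamma_y \\ \gamma_z \end{pmatrix} . \qquad (12.14) $$

All four tetrad axes are null

$$ \gamma_v \cdot \gamma_v = \gamma_u \cdot \gamma_u = \gamma_+ \cdot \gamma_+ = \gamma_- \cdot \gamma_- = 0 . \qquad (12.15) $$

In a profound sense, the null, or lightlike, character of each of the four NP axes explains why the NP formalism is well adapted to treating fields that propagate at the speed of light. The tetrad metric of the Newman-Penrose tetrad $\{\gamma_v, \gamma_u, \gamma_+, \gamma_-\}$ is

$$ \gamma_{mn} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} . \qquad (12.16) $$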
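
The Newman-Penrose tetrad metric can be recomputed directly from the matrix form of the tetrad definition. The sketch below assumes the rows of that matrix are $\gamma_v, \gamma_u, \gamma_+, \gamma_-$ expressed in the orthonormal basis ordered $(t, x, y, z)$ with an overall factor $1/\sqrt{2}$, and it uses the bilinear scalar product with no complex conjugation. The names `NP` and `tetrad_metric` are illustrative.

```python
import math

# Orthonormal (Minkowski) metric, axes ordered (t, x, y, z), signature -+++.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

r = 1 / math.sqrt(2)
# Rows: gamma_v, gamma_u, gamma_+, gamma_- in terms of (t, x, y, z).
NP = [[r, 0, 0, r],
      [r, 0, 0, -r],
      [0, r, 1j * r, 0],
      [0, r, -1j * r, 0]]


def tetrad_metric(basis):
    """gamma_mn = gamma_m . gamma_n, using the bilinear (unconjugated) product."""
    return [[sum(basis[m][a] * ETA[a][b] * basis[n][b]
                 for a in range(4) for b in range(4))
             for n in range(4)]
            for m in range(4)]


metric = tetrad_metric(NP)
for row in metric:
    # Entries are integers up to rounding error; imaginary parts cancel.
    print([round(complex(x).real) for x in row])
```

The diagonal entries vanish because every axis is null, and the only non-zero products are $\gamma_v \cdot \gamma_u = -1$ and $\gamma_+ \cdot \gamma_- = 1$. Using a Hermitian product instead would spoil the nullness of $\gamma_\pm$, which is why the unconjugated product is used here.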


12.2.2 Boost weight 


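A boost by rapidity $\theta$ along the $\gamma_z$ axis multiplies the outgoing and ingoing axes $\gamma_v$ and $\gamma_u$ by a blueshift factor $e^{\theta}$ and its reciprocal,

$$ \gamma_v \to e^{\theta} \gamma_v , \quad \gamma_u \to e^{-\theta} \gamma_u . \qquad (12.17) $$

In terms of the velocity $v = \tanh\theta$, the blueshift factor is the special relativistic Doppler shift factor

$$ e^{\theta} = \left( \frac{1 + v}{1 - v} \right)^{1/2} . \qquad (12.18) $$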
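
The link between the rapidity and the Doppler factor, and the way the two null axes scale under a boost, can be checked numerically. The sign convention of the boost below ($\gamma_t \to \cosh\theta \, \gamma_t + \sinh\theta \, \gamma_z$, $\gamma_z \to \sinh\theta \, \gamma_t + \cosh\theta \, \gamma_z$) is an assumption, chosen so that the outgoing axis is blueshifted. The function name `boost` is illustrative.

```python
import math


def boost(coeffs, theta):
    """Boost a*gamma_t + b*gamma_z, given as the pair (a, b), by rapidity theta.

    Assumed convention: gamma_t -> cosh gamma_t + sinh gamma_z,
                        gamma_z -> sinh gamma_t + cosh gamma_z.
    """
    a, b = coeffs
    ch, sh = math.cosh(theta), math.sinh(theta)
    return (a * ch + b * sh, a * sh + b * ch)


theta = 0.9                              # arbitrary test rapidity
v = math.tanh(theta)                     # corresponding velocity
doppler = math.sqrt((1 + v) / (1 - v))   # special relativistic Doppler factor

r = 1 / math.sqrt(2)
gamma_v, gamma_u = (r, r), (r, -r)       # (gamma_t, gamma_z) coefficients

boosted_v, boosted_u = boost(gamma_v, theta), boost(gamma_u, theta)

assert math.isclose(doppler, math.exp(theta))
assert all(math.isclose(x, doppler * y) for x, y in zip(boosted_v, gamma_v))
assert all(math.isclose(x, y / doppler) for x, y in zip(boosted_u, gamma_u))
print(f"e^theta = {math.exp(theta):.6f}, Doppler factor = {doppler:.6f}")
```

With the opposite sign of $\sinh\theta$ in the boost, the roles of the two axes are exchanged: $\gamma_v$ is redshifted and $\gamma_u$ blueshifted.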

More generally, an object is said to have boost weight $n$ if it varies by

$$ e^{n\theta} \qquad (12.19) $$

under a boost by rapidity $\theta$ along the preferred direction $\gamma_z$. Thus $\gamma_v$ has boost weight $+1$, and $\gamma_u$ has boost weight $-1$. The spin axes $\gamma_\pm$ both have boost weight 0. The NP components of a tensor inherit their boost weight properties from those of the NP basis. The general rule is that the boost weight $n$ of any tensor component is equal to the number of $v$ covariant indices minus the number of $u$ covariant indices:

$$ \text{boost weight } n = \text{number of $v$ minus $u$ covariant indices} . \qquad (12.20) $$


12.2.3 Lorentz transformations 


Under a Lorentz transformation consisting of a combination of a Lorentz boost by rapidity € about t-x and 
a rotation by angle Ç about y-z, an orthonormal tetrad Ym = {¥t, Yz, Yy, Yz} transforms as FIX SIGNS 


cosh(€) —sinh(€) 0 0 vt 
A —sinh() — cosh(€) 0 0 Yr 
m = ; 12.21 
Ym > Ym 0 0 cos(¢) sin(Ç) Vy ( ) 
0 0 —sin(¢) cos(¢) Yz 
Under the same Lorentz transformation, the bivector axes Yim = {Yix,Yey, Ytz} transform as 
1 0 0 Ytx 
Yim > Vim = | 0 cos(¢+7€) — sin(¢ + 7) Yy |- (12.22) 
0 —sin(¢+i€) cos(¢ + i£) Yiz 
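

12.3 Weyl tensor


The Weyl tensor is the trace-free part of the Riemann tensor,

$$ C_{klmn} \equiv R_{klmn} - \tfrac12 \left( \gamma_{km} R_{ln} - \gamma_{kn} R_{lm} + \gamma_{ln} R_{km} - \gamma_{lm} R_{kn} \right) + \tfrac16 \left( \gamma_{km} \gamma_{ln} - \gamma_{kn} \gamma_{lm} \right) R . \qquad (12.23) $$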


By construction, the Weyl tensor vanishes when contracted on any pair of indices. Whereas the Ricci and 
Einstein tensors vanish identically in any region of spacetime containing no energy-momentum, Tmn = 0, 
the Weyl tensor can be non-vanishing. Physically, the Weyl tensor describes tidal forces and gravitational 
waves. 
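
That the Weyl tensor is trace-free by construction can be verified on a test tensor. The sketch below assumes 4 dimensions, an orthonormal frame with metric diag$(-1, 1, 1, 1)$, and the standard decomposition $C_{klmn} = R_{klmn} - \tfrac12 (\gamma_{km} R_{ln} - \gamma_{kn} R_{lm} + \gamma_{ln} R_{km} - \gamma_{lm} R_{kn}) + \tfrac16 (\gamma_{km}\gamma_{ln} - \gamma_{kn}\gamma_{lm}) R$. The test "Riemann" tensor is not a physical curvature: it is built from two arbitrary symmetric matrices `A` and `B` (values chosen here for illustration) in a way that has the algebraic symmetries of a torsion-free Riemann tensor.

```python
from fractions import Fraction

N = 4
IDX = range(N)
ETA = [-1, 1, 1, 1]  # diagonal orthonormal metric (equal to its own inverse)


def g(a, b):
    """Metric component gamma_ab."""
    return ETA[a] if a == b else 0


# Two arbitrary symmetric matrices (illustrative values only).
A = [[1, 2, 0, -1], [2, 0, 3, 1], [0, 3, -2, 4], [-1, 1, 4, 5]]
B = [[2, -1, 1, 0], [-1, 3, 0, 2], [1, 0, 1, -3], [0, 2, -3, 4]]


def riemann(k, l, m, n):
    """Test tensor with the algebraic symmetries of the Riemann tensor."""
    return (A[k][m] * B[l][n] - A[k][n] * B[l][m]
            + B[k][m] * A[l][n] - B[k][n] * A[l][m])


# Ricci tensor R_km = gamma^{ln} R_klmn and Ricci scalar R = gamma^{km} R_km.
ricci = [[sum(ETA[l] * riemann(k, l, m, l) for l in IDX) for m in IDX] for k in IDX]
scalar = sum(ETA[k] * ricci[k][k] for k in IDX)


def weyl(k, l, m, n):
    """Trace-free part of the test tensor, in exact rational arithmetic."""
    ricci_part = (g(k, m) * ricci[l][n] - g(k, n) * ricci[l][m]
                  + g(l, n) * ricci[k][m] - g(l, m) * ricci[k][n])
    metric_part = g(k, m) * g(l, n) - g(k, n) * g(l, m)
    return (riemann(k, l, m, n)
            - Fraction(1, 2) * ricci_part
            + Fraction(1, 6) * metric_part * scalar)


# Contract the Weyl tensor on its second and fourth indices.
weyl_trace = [[sum(ETA[l] * weyl(k, l, m, l) for l in IDX) for m in IDX] for k in IDX]

assert any(ricci[k][m] != 0 for k in IDX for m in IDX)       # input has non-zero Ricci
assert all(weyl_trace[k][m] == 0 for k in IDX for m in IDX)  # Weyl part is trace-free
print("Weyl contraction vanishes; Ricci scalar of the test tensor =", scalar)
```

Contractions on other index pairs vanish either by antisymmetry or because they reduce to this one through the pair symmetries.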


12.3.1 Complexified Weyl tensor 


The Weyl tensor is, like the Riemann tensor, a symmetric matrix of bivectors. Just as the electromagnetic bivector $F_{mn}$ has a natural complex structure, so also the Weyl tensor $C_{klmn}$ has a natural complex structure. The properties of the Weyl tensor emerge most plainly when that complex structure is made manifest.

In an orthonormal tetrad $\{\gamma_t, \gamma_x, \gamma_y, \gamma_z\}$, the Weyl tensor $C_{klmn}$ can be written as a $6 \times 6$ symmetric bivector matrix, organized as a $2 \times 2$ matrix of $3 \times 3$ blocks, with the structure

$$ C = \begin{pmatrix} C_{EE} & C_{EB} \\ C_{BE} & C_{BB} \end{pmatrix} = \begin{pmatrix} C_{txtx} & C_{txty} & C_{txtz} & C_{txzy} & C_{txxz} & C_{txyx} \\ C_{tytx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{tztx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{zytx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{xztx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{yxtx} & \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix} , \qquad (12.24) $$

where $E$ denotes electric indices, $B$ magnetic indices, per the designation (??). The condition of being symmetric implies that the $3 \times 3$ blocks $C_{EE}$ and $C_{BB}$ are symmetric, while $C_{EB} = C_{BE}^\top$. The cyclic symmetry (11.62b) of the Riemann, hence Weyl, tensor implies that the off-diagonal $3 \times 3$ block $C_{EB}$ (and likewise $C_{BE}$) is traceless.


The natural complex structure motivates defining a complexified Weyl tensor $\tilde{C}_{klmn}$ by

$$ \tilde{C}_{klmn} \equiv \frac{1}{4} \left( \delta_k^p \delta_l^q + \frac{i}{2} \varepsilon_{kl}{}^{pq} \right) \left( \delta_m^r \delta_n^s + \frac{i}{2} \varepsilon_{mn}{}^{rs} \right) C_{pqrs} \quad \text{a tetrad tensor} \qquad (12.25) $$

analogously to the definition (??) of the complexified electromagnetic field. The definition (12.25) of the complexified Weyl tensor $\tilde{C}_{klmn}$ is valid in any frame, not just an orthonormal frame. In an orthonormal frame, if the Weyl tensor $C_{klmn}$ is organized according to the structure (12.24), then the complexified Weyl tensor $\tilde{C}_{klmn}$ defined by equation (12.25) has the structure

$$ \tilde{C} = \frac{1}{4} \begin{pmatrix} 1 & -i \\ -i & -1 \end{pmatrix} \left( C_{EE} - C_{BB} + i \, C_{EB} + i \, C_{BE} \right) . \qquad (12.26) $$

Thus the independent components of the complexified Weyl tensor $\tilde{C}_{klmn}$ constitute a $3 \times 3$ complex symmetric traceless matrix $C_{EE} - C_{BB} + i ( C_{EB} + C_{BE} )$, with 5 complex degrees of freedom. Although the complexified Weyl tensor $\tilde{C}_{klmn}$ is defined, equation (12.25), as a projection of the Weyl tensor, it nevertheless retains all the 10 degrees of freedom of the original Weyl tensor $C_{klmn}$.

The same complexification projection operator applied to the trace (Ricci) parts of the Riemann tensor 
yields only the Ricci scalar multiplied by that unique combination of the tetrad metric that has the sym- 
metries of the Riemann tensor. Thus complexifying the trace parts of the Riemann tensor produces nothing 
useful. 


12.3.2 Newman-Penrose components of the Weyl tensor


With respect to a NP null tetrad $\{\gamma_v, \gamma_u, \gamma_+, \gamma_-\}$, equation (39.1), the Weyl tensor $C_{klmn}$ has 5 distinct complex components, here denoted $\psi_s$, of spins respectively $s = -2$, $-1$, $0$, $+1$, and $+2$:

$$ \begin{array}{rl} -2 : & \psi_{-2} \equiv C_{u-u-} , \\ -1 : & \psi_{-1} \equiv C_{uvu-} = C_{+-u-} , \\ 0 : & \psi_0 \equiv \tfrac12 \left( C_{uvuv} + C_{uv+-} \right) = \tfrac12 \left( C_{+-+-} + C_{uv+-} \right) = C_{v+-u} , \\ +1 : & \psi_1 \equiv C_{vuv+} = C_{-+v+} , \\ +2 : & \psi_2 \equiv C_{v+v+} . \end{array} \qquad (12.27) $$

The complex conjugates $\psi_s^*$ of the 5 NP components of the Weyl tensor are:

$$ \begin{array}{rl} \psi_{-2}^* = & C_{u+u+} , \\ \psi_{-1}^* = & C_{uvu+} = C_{-+u+} , \\ \psi_0^* = & \tfrac12 \left( C_{uvuv} + C_{uv-+} \right) = \tfrac12 \left( C_{-+-+} + C_{uv-+} \right) = C_{v-+u} , \\ \psi_1^* = & C_{vuv-} = C_{+-v-} , \\ \psi_2^* = & C_{v-v-} , \end{array} \qquad (12.28) $$

whose spins have the opposite sign, in accordance with the rule (12.8) that complex conjugation flips spin. The above expressions (12.27) and (12.28) account for all the NP components $C_{klmn}$ of the Weyl tensor but four, which vanish identically:

$$ C_{v+v-} = C_{u+u-} = C_{v+u+} = C_{v-u-} = 0 . \qquad (12.29) $$

The above convention that the index $s$ on the NP component $\psi_s$ labels its spin differs from the standard convention, where the spin $s$ component of the Weyl tensor is impenetrably denoted $\psi_{2-s}$ (e.g. Chandrasekhar (1983)):

$$ \begin{array}{rl} -2 : & \psi_4 , \\ -1 : & \psi_3 , \\ 0 : & \psi_2 , \\ +1 : & \psi_1 , \\ +2 : & \psi_0 . \end{array} \quad \text{(standard convention, not followed here)} \qquad (12.30) $$


12.3.3 Newman-Penrose components of the complexified Weyl tensor 


The non-vanishing NP components of the complexified Weyl tensor $\tilde{C}_{klmn}$ defined by equation (12.25) are

$$ \begin{array}{rl} -2 : & \tilde{C}_{u-u-} = \psi_{-2} , \\ -1 : & \tilde{C}_{uvu-} = \tilde{C}_{+-u-} = \psi_{-1} , \\ 0 : & \tilde{C}_{uvuv} = \tilde{C}_{+-+-} = \tilde{C}_{uv+-} = \tilde{C}_{v+-u} = \psi_0 , \\ +1 : & \tilde{C}_{vuv+} = \tilde{C}_{-+v+} = \psi_1 , \\ +2 : & \tilde{C}_{v+v+} = \psi_2 , \end{array} \qquad (12.31) $$

whereas any component with either of its two bivector indices equal to $v-$ or $u+$ vanishes. As with the complexified electromagnetic field, the rule that complex conjugation flips spin fails here because the complexification operator breaks the rule. Equations (12.31) show that the complexified Weyl tensor in an NP tetrad contains just 5 distinct non-vanishing complex components, and those components are precisely equal to the complex spin components $\psi_s$.

With respect to a triple of bivector indices ordered as $\{u-, uv, +v\}$, the NP components of the complexified Weyl tensor constitute the $3 \times 3$ complex symmetric matrix

$$ \tilde{C}_{klmn} = \begin{pmatrix} \psi_{-2} & \psi_{-1} & \psi_0 \\ \psi_{-1} & \psi_0 & \psi_1 \\ \psi_0 & \psi_1 & \psi_2 \end{pmatrix} . \qquad (12.32) $$


12.3.4 Components of the complexified Weyl tensor in an orthonormal tetrad 


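The complexified Weyl tensor forms a $3 \times 3$ complex symmetric traceless matrix in any frame, not just an NP frame. In an orthonormal frame, with respect to a triple of bivector indices $\{tx, ty, tz\}$, the complexified Weyl tensor $\tilde{C}_{klmn}$ can be expressed in terms of the NP spin components $\psi_s$ as

$$ \tilde{C}_{klmn} = \begin{pmatrix} \psi_0 & \tfrac12 ( \psi_1 - \psi_{-1} ) & - \tfrac{i}{2} ( \psi_1 + \psi_{-1} ) \\ \tfrac12 ( \psi_1 - \psi_{-1} ) & - \tfrac12 \psi_0 + \tfrac14 ( \psi_2 + \psi_{-2} ) & - \tfrac{i}{4} ( \psi_2 - \psi_{-2} ) \\ - \tfrac{i}{2} ( \psi_1 + \psi_{-1} ) & - \tfrac{i}{4} ( \psi_2 - \psi_{-2} ) & - \tfrac12 \psi_0 - \tfrac14 ( \psi_2 + \psi_{-2} ) \end{pmatrix} . \qquad (12.33) $$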


12.3.5 Propagating components of gravitational waves 


For outgoing gravitational waves, only the spin $-2$ component $\psi_{-2}$ (the one conventionally called $\psi_4$) propagates, carrying gravitational waves from a source to infinity:

$$ \psi_{-2} : \ \text{propagating, outgoing} . \qquad (12.34) $$

This propagating, outgoing component has spin $-2$, but its complex conjugate has spin $+2$, so effectively both spin components, or helicities, or circular polarizations, of an outgoing gravitational wave are embodied in the single complex component. The remaining 4 complex NP components (spins $-1$ to $2$) of an outgoing gravitational wave are short range, describing the gravitational field near the source.

Similarly, only the spin $+2$ component $\psi_2$ of an ingoing gravitational wave propagates, carrying energy from infinity:

$$ \psi_2 : \ \text{propagating, ingoing} . \qquad (12.35) $$


12.4 Petrov classification of the Weyl tensor 


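As seen above, the complexified Weyl tensor is a complex symmetric traceless $3 \times 3$ matrix. If the matrix were real symmetric (or complex Hermitian), then standard mathematical theorems would guarantee that it would be diagonalizable, with a complete set of eigenvalues and eigenvectors. But the Weyl matrix is complex symmetric, and there is no such theorem.

Table 12.1: Petrov classification of the Weyl tensor

Petrov   Distinct      Distinct       Normal form
type     eigenvalues   eigenvectors   of the complexified Weyl tensor
I        3             3              $\begin{pmatrix} \psi_0 & 0 & 0 \\ 0 & -\tfrac12\psi_0 + \tfrac12\psi_2 & 0 \\ 0 & 0 & -\tfrac12\psi_0 - \tfrac12\psi_2 \end{pmatrix}$
D        2             3              $\begin{pmatrix} \psi_0 & 0 & 0 \\ 0 & -\tfrac12\psi_0 & 0 \\ 0 & 0 & -\tfrac12\psi_0 \end{pmatrix}$
II       2             2              $\begin{pmatrix} \psi_0 & 0 & 0 \\ 0 & -\tfrac12\psi_0 + \tfrac14\psi_2 & -\tfrac{i}{4}\psi_2 \\ 0 & -\tfrac{i}{4}\psi_2 & -\tfrac12\psi_0 - \tfrac14\psi_2 \end{pmatrix}$
O        1             3              $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$
N        1             2              $\begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac14\psi_2 & -\tfrac{i}{4}\psi_2 \\ 0 & -\tfrac{i}{4}\psi_2 & -\tfrac14\psi_2 \end{pmatrix}$
III      1             1              $\begin{pmatrix} 0 & \tfrac12\psi_1 & -\tfrac{i}{2}\psi_1 \\ \tfrac12\psi_1 & 0 & 0 \\ -\tfrac{i}{2}\psi_1 & 0 & 0 \end{pmatrix}$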

The mathematical theorems state that a matrix is diagonalizable if and only if it has a complete set of 
linearly independent eigenvectors. Since there is always at least one distinct linearly independent eigenvector 
associated with each distinct eigenvalue, if all eigenvalues are distinct, then necessarily there is a complete 
set of eigenvectors, and the Weyl tensor is diagonalizable. However, if some of the eigenvalues coincide, then 
there may not be a complete set of linearly independent eigenvectors, in which case the Weyl tensor is not 
diagonalizable. 
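
A small numerical example shows how a complex symmetric matrix can fail to be diagonalizable. The two matrices below are assumed forms of the Type N and Type III normal forms (a single surviving spin component, here called `psi` and set to 1 for illustration). Both are symmetric and traceless, and both are nilpotent. A nilpotent matrix has 0 as its only eigenvalue, so if it were diagonalizable it would have to be the zero matrix.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(X, tol=1e-12):
    """True if every entry of X vanishes to within tol."""
    return all(abs(x) < tol for row in X for x in row)


psi = 1  # illustrative non-zero value of the single surviving spin component

# Assumed Type N form: symmetric, traceless, squares to zero.
type_N = [[0, 0, 0],
          [0, psi / 4, -1j * psi / 4],
          [0, -1j * psi / 4, -psi / 4]]

# Assumed Type III form: symmetric, traceless, cubes to zero.
type_III = [[0, psi / 2, -1j * psi / 2],
            [psi / 2, 0, 0],
            [-1j * psi / 2, 0, 0]]

N2 = matmul(type_N, type_N)
III2 = matmul(type_III, type_III)
III3 = matmul(III2, type_III)

print("Type N:   M != 0:", not is_zero(type_N), " M^2 == 0:", is_zero(N2))
print("Type III: M^2 != 0:", not is_zero(III2), " M^3 == 0:", is_zero(III3))
```

The order of nilpotency tracks the number of independent eigenvectors: a matrix with $M^2 = 0$ and rank 1 has a 2-dimensional null space (two eigenvectors), while one with $M^2 \neq 0$, $M^3 = 0$ is a single $3 \times 3$ Jordan block (one eigenvector).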

The Petrov classification, tabulated in Table 12.1, classifies the Weyl tensor in accordance with the number of distinct eigenvalues and eigenvectors. The normal form is with respect to an orthonormal frame aligned with the eigenvectors to the extent possible. The tetrad with respect to which the complexified Weyl tensor takes its normal form is called the Weyl principal tetrad. The Weyl principal tetrad is unique except in cases D, O, and N. For Types D and N, the Weyl principal tetrad is unique up to Lorentz transformations that leave the eigen-bivector $\gamma_{tx}$ unchanged, which is to say, transformations generated by the Lorentz rotor $\exp(\zeta \gamma_{tx})$ where $\zeta$ is complex.

The Kerr-Newman geometry is Type D. General spherically symmetric geometries are Type D. The 
Friedmann-Lemaitre-Robertson-Walker geometry is Type O. Plane gravitational waves are Type N. 


13 


The geometric algebra 


The geometric algebra is a conceptually appealing and mathematically powerful formalism. If you want to 
understand rotations, Lorentz transformations, spin-$\tfrac12$ particles, and supersymmetry, and you want to do
actual calculations elegantly and (relatively) easily, then the geometric algebra is the thing to learn. 

The extension of the geometric algebra to Minkowski space is called the spacetime algebra, which is
the subject of Chapter 14. The natural extensions of the geometric and spacetime algebras to spinors are 
called the super geometric algebra and the super spacetime algebra, covered in Chapters 38 and 39. All 
these algebras may be referred to collectively as geometric algebras. I am generally unenthusiastic about 
mathematical formalism for its own sake. The geometric algebras are a mathematical language that Nature 
appears to speak. 

The geometric algebra builds on a broad mathematical heritage beginning with the work of Grassmann 
(1862; 1877) and Clifford (1878). The exposition in this book owes much to the conceptual rethinking of the 
subject by David Hestenes (Hestenes, 1966; Hestenes and Sobczyk, 1987). 

This Chapter starts by setting up the geometric algebra in $N$-dimensional Euclidean space $\mathbb{R}^N$, then specializes to the cases of 2 and 3 dimensions. The generalization to 4-dimensional Minkowski space, where the geometric algebra is called the spacetime algebra, is deferred to Chapter 14. The 4-dimensional spacetime algebra proves to be identical to the Clifford algebra of the Dirac $\gamma$-matrices, which explains the adoption of the symbol $\gamma_m$ to denote the basis vectors of a tetrad. Although the formalism is presented initially in Euclidean or Minkowski space, everything generalizes immediately to general relativity, where the basis vectors $\gamma_m$ form the basis of an orthonormal tetrad at each point of spacetime.

This book follows the standard physics convention that a rotor $R$ rotates a multivector $\boldsymbol{a}$ as $\boldsymbol{a} \to R \boldsymbol{a} \bar{R}$ and a spinor $\psi$ as $\psi \to R \psi$. This, along with the standard definition (13.19) for the pseudoscalar, has the consequence that a right-handed rotation corresponds to $R = e^{-i\theta/2}$ with $\theta$ increasing, and that rotations accumulate to the left, that is, a rotation $R$ followed by a rotation $S$ is the product $SR$. The physics convention is opposite to that adopted in OpenGL and by the computer graphics industry, where a right-handed rotation corresponds to $R = e^{i\theta/2}$, and rotations accumulate to the right, that is, $R$ followed by $S$ is $RS$.

In this book, a multivector is written in boldface. A rotor is written in normal (not bold) face as a reminder that, even though a rotor is an even member of the geometric algebra, it can also be regarded as a spin-$\tfrac12$ object with a transformation law (13.75) different from that (13.56) of multivectors. Early-alphabet latin indices $a, b, ...$ run over spatial indices $1, 2, ...$ only, while mid-alphabet latin indices $m, n, ...$ run over both time and space indices $0, 1, 2, ...$.

Figure 13.1 Multivectors of grade 1, 2, and 3: a vector $\boldsymbol{a}$ (left), a bivector $\boldsymbol{a} \wedge \boldsymbol{b}$ (middle), and a trivector $\boldsymbol{a} \wedge \boldsymbol{b} \wedge \boldsymbol{c}$ (right).


13.1 Products of vectors 


In 3-dimensional Euclidean space $\mathbb{R}^3$, there are two familiar ways of taking the product of two vectors, the 
scalar product and the vector product. 

1. The scalar product $\mathbf{a} \cdot \mathbf{b}$, also known as the dot product or inner product, of two vectors $\mathbf{a}$ and $\mathbf{b}$ is 
a scalar of magnitude $|\mathbf{a}|\,|\mathbf{b}| \cos\theta$, where $|\mathbf{a}|$ and $|\mathbf{b}|$ are the lengths of the two vectors, and $\theta$ the angle 
between them. The scalar product is commutative, $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$. 

2. The vector product, $\mathbf{a} \times \mathbf{b}$, also known as the cross product, is a vector of magnitude $|\mathbf{a}|\,|\mathbf{b}| \sin\theta$, 
directed perpendicular to both $\mathbf{a}$ and $\mathbf{b}$, such that $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{a} \times \mathbf{b}$ form a right-handed set. The vector 
product is anticommutative, $\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a}$. 

The definition of the scalar product continues to work fine in a Euclidean space of any dimension, but 
the definition of the vector product works only in three dimensions, because in two dimensions there is no 
vector perpendicular to two vectors, and in four or more dimensions there are many vectors perpendicular 
to two vectors. It is therefore useful to define a more general version, the outer product (Grassmann, 1862) 
that works in Euclidean space $\mathbb{R}^N$ of any dimension. 

3. The outer product $\mathbf{a} \wedge \mathbf{b}$, also known as the wedge product or exterior product, of two vectors $\mathbf{a}$ and 
$\mathbf{b}$ is a bivector, a multivector of dimension 2, or grade 2. The bivector $\mathbf{a} \wedge \mathbf{b}$ is the directed 2-dimensional 
area, of magnitude $|\mathbf{a}|\,|\mathbf{b}| \sin\theta$, of the parallelogram formed by the vectors $\mathbf{a}$ and $\mathbf{b}$, as illustrated 
in Figure 13.1. The bivector has an orientation, or handedness, defined by circulating the parallelogram 
first along $\mathbf{a}$, then along $\mathbf{b}$. The outer product is anticommutative, $\mathbf{a} \wedge \mathbf{b} = -\mathbf{b} \wedge \mathbf{a}$, like its forebear the 
vector product. 

The outer product can be repeated, so that $(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c}$ is a trivector, a directed volume, a multivector 
of grade 3. The magnitude of the trivector is the volume of the parallelepiped defined by the vectors $\mathbf{a}$, $\mathbf{b}$, 
and $\mathbf{c}$, illustrated in Figure 13.1. The outer product is by construction associative, 
$(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = \mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c})$. 
Associativity, together with anticommutativity of bivectors, implies that the trivector $\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c}$ is totally 
antisymmetric under permutations of the three vectors, that is, it is unchanged under even permutations, and 
changes sign under odd permutations. The ordering of an outer product thus defines one of two handednesses. 

It is a familiar concept that a vector a can be regarded as a geometric object, a directed length, independent 
of the coordinates used to describe it. The components of a vector change when the reference frame changes, 
but the vector itself remains the same physical thing. In the same way, a bivector $\mathbf{a} \wedge \mathbf{b}$ is a directed area, 
and a trivector $\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c}$ is a directed volume, both geometric objects with a physical meaning independent 
of the coordinate system. 

In two dimensions the triple outer product of any three vectors is zero, $\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c} = 0$, because the volume 
of a parallelepiped confined to a plane is zero. More generally, in N-dimensional space $\mathbb{R}^N$, the outer product 
of N + 1 vectors is zero 

$$ \mathbf{a}_1 \wedge \mathbf{a}_2 \wedge \cdots \wedge \mathbf{a}_{N+1} = 0 \quad (N \text{ dimensions}) . \tag{13.1} $$
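
As a numerical illustration of the outer product of two vectors, here is a minimal Python sketch. It is not part of the original exposition: the helper names `dot`, `wedge`, and `magnitude`, and the sample vectors, are assumptions chosen for illustration. The bivector $\mathbf{a} \wedge \mathbf{b}$ is represented by its antisymmetric components $a^i b^j - a^j b^i$, which makes sense in any dimension, and its magnitude is compared with $|\mathbf{a}|\,|\mathbf{b}| \sin\theta$.

```python
import math

def dot(a, b):
    """Scalar product of two vectors given as lists of components."""
    return sum(x * y for x, y in zip(a, b))

def wedge(a, b):
    """Components B[i][j] = a_i b_j - a_j b_i of the bivector a ^ b."""
    n = len(a)
    return [[a[i] * b[j] - a[j] * b[i] for j in range(n)] for i in range(n)]

def magnitude(B):
    """Area |a ^ b|: root of the sum of squared independent components (i < j)."""
    n = len(B)
    return math.sqrt(sum(B[i][j] ** 2 for i in range(n) for j in range(i + 1, n)))

# Sample vectors in 4 dimensions, where no vector product is available.
a = [1.0, 2.0, 0.5, -1.0]
b = [0.0, 1.0, 3.0, 2.0]
B = wedge(a, b)

# |a||b| sin(theta), with theta obtained from the scalar product.
norm_ab = math.sqrt(dot(a, a) * dot(b, b))
theta = math.acos(dot(a, b) / norm_ab)
area = norm_ab * math.sin(theta)
print(magnitude(B), area)  # the two numbers should agree to rounding error
```

In 3 dimensions the three independent components B[1][2], B[2][0], B[0][1] coincide with the components of the vector product, which is the duality made precise in §13.4.2.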


13.2 Geometric product 


The inner and outer products offer two different ways of multiplying vectors. However, by itself neither 
product conforms to the usual desideratum of multiplication, that the product of two elements of a set be 
an element of the set. Taking the inner product of a vector with another vector lowers the dimension by one, 
while taking the outer product raises the dimension by one. 

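Grassmann (1877) and Clifford (1878) resolved the problem by defining a multivector as any linear 
combination of scalars, vectors, bivectors, and objects of higher grade. Let $\boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \ldots, \boldsymbol{\gamma}_N$ form an orthonormal 
basis for N-dimensional Euclidean space $\mathbb{R}^N$. A multivector in N = 2 dimensions is then a linear combination 
of 

$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{\boldsymbol{\gamma}_1 , \ \boldsymbol{\gamma}_2}_{\text{2 vectors}} , \quad \underbrace{\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2}_{\text{1 bivector}} , \tag{13.2} $$

forming a linear space of dimension $1 + 2 + 1 = 4 = 2^2$. Similarly, a multivector in N = 3 dimensions is a 
linear combination of 

$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{\boldsymbol{\gamma}_1 , \ \boldsymbol{\gamma}_2 , \ \boldsymbol{\gamma}_3}_{\text{3 vectors}} , \quad \underbrace{\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2 , \ \boldsymbol{\gamma}_2 \wedge \boldsymbol{\gamma}_3 , \ \boldsymbol{\gamma}_3 \wedge \boldsymbol{\gamma}_1}_{\text{3 bivectors}} , \quad \underbrace{\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2 \wedge \boldsymbol{\gamma}_3}_{\text{1 trivector}} , \tag{13.3} $$

forming a linear space of dimension $1 + 3 + 3 + 1 = 8 = 2^3$. In general, multivectors in N dimensions form a 
linear space of dimension $2^N$, with $N!/[n!(N-n)!]$ distinct basis elements of grade n. 

A multivector $\mathbf{a}$ in N-dimensional Euclidean space $\mathbb{R}^N$ can thus be written as a linear combination of 
basis elements 

$$ \mathbf{a} = \sum_{\text{distinct } \{a,b,\ldots,d\} \subseteq \{1,2,\ldots,N\}} a^{ab \ldots d} \, \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b \wedge \cdots \wedge \boldsymbol{\gamma}_d , \tag{13.4} $$

the sum being over all $2^N$ distinct subsets of {1, 2, ..., N}. The index on each component $a^{ab \ldots d}$ is a totally 
antisymmetric quantity, reflecting the total antisymmetry of $\boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b \wedge \cdots \wedge \boldsymbol{\gamma}_d$. 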

The point of introducing multivectors is to allow multiplication to be defined so that the product of two 
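multivectors is a multivector. The key trick is to define the geometric product $\mathbf{a}\mathbf{b}$ of two vectors $\mathbf{a}$ and $\mathbf{b}$ 
to be the sum of their inner and outer products: 

$$ \boxed{ \mathbf{a}\mathbf{b} = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b} } . \tag{13.5} $$

That is a seriously big trick, and if you buy a ticket to it, you are in for a seriously big ride. As a particular 
example of (13.5), the geometric product of any element $\boldsymbol{\gamma}_a$ of the orthonormal basis with itself is a scalar, 
and with any other element of the basis is a bivector: 

$$ \boldsymbol{\gamma}_a \boldsymbol{\gamma}_b = \begin{cases} 1 & (a = b) \\ \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b & (a \neq b) . \end{cases} \tag{13.6} $$

Conversely, the rules (13.6), plus distributivity, imply the multiplication rule (13.5). A generalization of the 
rule (13.6) completes the definition of the geometric product: 

$$ \boldsymbol{\gamma}_a \boldsymbol{\gamma}_b \cdots \boldsymbol{\gamma}_d = \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b \wedge \cdots \wedge \boldsymbol{\gamma}_d \quad (a, b, \ldots, d \text{ all distinct}) . \tag{13.7} $$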


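The rules (13.6) and (13.7), along with the usual requirements of associativity and distributivity, combined 
with commutativity of scalars and anticommutativity of pairs of $\boldsymbol{\gamma}_a$, uniquely define multiplication over the 
space of multivectors. For example, the product of the bivector $\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2$ with the vector $\boldsymbol{\gamma}_1$ is 

$$ (\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2) \, \boldsymbol{\gamma}_1 = \boldsymbol{\gamma}_1 \boldsymbol{\gamma}_2 \boldsymbol{\gamma}_1 = - \boldsymbol{\gamma}_2 \boldsymbol{\gamma}_1 \boldsymbol{\gamma}_1 = - \boldsymbol{\gamma}_2 . \tag{13.8} $$

Sometimes it is convenient to denote the outer product (13.7) of distinct basis elements by the abbreviated 
symbol $\boldsymbol{\gamma}_A$ or $\boldsymbol{\gamma}_{ab \ldots d}$, 

$$ \boldsymbol{\gamma}_A \equiv \boldsymbol{\gamma}_{ab \ldots d} \equiv \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b \wedge \cdots \wedge \boldsymbol{\gamma}_d \quad (a, b, \ldots, d \text{ all distinct}) . \tag{13.9} $$

By construction, $\boldsymbol{\gamma}_A$ with $A = ab \ldots d$ is antisymmetric in its indices a, b, ..., d. The product of two general 
multivectors $\mathbf{a} = a^A \boldsymbol{\gamma}_A$ and $\mathbf{b} = b^A \boldsymbol{\gamma}_A$ is 

$$ \mathbf{a}\mathbf{b} = a^A b^B \, \boldsymbol{\gamma}_A \boldsymbol{\gamma}_B , \tag{13.10} $$

with paired indices A and B implicitly summed over distinct subsets of {1, ..., N}. By construction, the 
geometric algebra is associative, 

$$ (\mathbf{a}\mathbf{b})\mathbf{c} = \mathbf{a}(\mathbf{b}\mathbf{c}) . \tag{13.11} $$

Does the geometric algebra form a group under multiplication? No. One of the defining properties of a 
group is that every element should have an inverse. But, for example, 

$$ (1 + \boldsymbol{\gamma}_1)(1 - \boldsymbol{\gamma}_1) = 0 \tag{13.12} $$

shows that neither $1 + \boldsymbol{\gamma}_1$ nor $1 - \boldsymbol{\gamma}_1$ has an inverse. 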
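
The multiplication rules (13.6) and (13.7) are simple enough to mechanize. The following Python sketch is one possible encoding, not the book's: it assumes that each basis element $\boldsymbol{\gamma}_A$ is stored as a bitmask (bit i set meaning that $\boldsymbol{\gamma}_{i+1}$ is present) and each multivector as a dictionary from bitmask to coefficient. The function names `reorder_sign`, `gp`, and `add` are hypothetical. The sign of a product of basis elements is found by counting the swaps of anticommuting basis vectors needed to sort the product.

```python
def reorder_sign(a: int, b: int) -> int:
    """Sign from sorting the basis vectors of blade a followed by those of blade b.

    Each swap of two distinct basis vectors contributes a factor -1.
    """
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(x: dict, y: dict) -> dict:
    """Geometric product of multivectors stored as {blade bitmask: coefficient}.

    Repeated basis vectors square to 1 (Euclidean metric), so the blade of the
    product is the XOR of the two bitmasks.
    """
    out = {}
    for ea, ca in x.items():
        for eb, cb in y.items():
            e = ea ^ eb
            out[e] = out.get(e, 0) + reorder_sign(ea, eb) * ca * cb
    return {e: c for e, c in out.items() if c != 0}

def add(x: dict, y: dict) -> dict:
    """Sum of two multivectors, dropping components that cancel."""
    out = dict(x)
    for e, c in y.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

one = {0b00: 1}
g1, g2 = {0b01: 1}, {0b10: 1}
g12 = gp(g1, g2)                                # gamma_1 gamma_2 = gamma_1 ^ gamma_2

print(gp(g1, g1))                               # {0: 1}, equation (13.6) with a = b
print(add(gp(g1, g2), gp(g2, g1)))              # {}, distinct basis vectors anticommute
print(gp(g12, g1))                              # {2: -1}, equation (13.8)
print(gp(add(one, g1), add(one, {0b01: -1})))   # {}, equation (13.12)
```

The empty dictionary stands for the zero multivector, so the last line reproduces the zero divisors of equation (13.12).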


13.3 Reverse 


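The reverse of any basis element is defined to be the reversed product 

$$ \overline{\boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b \wedge \cdots \wedge \boldsymbol{\gamma}_d} \equiv \boldsymbol{\gamma}_d \wedge \cdots \wedge \boldsymbol{\gamma}_b \wedge \boldsymbol{\gamma}_a . \tag{13.13} $$

The product of a basis multivector $\boldsymbol{\gamma}_A$ and its reverse is 1, 

$$ \boldsymbol{\gamma}_A \overline{\boldsymbol{\gamma}}_A = \overline{\boldsymbol{\gamma}}_A \boldsymbol{\gamma}_A = 1 . \tag{13.14} $$

The reverse $\overline{\mathbf{a}}$ of any multivector $\mathbf{a}$ is the multivector obtained by reversing each of its components. 
Reversion leaves unchanged all multivectors whose grade is 0 or 1, modulo 4, and changes the sign of all 
multivectors whose grade is 2 or 3, modulo 4. Thus the reverse of a multivector $\mathbf{a}$ of pure grade p is 

$$ \overline{\mathbf{a}} = (-)^{[p/2]} \mathbf{a} , \tag{13.15} $$

where [p/2] signifies the largest integer less than or equal to p/2. For example, scalars and vectors are 
unchanged by reversion, but bivectors and trivectors change sign. Reversion satisfies 

$$ \overline{\mathbf{a} + \mathbf{b}} = \overline{\mathbf{a}} + \overline{\mathbf{b}} , \tag{13.16} $$
$$ \overline{\mathbf{a}\mathbf{b}} = \overline{\mathbf{b}} \, \overline{\mathbf{a}} . \tag{13.17} $$

Among other things, it follows that the reverse of any product of multivectors is the reversed product, as 
you would hope: 

$$ \overline{\mathbf{a}\mathbf{b} \cdots \mathbf{c}} = \overline{\mathbf{c}} \cdots \overline{\mathbf{b}} \, \overline{\mathbf{a}} . \tag{13.18} $$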
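
The sign $(-)^{[p/2]}$ in equation (13.15) can be understood by counting: reversing a product of p mutually anticommuting basis vectors takes $p(p-1)/2$ swaps of adjacent factors, each contributing a factor $-1$. The short Python sketch below (the function names are assumptions, introduced only for illustration) checks that the two ways of writing the sign agree.

```python
def reverse_sign(p: int) -> int:
    """Sign (-)^[p/2] acquired by a grade-p multivector under reversion."""
    return -1 if (p // 2) % 2 else 1

def reverse_sign_by_swaps(p: int) -> int:
    """Same sign, from the p(p-1)/2 adjacent swaps that reverse p anticommuting factors."""
    return -1 if (p * (p - 1) // 2) % 2 else 1

print([reverse_sign(p) for p in range(8)])  # [1, 1, -1, -1, 1, 1, -1, -1]
```

The pattern repeats with period 4 in the grade, which is the "modulo 4" statement in the text.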


13.4 The pseudoscalar and the Hodge dual 


Orthogonal to any n-dimensional subspace of N-dimensional space is an (N—n)-dimensional space, called 
the Hodge dual space. For example, the Hodge dual of a bivector in 2 dimensions is a 0-dimensional object, 
a pseudoscalar. Similarly, the Hodge dual of a bivector in 3 dimensions is a 1-dimensional object, a 
pseudovector. 


13.4.1 Pseudoscalar 


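Define the pseudoscalar $I_N$ in N dimensions to be 

$$ I_N \equiv \boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2 \wedge \cdots \wedge \boldsymbol{\gamma}_N \tag{13.19} $$

with reverse 

$$ \overline{I}_N = (-)^{[N/2]} I_N , \tag{13.20} $$

where [N/2] signifies the largest integer less than or equal to N/2. The square of the pseudoscalar is 

$$ I_N^2 = (-)^{[N/2]} = \begin{cases} 1 & \text{if } N = (0 \text{ or } 1) \text{ modulo } 4 \\ -1 & \text{if } N = (2 \text{ or } 3) \text{ modulo } 4 . \end{cases} \tag{13.21} $$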
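
The step from (13.20) to (13.21) is short, and can be spelled out as follows. This is a sketch of the reasoning, using only equation (13.14) applied to the basis element $\boldsymbol{\gamma}_A = I_N$ together with equation (13.20):

```latex
\begin{align*}
  I_N \overline{I}_N &= 1
    && \text{by (13.14) with } \boldsymbol{\gamma}_A = I_N , \\
  I_N^2 &= (-)^{[N/2]} \, I_N \overline{I}_N
    && \text{by (13.20), since } I_N = (-)^{[N/2]} \, \overline{I}_N , \\
  &= (-)^{[N/2]} .
\end{align*}
```

For example, in N = 2 and N = 3 dimensions $[N/2] = 1$, so $I_2^2 = I_3^2 = -1$.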


The pseudoscalar anticommutes (commutes) with vectors $\mathbf{a}$, that is, with multivectors of grade 1, if N is 
even (odd): 

$$ I_N \mathbf{a} = - \mathbf{a} I_N \quad \text{if } N \text{ is even} , \qquad I_N \mathbf{a} = \mathbf{a} I_N \quad \text{if } N \text{ is odd} . \tag{13.22} $$

This implies that the pseudoscalar $I_N$ commutes with all even grade elements of the geometric algebra, and 
that it anticommutes (commutes) with all odd elements of the algebra if N is even (odd). Concisely, if $\mathbf{a}$ has 
grade p, then 

$$ I_N \mathbf{a} = (-)^{p(N-1)} \mathbf{a} I_N . \tag{13.23} $$


Exercise 13.1. Schur’s lemma. Prove that the only multivectors that commute with all elements of the 
algebra are linear combinations of the scalar 1 and, if N is odd, the pseudoscalar $I_N$. 

Solution. Suppose that $\mathbf{a}$ is a multivector that commutes with all elements of the algebra. Then in particular 
$\mathbf{a}$ commutes with every basis element $\boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b \wedge \cdots \wedge \boldsymbol{\gamma}_d$. Since multiplication by a basis element permutes 
the basis elements amongst each other (and multiplies each by $\pm 1$), it follows that $\mathbf{a}$ commutes with a 
basis element only if each of the components of $\mathbf{a}$ commutes separately with that basis element. Thus each 
component of $\mathbf{a}$ must commute separately with all basis elements of the algebra. Amongst the basis elements 
of the algebra, only the scalar 1, and, if the dimension N is odd, the pseudoscalar $I_N$, equation (13.22), 
commute with all other basis elements. Thus $\mathbf{a}$ must be some linear combination of 1 and, if N is odd, the 
pseudoscalar $I_N$. 


13.4.2 Hodge dual 


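The Hodge dual ${}^{*}\mathbf{a}$ of a multivector $\mathbf{a}$ in N dimensions is defined by pre-multiplication by the pseudoscalar 
$I_N$, 

$$ {}^{*}\mathbf{a} \equiv I_N \mathbf{a} . \tag{13.24} $$

In 3 dimensions, the Hodge duals of the basis vectors $\boldsymbol{\gamma}_a$ are the bivectors 

$$ I_3 \boldsymbol{\gamma}_1 = \boldsymbol{\gamma}_2 \wedge \boldsymbol{\gamma}_3 , \quad I_3 \boldsymbol{\gamma}_2 = \boldsymbol{\gamma}_3 \wedge \boldsymbol{\gamma}_1 , \quad I_3 \boldsymbol{\gamma}_3 = \boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2 . \tag{13.25} $$

Thus in 3 dimensions the bivector $\mathbf{a} \wedge \mathbf{b}$ is seen to be the pseudovector Hodge dual to the familiar vector 
product $\mathbf{a} \times \mathbf{b}$: 

$$ \mathbf{a} \wedge \mathbf{b} = I_3 \, \mathbf{a} \times \mathbf{b} . \tag{13.26} $$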
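
Equation (13.26) can be checked numerically. The Python sketch below is an illustration under stated assumptions, not the book's own construction: basis elements are encoded as bitmasks and multivectors as dictionaries from bitmask to coefficient, and the helper names (`reorder_sign`, `gp`, `grade`, `vector`, `cross`) are hypothetical. It computes the grade-2 part of the geometric product of two sample vectors and compares it with $I_3$ times their vector product.

```python
def reorder_sign(a: int, b: int) -> int:
    """Sign from sorting the basis vectors of blade a followed by those of blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(x: dict, y: dict) -> dict:
    """Geometric product in Euclidean space; multivectors are {bitmask: coefficient}."""
    out = {}
    for ea, ca in x.items():
        for eb, cb in y.items():
            e = ea ^ eb
            out[e] = out.get(e, 0) + reorder_sign(ea, eb) * ca * cb
    return {e: c for e, c in out.items() if c != 0}

def grade(x: dict, p: int) -> dict:
    """Grade-p component of a multivector (the grade is the number of set bits)."""
    return {e: c for e, c in x.items() if bin(e).count("1") == p}

def vector(v) -> dict:
    """Multivector of grade 1 from a list of components."""
    return {1 << i: c for i, c in enumerate(v) if c != 0}

def cross(a, b):
    """Familiar vector product in 3 dimensions."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

I3 = {0b111: 1}                       # pseudoscalar gamma_1 ^ gamma_2 ^ gamma_3
a, b = [1, 2, 3], [4, 5, 6]           # sample vectors

wedge_ab = grade(gp(vector(a), vector(b)), 2)
dual_of_cross = gp(I3, vector(cross(a, b)))
print(wedge_ab == dual_of_cross)      # True
```

Integer components are used so that the comparison is exact; with floating-point components a tolerance would be needed.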


13.5 General products of multivectors 


13.5.1 Pure grade components of products of multivectors 


It is useful to be able to project out a particular grade component of a multivector. The grade p component 
of a multivector $\mathbf{a}$ is denoted 

$$ \langle \mathbf{a} \rangle_p , \tag{13.27} $$

so that for example $\langle \mathbf{a} \rangle_0$, $\langle \mathbf{a} \rangle_1$, and $\langle \mathbf{a} \rangle_2$ represent respectively the scalar, vector, and bivector components 
of $\mathbf{a}$. By construction, a multivector is the sum of its pure grade components, 
$\mathbf{a} = \langle \mathbf{a} \rangle_0 + \langle \mathbf{a} \rangle_1 + \cdots + \langle \mathbf{a} \rangle_N$. 

The geometric product of a multivector $\mathbf{a}$ of pure grade p with a multivector $\mathbf{b}$ of pure grade q is in general 
a sum of multivectors of grades $|p-q|$ to $\min(p+q, N)$. The product $\mathbf{a}\mathbf{b}$ is in general neither commutative 
nor anticommutative, but the pure grade components of the product commute or anticommute according to 

$$ \langle \mathbf{a}\mathbf{b} \rangle_{p+q-2n} = (-)^{pq-n} \langle \mathbf{b}\mathbf{a} \rangle_{p+q-2n} \tag{13.28} $$

for $n = [(p+q-N)/2]$ to $\min(p, q)$. Written out in components, the grade $p+q-2n$ component of the geometric 
product of $\mathbf{a} = a^A \boldsymbol{\gamma}_A$ and $\mathbf{b} = b^A \boldsymbol{\gamma}_A$ is 

$$ \langle \mathbf{a}\mathbf{b} \rangle_{p+q-2n} = (-)^{[n/2]} \, a^{AC} b_C{}^{B} \, \boldsymbol{\gamma}_A \wedge \boldsymbol{\gamma}_B , \tag{13.29} $$

implicitly summed over distinct sequences A, B, and C of respectively $p-n$, $q-n$, and n indices. The factor 
$(-)^{[n/2]}$ comes from the square of a grade-n orthonormal multivector, $\boldsymbol{\gamma}_C \boldsymbol{\gamma}_C = (-)^{[n/2]}$. Only components 
with the $p+q-n$ indices of ABC all distinct contribute. 

Equation (13.29) can also be written 

$$ \langle \mathbf{a}\mathbf{b} \rangle_{p+q-2n} = (-)^{[n/2]} \, \frac{(p+q-2n)!}{(p-n)! \, (q-n)!} \, a^{[AC} b_C{}^{B]} \, \boldsymbol{\gamma}_{AB} , \tag{13.30} $$

implicitly summed over distinct sequences AB and C of respectively $p+q-2n$ and n indices. The binomial 
factor is the number of ways of picking the $p-n$ distinct indices of A and the $q-n$ distinct indices of B 
from each distinct antisymmetric sequence AB of $p+q-2n$ indices. 


13.5.2 Wedge product 


The wedge product of multivectors of arbitrary grade is defined, consistent with the convention of differential 
forms, §15.8, to be the highest possible grade component of the geometric product. The wedge product of a 
multivector $\mathbf{a}$ of grade p with a multivector $\mathbf{b}$ of grade q is thus defined to be 

$$ \boxed{ \mathbf{a} \wedge \mathbf{b} \equiv \langle \mathbf{a}\mathbf{b} \rangle_{p+q} } . \tag{13.31} $$

The definition (13.31) is consistent with the definition of the wedge product of vectors (multivectors of grade 
1) in §13.1. The wedge product is commutative or anticommutative as pq is even or odd, 

$$ \mathbf{a} \wedge \mathbf{b} = (-)^{pq} \, \mathbf{b} \wedge \mathbf{a} , \tag{13.32} $$

which is a special case of equation (13.28). The wedge product is associative, 

$$ (\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = \mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c}) . \tag{13.33} $$

In accordance with the definition (13.31), the wedge product of a scalar a (a multivector of grade 0) with a 
multivector $\mathbf{b}$ equals the usual product of the scalar and the multivector, 

$$ a \wedge \mathbf{b} = a \mathbf{b} \quad \text{if } a \text{ is a scalar} , \tag{13.34} $$

again consistent with the convention of differential forms. 


13.5.3 Dot product 


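The dot product of multivectors of arbitrary grade is defined to be the lowest grade component of their 
geometric product, 

$$ \boxed{ \mathbf{a} \cdot \mathbf{b} \equiv \langle \mathbf{a}\mathbf{b} \rangle_{|p-q|} } , \tag{13.35} $$

except that the dot product of a scalar, a zero grade multivector, with any multivector is conveniently defined 
to be zero, 

$$ a \cdot \mathbf{b} = 0 \quad \text{if } a \text{ is a scalar} . \tag{13.36} $$

The convention (13.36) is adopted to ensure that, if $\mathbf{b}$ is a vector, then $\mathbf{a}\mathbf{b} = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}$ for any multivector 
$\mathbf{a}$, including the case where $\mathbf{a}$ is a scalar. The dot product is symmetric or antisymmetric, 

$$ \mathbf{a} \cdot \mathbf{b} = (-)^{(p-1)q} \, \mathbf{b} \cdot \mathbf{a} \quad \text{for } p \geq q . \tag{13.37} $$

The dot product is not associative. 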


13.5.4 Scalar product 


The dot product of two multivectors of the same grade is a scalar, and in this case the dot product can be 
called the scalar product. The scalar product of two multivectors of the same grade p is 

$$ \mathbf{a} \cdot \mathbf{b} = a^A \boldsymbol{\gamma}_A \cdot b^B \boldsymbol{\gamma}_B = (-)^{[p/2]} \, a^A b_A , \tag{13.38} $$

implicitly summed over distinct sequences A of p indices. Equation (13.38) is a special case of equation (13.29). 


13.5.5 Triple products of multivectors 


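The associativity of the geometric product implies that the grade 0 component of a triple product of 
multivectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ of grades respectively p, q, r satisfies an associative law 

$$ \langle \mathbf{a}\mathbf{b}\mathbf{c} \rangle_0 = \langle \langle \mathbf{a}\mathbf{b} \rangle_r \, \mathbf{c} \rangle_0 = \langle \mathbf{a} \, \langle \mathbf{b}\mathbf{c} \rangle_p \rangle_0 . \tag{13.39} $$

More generally, the grade s component of a triple product of multivectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ of non-zero grades 
respectively p, q, r (any grade 0 multivector, i.e. scalar, can be taken outside the product) satisfies 

$$ \langle \mathbf{a}\mathbf{b}\mathbf{c} \rangle_s = \sum_{n=|r-s|}^{r+s} \langle \langle \mathbf{a}\mathbf{b} \rangle_n \, \mathbf{c} \rangle_s = \sum_{n=|p-s|}^{p+s} \langle \mathbf{a} \, \langle \mathbf{b}\mathbf{c} \rangle_n \rangle_s . \tag{13.40} $$

Often some terms vanish, simplifying the relation. As an example of the triple-product relation (13.40), if $\mathbf{a}$ 
and $\mathbf{b}$ are multivectors of grades p and q respectively, and neither is a scalar, and their wedge product does 
not vanish (that is, $p + q \leq N$), then the wedge and dot products of $\mathbf{a}$ and $\mathbf{b}$ are related by Hodge duality 
relations 

$$ I_N (\mathbf{a} \wedge \mathbf{b}) = (I_N \mathbf{a}) \cdot \mathbf{b} , \quad (\mathbf{a} \wedge \mathbf{b}) I_N = \mathbf{a} \cdot (\mathbf{b} I_N) , \tag{13.41} $$

where $I_N$ is the pseudoscalar (13.19). 


Figure 13.2 Reflection of a vector $\mathbf{a}$ through axis $\mathbf{n}$. 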


13.6 Reflection 


Multiplying a vector (a multivector of grade 1) by a vector shifts the grade (dimension) of the vector by 
$\pm 1$. Thus, if one wants to transform a vector into another vector (with the same grade, one), at least two 
multiplications by a vector are required. 

The simplest non-trivial transformation of a vector $\mathbf{a}$ is 

$$ \mathbf{n} : \ \mathbf{a} \to \mathbf{n} \mathbf{a} \mathbf{n} , \tag{13.42} $$

in which the vector $\mathbf{a}$ is multiplied on both left and right with a unit vector $\mathbf{n}$. If $\mathbf{a}$ is resolved into components 
$\mathbf{a}_\parallel$ and $\mathbf{a}_\perp$ respectively parallel and perpendicular to $\mathbf{n}$, then the transformation (13.42) is 

$$ \mathbf{n} : \ \mathbf{a}_\parallel + \mathbf{a}_\perp \to \mathbf{a}_\parallel - \mathbf{a}_\perp , \tag{13.43} $$

which represents a reflection of the vector $\mathbf{a}$ through the axis $\mathbf{n}$, a reversal of all components of the 
vector perpendicular to $\mathbf{n}$, as illustrated by Figure 13.2. Note that $-\mathbf{n}\mathbf{a}\mathbf{n}$ is the reflection of $\mathbf{a}$ through the 
hypersurface normal to $\mathbf{n}$, a reversal of the component of the vector parallel to $\mathbf{n}$. 

The operation of left- and right-multiplying by a unit vector $\mathbf{n}$ reflects not only vectors, but multivectors 
$\mathbf{a}$ in general: 

$$ \mathbf{n} : \ \mathbf{a} \to \mathbf{n} \mathbf{a} \mathbf{n} . \tag{13.44} $$

For example, the product $\mathbf{a}\mathbf{b}$ of two vectors transforms as 

$$ \mathbf{n} : \ \mathbf{a}\mathbf{b} \to \mathbf{n} (\mathbf{a}\mathbf{b}) \mathbf{n} = (\mathbf{n}\mathbf{a}\mathbf{n}) (\mathbf{n}\mathbf{b}\mathbf{n}) , \tag{13.45} $$

which works because $\mathbf{n}^2 = 1$. 

A reflection leaves any scalar $\lambda$ unchanged, $\mathbf{n} : \lambda \to \mathbf{n} \lambda \mathbf{n} = \lambda \mathbf{n}^2 = \lambda$. Geometrically, a reflection preserves 
the lengths of, and angles between, all vectors. 
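
The step from (13.42) to (13.43) can be filled in with a short calculation. The following is a sketch of the reasoning, using only the definition (13.5) of the geometric product, the commutativity of the scalar product, the anticommutativity of the outer product, and $\mathbf{n}^2 = 1$:

```latex
\begin{align*}
  \mathbf{a}\mathbf{n} + \mathbf{n}\mathbf{a}
    &= 2 \, \mathbf{n} \cdot \mathbf{a}
    && \text{(the outer products cancel)} , \\
  \mathbf{n}\mathbf{a}\mathbf{n}
    &= \mathbf{n} \, ( 2 \, \mathbf{n} \cdot \mathbf{a} - \mathbf{n}\mathbf{a} )
     = 2 \, (\mathbf{n} \cdot \mathbf{a}) \, \mathbf{n} - \mathbf{n}^2 \mathbf{a}
     = 2 \, (\mathbf{n} \cdot \mathbf{a}) \, \mathbf{n} - \mathbf{a} , \\
  \mathbf{n}\mathbf{a}\mathbf{n}
    &= 2 \, \mathbf{a}_\parallel - ( \mathbf{a}_\parallel + \mathbf{a}_\perp )
     = \mathbf{a}_\parallel - \mathbf{a}_\perp
    && \text{with } \mathbf{a}_\parallel = (\mathbf{n} \cdot \mathbf{a}) \, \mathbf{n} .
\end{align*}
```

The middle line is the familiar formula for reflecting a vector through an axis, here obtained without reference to components.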


13.7 Rotation 


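Two successive reflections yield a rotation. Consider reflecting a vector $\mathbf{a}$ (a multivector of grade 1) first 
through the unit vector $\mathbf{m}$, then through the unit vector $\mathbf{n}$: 

$$ \mathbf{n}\mathbf{m} : \ \mathbf{a} \to \mathbf{n}\mathbf{m} \, \mathbf{a} \, \mathbf{m}\mathbf{n} . \tag{13.46} $$

Any component $\mathbf{a}_\perp$ of $\mathbf{a}$ simultaneously orthogonal to both $\mathbf{m}$ and $\mathbf{n}$ (i.e. $\mathbf{m} \cdot \mathbf{a}_\perp = \mathbf{n} \cdot \mathbf{a}_\perp = 0$) is unchanged 
by the transformation (13.46), since each reflection flips the sign of $\mathbf{a}_\perp$: 

$$ \mathbf{n}\mathbf{m} : \ \mathbf{a}_\perp \to \mathbf{n}\mathbf{m} \, \mathbf{a}_\perp \, \mathbf{m}\mathbf{n} = - \mathbf{n} \, \mathbf{a}_\perp \, \mathbf{n} = \mathbf{a}_\perp . \tag{13.47} $$

Rotations inherit from reflections the property of preserving the lengths of, and angles between, all vectors. 
Thus the transformation (13.46) must represent a rotation of those components $\mathbf{a}_\parallel$ of $\mathbf{a}$ lying in the 
2-dimensional plane spanned by $\mathbf{m}$ and $\mathbf{n}$, as illustrated by Figure 13.3. To determine the angle by which the 
plane is rotated, it suffices to consider the case where the vector $\mathbf{a}_\parallel$ is equal to $\mathbf{m}$ (or $\mathbf{n}$, as a check). It is 
not too hard to figure out that, if the angle from $\mathbf{m}$ to $\mathbf{n}$ is $\theta/2$, then the rotation angle is $\theta$ in the same 
sense, from $\mathbf{m}$ to $\mathbf{n}$. 

For example, if $\mathbf{m}$ and $\mathbf{n}$ are parallel, so that $\mathbf{m} = \pm \mathbf{n}$, then the angle between $\mathbf{m}$ and $\mathbf{n}$ is $\theta/2 = 0$ or $\pi$, 
and the transformation (13.46) rotates the vector $\mathbf{a}_\parallel$ by $\theta = 0$ or $2\pi$, that is, it leaves $\mathbf{a}_\parallel$ unchanged. This 
makes sense: two reflections through the same plane leave everything unchanged. If on the other hand $\mathbf{m}$ 
and $\mathbf{n}$ are orthogonal, then the angle between them is $\theta/2 = \pi/2$, and the transformation (13.46) rotates 
$\mathbf{a}_\parallel$ by $\theta = \pm\pi$, that is, it maps $\mathbf{a}_\parallel$ to $-\mathbf{a}_\parallel$. 


Figure 13.3 Two successive reflections of a vector $\mathbf{a}$, first through $\mathbf{m}$, then through $\mathbf{n}$, yield a rotation of a vector $\mathbf{a}$ 
by the bivector $\mathbf{m}\mathbf{n}$. Baffled? Hey, draw your own picture. 


The rotation (13.46) can be abbreviated 

$$ R : \ \mathbf{a} \to R \mathbf{a} \overline{R} , \tag{13.48} $$

where $R = \mathbf{n}\mathbf{m}$ is called a rotor, and $\overline{R} = \mathbf{m}\mathbf{n}$ is its reverse. Rotors are unimodular, satisfying 
$R \overline{R} = \overline{R} R = 1$. According to the discussion above, the transformation (13.48) corresponds to a rotation by angle 
$\theta$ in the $\mathbf{m}$-$\mathbf{n}$ plane if the angle from $\mathbf{m}$ to $\mathbf{n}$ is $\theta/2$. Then $\mathbf{m} \cdot \mathbf{n} = \cos\theta/2$ and 
$\mathbf{m} \wedge \mathbf{n} = (\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2) \sin\theta/2$, 
where $\boldsymbol{\gamma}_1$ and $\boldsymbol{\gamma}_2$ are two orthonormal vectors spanning the $\mathbf{m}$-$\mathbf{n}$ plane, oriented so that the angle from $\boldsymbol{\gamma}_1$ 
to $\boldsymbol{\gamma}_2$ is positive $\pi/2$ (i.e. $\boldsymbol{\gamma}_1$ is the x-axis and $\boldsymbol{\gamma}_2$ the y-axis). Note that the outer product $\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2$ is invariant 
under rotations in the $\mathbf{m}$-$\mathbf{n}$ plane, hence independent of the choice of orthonormal basis vectors $\boldsymbol{\gamma}_1$ and $\boldsymbol{\gamma}_2$. 
It follows that the rotor $R = \mathbf{n}\mathbf{m} = \mathbf{n} \cdot \mathbf{m} + \mathbf{n} \wedge \mathbf{m}$ corresponding to a right-handed rotation by $\theta$ in the 
$\boldsymbol{\gamma}_1$-$\boldsymbol{\gamma}_2$ plane is given by 

$$ R = \cos\frac{\theta}{2} - (\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2) \sin\frac{\theta}{2} . \tag{13.49} $$

The rotor (13.49) can also be written as an exponential of the bivector $\boldsymbol{\theta} \equiv \theta \, \boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2$, 

$$ R = e^{-\boldsymbol{\theta}/2} . \tag{13.50} $$

It is straightforward to check that the action of the rotor (13.49) on the basis vectors $\boldsymbol{\gamma}_a$ is 

$$ R : \ \boldsymbol{\gamma}_1 \to R \boldsymbol{\gamma}_1 \overline{R} = \boldsymbol{\gamma}_1 \cos\theta + \boldsymbol{\gamma}_2 \sin\theta , \tag{13.51a} $$
$$ R : \ \boldsymbol{\gamma}_2 \to R \boldsymbol{\gamma}_2 \overline{R} = \boldsymbol{\gamma}_2 \cos\theta - \boldsymbol{\gamma}_1 \sin\theta , \tag{13.51b} $$
$$ R : \ \boldsymbol{\gamma}_a \to R \boldsymbol{\gamma}_a \overline{R} = \boldsymbol{\gamma}_a \quad (a \neq 1, 2) , \tag{13.51c} $$

which corresponds to a right-handed rotation of the basis vectors $\boldsymbol{\gamma}_a$ by angle $\theta$ in the $\boldsymbol{\gamma}_1$-$\boldsymbol{\gamma}_2$ plane. The 
inverse rotation is 

$$ \overline{R} : \ \mathbf{a} \to \overline{R} \mathbf{a} R \tag{13.52} $$

with 

$$ \overline{R} = \cos\frac{\theta}{2} + (\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2) \sin\frac{\theta}{2} . \tag{13.53} $$

A rotation of the form (13.49), a rotation in a single plane, is called a simple rotation. 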
In the geometric algebra, a rotation is considered to rotate the axes $\boldsymbol{\gamma}_a \to \boldsymbol{\gamma}'_a$ while leaving the components 
$a^a$ of a multivector unchanged. Thus a rotation transforms a vector $\mathbf{a}$ as 

$$ R : \ \mathbf{a} = a^a \boldsymbol{\gamma}_a \to \mathbf{a}' = a^a \boldsymbol{\gamma}'_a . \tag{13.54} $$

Figure 13.4 illustrates a right-handed rotation by angle $\theta$ of a vector $\mathbf{a}$ in the $\boldsymbol{\gamma}_1$-$\boldsymbol{\gamma}_2$ plane. 


Figure 13.4 Right-handed rotation of a vector $\mathbf{a}$ by angle $\theta$ in the $\boldsymbol{\gamma}_1$-$\boldsymbol{\gamma}_2$ plane. A rotation in the geometric algebra 
is an active rotation, which rotates the axes $\boldsymbol{\gamma}_a \to \boldsymbol{\gamma}'_a$ while leaving the components $a^a$ of a multivector unchanged, 
equation (13.54). In other words, multivectors $\mathbf{a}$ are considered to be attached to the frame, and a rotation bodily 
rotates the frame and everything attached to it. 

A rotation first by R and then by S transforms a vector $\mathbf{a}$ as 

$$ SR : \ \mathbf{a} \to S R \, \mathbf{a} \, \overline{R} \, \overline{S} = SR \, \mathbf{a} \, \overline{SR} . \tag{13.55} $$

Thus the composition of two rotations, first R and then S, is given by their geometric product SR. This is the 
physics convention, where rotations accumulate to the left (in contrast to the computer graphics convention, 
where rotations accumulate to the right). In three dimensions or less, all rotations are simple, but in four 
dimensions or higher, compositions of simple rotations can yield rotations that are not simple. For example, 
a rotation in the $\boldsymbol{\gamma}_1$-$\boldsymbol{\gamma}_2$ plane followed by a rotation in the $\boldsymbol{\gamma}_3$-$\boldsymbol{\gamma}_4$ plane is not equivalent to any simple 
rotation. However, it will be seen in §14.3 that bivectors in the 4D spacetime algebra have a natural complex 
structure, which allows 4D spacetime rotations to take a simple form similar to (13.49), but with complex 
angle $\theta$ and two orthogonal planes of rotation combined into a complex pair of planes. 

A rotor R rotates not only vectors, but multivectors $\mathbf{a}$ in general: 

$$ \boxed{ R : \ \mathbf{a} \to R \mathbf{a} \overline{R} } . \tag{13.56} $$

For example, the product $\mathbf{a}\mathbf{b}$ of two vectors transforms as 

$$ R : \ \mathbf{a}\mathbf{b} \to R (\mathbf{a}\mathbf{b}) \overline{R} = (R \mathbf{a} \overline{R}) (R \mathbf{b} \overline{R}) , \tag{13.57} $$

which works because $\overline{R} R = 1$. 

To summarize, the characterization of rotations by rotors has considerable advantages. Firstly, the 
transformation (13.56) applies to multivectors $\mathbf{a}$ of arbitrary grade in arbitrarily many dimensions. Secondly, 
the composition law is particularly simple, the composition of two rotations being given by their geometric 
product. A third advantage is that rotors rotate not only vectors and multivectors, but also spin-$\tfrac{1}{2}$ objects 
(indeed rotors are themselves spin-$\tfrac{1}{2}$ objects), as might be suspected from the intriguing factor of $\tfrac{1}{2}$ in 
front of the angle $\theta$ in equation (13.49). 
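
The action (13.51) of a rotor on the basis vectors, and the composition law (13.55), can be checked numerically. The Python sketch below is an illustration under stated assumptions rather than the book's own machinery: basis elements are encoded as bitmasks, multivectors as dictionaries from bitmask to coefficient, and all function names (`reorder_sign`, `gp`, `reverse`, `rotor`, `rotate`, `close`) are hypothetical.

```python
import math

def reorder_sign(a: int, b: int) -> int:
    """Sign from sorting the basis vectors of blade a followed by those of blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(x: dict, y: dict) -> dict:
    """Geometric product in Euclidean space; multivectors are {bitmask: coefficient}."""
    out = {}
    for ea, ca in x.items():
        for eb, cb in y.items():
            e = ea ^ eb
            out[e] = out.get(e, 0.0) + reorder_sign(ea, eb) * ca * cb
    return out

def reverse(x: dict) -> dict:
    """Reverse: each grade-p component acquires the sign (-)^[p/2], equation (13.15)."""
    return {e: (-c if (bin(e).count("1") // 2) % 2 else c) for e, c in x.items()}

def rotor(theta: float, plane: int = 0b011) -> dict:
    """Rotor cos(theta/2) - (plane bivector) sin(theta/2), equation (13.49)."""
    return {0: math.cos(theta / 2), plane: -math.sin(theta / 2)}

def rotate(R: dict, a: dict) -> dict:
    """Rotation a -> R a Rbar, equation (13.56)."""
    return gp(gp(R, a), reverse(R))

def close(x: dict, y: dict, tol: float = 1e-12) -> bool:
    """Compare two multivectors component by component, up to rounding error."""
    return all(abs(x.get(e, 0.0) - y.get(e, 0.0)) < tol for e in set(x) | set(y))

g1, g2, g3 = {0b001: 1.0}, {0b010: 1.0}, {0b100: 1.0}
R = rotor(math.pi / 2)                 # quarter turn in the gamma_1-gamma_2 plane
S = rotor(math.pi / 3, plane=0b110)    # a second rotor, in the gamma_2-gamma_3 plane

print(close(rotate(R, g1), g2))        # gamma_1 -> gamma_2, equation (13.51a)
print(close(rotate(gp(S, R), g1), rotate(S, rotate(R, g1))))  # first R, then S, is SR
```

Both comparisons are made up to a tolerance because the trigonometric functions introduce rounding error.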


Concept question 13.2. How fast do bivectors rotate? Rotors rotate half as fast as vectors. How fast 
do bivectors rotate? 
1. Bivectors don’t rotate. 
2. Half as fast as vectors. 
3. The same as vectors. 
4. Twice as fast as vectors. 
5. None of the above. 


13.8 Rotor group 


The rotor group is the group generated by the bivectors of the geometric algebra. The rotor group in N 
dimensions is also called Spin(N), and is the covering group of the special orthogonal group SO(N) of proper 
rotations in N dimensions (the S in SO(N) signifies special, that is, matrices of unit determinant, which 
removes improper rotations with determinant —1 that occur when a spatial axis is reflected). 

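The rotor, or rotation, group is an example of a continuous group called a Lie group. A right-handed 
rotation $\exp(-\tfrac{1}{2} \theta \, \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b)$ by finite angle $\theta$ in the $\boldsymbol{\gamma}_a$-$\boldsymbol{\gamma}_b$ plane can be thought of as being built up from an 
infinite number of infinitesimal rotations $\exp(-\tfrac{1}{2} \delta\theta \, \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b)$ by angles $\delta\theta$. To linear order, an infinitesimal 
rotation by angle $\delta\theta$ in the $\boldsymbol{\gamma}_a$-$\boldsymbol{\gamma}_b$ plane is 

$$ \exp(-\tfrac{1}{2} \delta\theta \, \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b) = 1 - \tfrac{1}{2} \delta\theta \, \boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b . \tag{13.58} $$

The bivector $-\boldsymbol{\gamma}_a \wedge \boldsymbol{\gamma}_b$ is said to be the generator of a right-handed rotation in the $\boldsymbol{\gamma}_a$-$\boldsymbol{\gamma}_b$ plane. 

The Baker-Campbell-Hausdorff formula states that the product of exponentials of not-necessarily-commuting 
elements $\boldsymbol{\theta}$ and $\boldsymbol{\phi}$ is 

$$ \exp(\boldsymbol{\theta}) \exp(\boldsymbol{\phi}) = \exp\left( \boldsymbol{\theta} + \boldsymbol{\phi} + \tfrac{1}{2} [\boldsymbol{\theta}, \boldsymbol{\phi}] + \tfrac{1}{12} [[\boldsymbol{\theta}, \boldsymbol{\phi}], \boldsymbol{\phi}] - \tfrac{1}{12} [[\boldsymbol{\theta}, \boldsymbol{\phi}], \boldsymbol{\theta}] + \ldots \right) , \tag{13.59} $$

where $[\boldsymbol{\theta}, \boldsymbol{\phi}] \equiv \boldsymbol{\theta}\boldsymbol{\phi} - \boldsymbol{\phi}\boldsymbol{\theta}$ is the commutator of $\boldsymbol{\theta}$ and $\boldsymbol{\phi}$, also called their Lie bracket. Thus finite rotations 
are built from exponentials of linear combinations of generators and their commutators. A set of linearly 
independent generators that close under commutation provides a basis for the Lie algebra of a Lie group. 
The commutator of two bivectors is a bivector, so the Lie algebra of rotations is the set of bivectors. The 
rotor group is the Lie group generated by the bivectors. 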
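
The statement that the commutator of two bivectors is a bivector can be checked by brute force on the basis bivectors. The Python sketch below does so in N = 4 dimensions; the choice of N, the bitmask encoding of basis elements, and the function names are assumptions made for illustration only.

```python
from itertools import combinations

def reorder_sign(a: int, b: int) -> int:
    """Sign from sorting the basis vectors of blade a followed by those of blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(x: dict, y: dict) -> dict:
    """Geometric product in Euclidean space; multivectors are {bitmask: coefficient}."""
    out = {}
    for ea, ca in x.items():
        for eb, cb in y.items():
            e = ea ^ eb
            out[e] = out.get(e, 0) + reorder_sign(ea, eb) * ca * cb
    return {e: c for e, c in out.items() if c != 0}

def commutator(x: dict, y: dict) -> dict:
    """Lie bracket [x, y] = xy - yx, with vanishing components dropped."""
    out = gp(x, y)
    for e, c in gp(y, x).items():
        out[e] = out.get(e, 0) - c
    return {e: c for e, c in out.items() if c != 0}

N = 4
bivectors = [{(1 << i) | (1 << j): 1} for i, j in combinations(range(N), 2)]

# Grades (number of set bits) appearing in any commutator of two basis bivectors.
grades = {bin(e).count("1")
          for x in bivectors for y in bivectors
          for e in commutator(x, y)}
print(len(bivectors), grades)  # 6 {2}
```

The count 6 is the number N(N-1)/2 of generators for N = 4, and the only grade that appears in the non-zero commutators is 2, so the bivectors close under commutation.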


Concept question 13.3. What is the dimension of the rotor group in N dimensions? Answer. 
The dimension of the rotor group is the number of its generators, its bivectors, which is N(N — 1)/2. 


Concept question 13.4. Is the rotor group the same as the group of even, unimodular elements 
of the geometric algebra? All rotors are even, unimodular elements of the geometric algebra. The 
properties of being even and unimodular are preserved under composition, so the set of even, unimodular elements 
forms a group. Is the rotor group the same as the group of even, unimodular elements? Answer. In low 
dimensions $N \leq 5$ yes, but in general no. See part 4 of Exercise 13.6. 


Exercise 13.5. The even geometric algebra in N+1 dimensions is isomorphic to the full 
geometric algebra in N dimensions. Show that the even geometric algebra in N+1 dimensions is isomorphic 
to the full geometric algebra in N dimensions. Conclude that the dimension of the even geometric algebra 
in N+1 dimensions is $2^N$. 

Solution. Decompose a multivector $\mathbf{a}$ in N dimensions into its even and odd parts, $\mathbf{a} = \mathbf{a}_{\rm even} + \mathbf{a}_{\rm odd}$. The 
mapping 

$$ \mathbf{a}_{\rm even} + \mathbf{a}_{\rm odd} \leftrightarrow \mathbf{a}_{\rm even} + \mathbf{a}_{\rm odd} \, \boldsymbol{\gamma}_{N+1} \tag{13.60} $$

is an isomorphism between the N-dimensional geometric algebra and the (N+1)-dimensional even algebra 
($\mathbf{a}_{\rm even}$ and $\mathbf{a}_{\rm odd} \, \boldsymbol{\gamma}_{N+1}$ are both elements of the even algebra in N+1 dimensions). The mapping is an 
isomorphism because it respects addition and multiplication, and it respects rotations that leave $\boldsymbol{\gamma}_{N+1}$ invariant, 
that is, rotations in the N-dimensional geometric algebra. 


Exercise 13.6. Lie groups generated by multivectors. An element R of a Lie group generated by a set 
of multivectors $\boldsymbol{\gamma}_A$ takes the form $R = \exp(-\tfrac{1}{2} \sum_A \theta^A \boldsymbol{\gamma}_A)$. The element R acts on elements $\mathbf{a}$ of the geometric 
algebra by $R : \mathbf{a} \to R \mathbf{a} R^{-1}$, where the inverse of R is $R^{-1} = \exp(\tfrac{1}{2} \sum_A \theta^A \boldsymbol{\gamma}_A)$. A set of multivectors $\boldsymbol{\gamma}_A$ 
generates a Lie group provided that the set is closed under commutation, in accordance with the 
Baker-Campbell-Hausdorff formula (13.59). Show that the non-zero commutators of two orthonormal multivectors 
of grades respectively p and q in N dimensions have grades $p + q - 2n$ where 

$$ n \in [\max(0, p+q-N), \min(p, q)] \tag{13.61} $$

is an even integer if both p and q are odd, or an odd integer if either of p or q is even. In particular, show that 
the non-zero commutators of two orthonormal multivectors of the same grade p have grades $2 + 4j$ where 
$j \in [0, [(p-1)/2]]$ is an integer. Conclude that, if $\hat{p}$ denotes a multivector of grade p mod 4, then 

$$ [\hat{p}, \hat{p}] = \hat{2} , \quad [\hat{2}, \hat{p}] = \hat{p} , \quad [\hat{0}, \hat{1}] = \hat{3} , \quad [\hat{0}, \hat{3}] = \hat{1} , \quad [\hat{1}, \hat{3}] = \hat{0} . \tag{13.62} $$

Conclude that the following are Lie groups generated by multivectors in the geometric algebra. All groups 
preserve the scalar product of two multivectors. All groups have the rotor group as a subgroup. The notation 
$G^A(N)$ for the group generated by multivectors with grades modulo 4 in the set A follows Shirokov (2017). 
1. The rotor group, generated by bivectors. The rotor group acting on a multivector $\mathbf{a}$ preserves the grade 
of $\mathbf{a}$. The dimension (number of generators) of the group is $N(N-1)/2$. 
2. The group generated by vectors and bivectors (multivectors of grades 1 and 2). The dimension of the 
group is $N(N+1)/2$. 
3. Pseudo versions of the above groups, namely: 
a. The group generated by bivectors and pseudobivectors, dimension $N(N-1)$ for $N \geq 5$. 
b. The group generated by pseudovectors and bivectors, dimension $N(N+1)/2$ for $N \geq 4$. 
c. The group generated by vectors, pseudovectors, bivectors and pseudobivectors, dimension $N(N+1)$ 
for $N \geq 5$. 


4. The group $G^2(N)$ generated by multivectors of grade 2 mod 4 (thus grades 2, 6, 10, ...). The group may 
be called the even unimodular group since it is the largest group whose elements R are all even and 
unimodular, satisfying $R^{-1} = \overline{R}$. In dimensions $N \leq 5$, the even unimodular group coincides with the 
rotor group. The group preserves the grade p mod 4 of a multivector. The dimension of the group is 

$$ \dim G^2(N) = 2^{[(N-2)/2]} \left( 2^{[(N-1)/2]} + s \right) , \quad s = \begin{cases} -1 \\ 0 \\ 1 \end{cases} \text{ as } (N+2) \bmod 8 = \begin{cases} 1, 2, 3 , \\ 0, 4 , \\ 5, 6, 7 . \end{cases} \tag{13.63} $$

5. The group $G^{12}(N)$ generated by multivectors of grade (1 or 2) mod 4 (thus grades 1, 2, 5, 6, 9, 10, ...). 
Define the flip (grade involution) of R by $\mathbf{a} \to -\mathbf{a}$ for all odd multivectors $\mathbf{a}$. The 
group is the largest group whose elements R all have inverses $R^{-1}$ equal to their reverse flips (or flip reverses). 
The dimension of the group is 

$$ \dim G^{12}(N) = \dim G^2(N+1) . \tag{13.64} $$

6. The group $G^{23}(N)$ generated by multivectors of grade (2 or 3) mod 4 (thus grades 2, 3, 6, 7, 10, 11, ...). 
The group may be called the unimodular group, since it is the largest group whose elements R are all 
unimodular, satisfying $R^{-1} = \overline{R}$. The dimension of the group is 

$$ \dim G^{23}(N) = 2^{N-1} - \dim G^2(N+1) + 2 \dim G^2(N) = 2^{[(N-1)/2]} \left( 2^{[N/2]} + s \right) , \quad s = \begin{cases} -1 \\ 0 \\ 1 \end{cases} \text{ as } (N+1) \bmod 8 = \begin{cases} 1, 2, 3 , \\ 0, 4 , \\ 5, 6, 7 . \end{cases} \tag{13.65} $$

7. The even group $G^{02}(N)$ generated by multivectors of grade 0 mod 2 (thus grades 0, 2, 4, 6, ...). The even 
group preserves the grade p mod 2 of a multivector (that is, whether the multivector is even or odd). 
The dimension of the group is $2^{N-1}$. The special even group $SG^{02}(N)$ is generated by even multivectors 
excluding the unit element (thus grades 2, 4, 6, ...). The dimension of the special even group is $2^{N-1} - 1$. 

8. The full group $G^{0123}(N)$ generated by multivectors of all grades (thus grades 0, 1, 2, 3, ...). The dimension 
of the group is $2^N$. The special group $SG^{0123}(N)$ is generated by multivectors excluding the unit 
element (thus grades 1, 2, 3, ...). The dimension of the special group is $2^N - 1$. 


9. There are also complex Lie groups in which some generators are permitted to be imaginary or complex. 
The complex Lie groups are: 
a. The complex rotor group generated by complex bivectors. 
b. The group generated by imaginary vectors and real bivectors. 
c. The group generated by complex vectors and complex bivectors. 
d. Pseudo versions of the above. 
e. The remaining groups can be denoted $G^{A \, iB}$ following Shirokov (2017), with real generators of 
grades A mod 4 and imaginary generators of grades B mod 4: 

$$ G^{2 \, i2} , \quad G^{2 \, ip} , \quad G^{2p \, i2p} , \quad G^{2p \, i\overline{2p}} , \quad G^{0123 \, i0123} , \tag{13.66} $$

where p runs over 0, 1, 3, and $\overline{2p}$ denotes the opposite of $2p$ (for example $\overline{20} = 13$). 
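Solution. The dimension of each Lie group $G^A(N)$, the number of its generators, is established as follows. 
Let $m_k$ denote the number of multivectors of grade k mod 4, 

$$ m_k \equiv \sum_{p = k \bmod 4} \binom{N}{p} . \tag{13.67} $$

The binomial theorem implies (i is the imaginary) 

$$ (1 + i^j)^N = \sum_{k=0}^{3} i^{jk} m_k \quad \text{for } j = 0 \text{ to } 3 , \tag{13.68} $$

or explicitly 

$$ \begin{pmatrix} 2^N \\ (1+i)^N \\ 0 \\ (1-i)^N \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix} \begin{pmatrix} m_0 \\ m_1 \\ m_2 \\ m_3 \end{pmatrix} . \tag{13.69} $$

Equation (13.68) inverts to 

$$ m_k = \frac{1}{4} \sum_{j=0}^{3} i^{-jk} (1 + i^j)^N , \tag{13.70} $$

or explicitly 

$$ \begin{pmatrix} m_0 \\ m_1 \\ m_2 \\ m_3 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{pmatrix} \begin{pmatrix} 2^N \\ (1+i)^N \\ 0 \\ (1-i)^N \end{pmatrix} . \tag{13.71} $$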

The dimensions of the Lie groups are 


13.9 Active and passive rotations 


So far in this book, indices have indicated how an object transforms, so that the notation 

$$ a^m \boldsymbol{\gamma}_m \tag{13.73} $$

indicates a scalar, an object that is unchanged by a transformation, because the transformation of the 
contravariant vector $a^m$ cancels against the corresponding transformation of the covariant vector $\boldsymbol{\gamma}_m$. 

However, the transformation (13.56) of a multivector is an example of an active transformation that rotates 
the basis vectors $\boldsymbol{\gamma}_A$ while keeping the coefficients $a^A$ fixed, as opposed to a passive transformation that 
rotates the tetrad while keeping the thing itself unchanged. An active rotation bodily rotates a multivector 
$\mathbf{a}$, whereas a passive rotation rotates the frame without changing the multivector. Figure 13.4 illustrates the 
example of an active right-handed rotation by angle $\theta$ in the $\boldsymbol{\gamma}_1$-$\boldsymbol{\gamma}_2$ plane, equations (13.51). 

Under an active rotation, a multivector $\mathbf{a} = a^A \boldsymbol{\gamma}_A$ (implicit summation over distinct antisymmetrized 
subsets A of {1, ..., N}) is not a scalar under the transformation (13.56), but rather transforms to the 
multivector $\mathbf{a}' = a^A \boldsymbol{\gamma}'_A$ given by 

$$ R : \ a^A \boldsymbol{\gamma}_A \to a^A R \boldsymbol{\gamma}_A \overline{R} = a^A \boldsymbol{\gamma}'_A . \tag{13.74} $$


13.10 A rotor is a spin-$\frac{1}{2}$ object


A rotor is an even, unimodular element of the geometric algebra, §13.7. As a multivector, a rotor R would
transform under a rotation by the rotor S as $R \to S R \bar{S}$. As a rotor, however, the rotor R transforms under
a rotation by the rotor S as

$$ S : R \to S R , \tag{13.75} $$

according to the transformation law (13.55). That is, composition in the rotor group is defined by the
transformation (13.75): R rotated by S is SR.

The expression (13.49) for a simple rotation in the $\gamma_1$–$\gamma_2$ plane shows that the rotor corresponding to a
rotation by $2\pi$ is $-1$. Thus under a rotation (13.75) by $2\pi$, a rotor R changes sign:

$$ 2\pi : R \to -R . \tag{13.76} $$

A rotation by $4\pi$ is necessary to bring the rotor R back to its original value:

$$ 4\pi : R \to R . \tag{13.77} $$

Thus a rotor R behaves like a spin-$\frac{1}{2}$ object, requiring 2 full rotations to restore it to its original state.
The two different transformation laws for a rotor, as a multivector and as a rotor, describe two
different physical situations. The transformation of a rotor as a multivector answers the question, what is
the form of a rotor R rotated into another, primed, frame? In the unprimed frame, the rotor R transforms
a multivector $\boldsymbol{a}$ to $R \boldsymbol{a} \bar{R}$. In the primed frame rotated by rotor S from the unprimed frame, $\boldsymbol{a}' = S \boldsymbol{a} \bar{S}$, the
transformed rotor is $S R \bar{S}$, since

$$ \boldsymbol{a}' = S \boldsymbol{a} \bar{S} \to S R \boldsymbol{a} \bar{R} \bar{S} = S R \bar{S} \, \boldsymbol{a}' \, S \bar{R} \bar{S} = S R \bar{S} \, \boldsymbol{a}' \, \overline{S R \bar{S}} . \tag{13.78} $$

By contrast, the transformation (13.75) of a rotor as a rotor answers the question, what is the rotor
corresponding to a rotation R followed by a rotation S?


13.11 2D rotations and complex numbers 


In $N \le 5$ dimensions, the rotor group consists of even, unimodular multivectors of the geometric subalgebra,
part 4 of Exercise 13.6. In two dimensions, the even grade multivectors are linear combinations of the basis
set

$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{I_2}_{\text{1 bivector (pseudoscalar)}} \tag{13.79} $$

forming a linear space of dimension 2. The sole bivector is the pseudoscalar $I_2 \equiv \gamma_1 \wedge \gamma_2$, equation (13.19),
the highest grade element in 2 dimensions. The rotor R that produces a right-handed rotation by angle $\theta$ is,
according to equation (13.49),

$$ R = e^{-\boldsymbol{\theta}/2} = e^{-I_2 \theta/2} = \cos\frac{\theta}{2} - I_2 \sin\frac{\theta}{2} , \tag{13.80} $$

where $\boldsymbol{\theta} = I_2 \theta$ is the bivector whose magnitude is $(\boldsymbol{\theta}\bar{\boldsymbol{\theta}})^{1/2} = \theta$.
Since the square of the pseudoscalar $I_2$ is minus one, the pseudoscalar resembles the pure imaginary i, the
square root of −1. Sure enough, the mapping

$$ I_2 \leftrightarrow i \tag{13.81} $$

defines an isomorphism between the algebra of even grade multivectors in 2 dimensions and the field of
complex numbers

$$ a + I_2 b \leftrightarrow a + i b . \tag{13.82} $$

With the isomorphism (13.82), the rotor R that produces a right-handed rotation by angle $\theta$ is equivalent
to the complex number

$$ R = e^{-i\theta/2} . \tag{13.83} $$

The associated reverse rotor $\bar{R}$ is

$$ \bar{R} = e^{i\theta/2} , \tag{13.84} $$

the complex conjugate of R. The group of 2D rotors is isomorphic to the group of complex numbers of unit
magnitude, the unitary group U(1),

$$ \text{2D rotors} \cong U(1) . \tag{13.85} $$
Let z denote an even multivector, equivalent to some complex number by the isomorphism (13.82). According
to the transformation formula (13.56), under the rotation $R = e^{-i\theta/2}$, the even multivector, or complex
number, z transforms as

$$ R : z \to e^{-i\theta/2} z \, e^{i\theta/2} = e^{-i\theta/2} e^{i\theta/2} z = z , \tag{13.86} $$

which is true because even multivectors in 2 dimensions commute, as complex numbers should. Equation (13.86)
shows that the even multivector, or complex number, z is unchanged by a rotation. This
might seem strange: shouldn't the rotation rotate the complex number z by $\theta$ in the Argand plane? The
answer is that the rotation $R : \boldsymbol{a} \to R \boldsymbol{a} \bar{R}$ rotates vectors $\gamma_1$ and $\gamma_2$ (Exercise 13.7), as already seen in the
transformation (13.51). The same rotation leaves the scalar 1 and the bivector $I_2 \equiv \gamma_1 \wedge \gamma_2$ unchanged. If
temporarily you permit yourself to think in 3 dimensions, you see that the bivector $\gamma_1 \wedge \gamma_2$ is Hodge dual to
the pseudovector $\gamma_1 \times \gamma_2$, which is the axis of rotation and is itself unchanged by the rotation, even though
the individual vectors $\gamma_1$ and $\gamma_2$ are rotated.


Exercise 13.7. Rotation of a vector. Confirm that a right-handed rotation by angle $\theta$ rotates the axes
$\gamma_a$ by

$$ R : \gamma_1 \to e^{-i\theta/2} \gamma_1 e^{i\theta/2} = \gamma_1 \cos\theta + \gamma_2 \sin\theta , \tag{13.87a} $$
$$ R : \gamma_2 \to e^{-i\theta/2} \gamma_2 e^{i\theta/2} = \gamma_2 \cos\theta - \gamma_1 \sin\theta , \tag{13.87b} $$

in agreement with (13.51). The important thing to notice is that the pseudoscalar $I_2$, hence i, anticommutes
with the vectors $\gamma_a$.
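
The contrast between even multivectors (unchanged) and vectors (rotated) can be checked with a small numerical sketch. The representation below is an assumption made purely for illustration: the two basis vectors are modelled by real 2 × 2 matrices that square to one and anticommute, and all function names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def combine(c1, A, c2, B):
    """Linear combination c1*A + c2*B of two 2x2 matrices."""
    return [[c1 * A[i][j] + c2 * B[i][j] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry within tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

ONE = [[1.0, 0.0], [0.0, 1.0]]
G1 = [[1.0, 0.0], [0.0, -1.0]]   # assumed matrix standing in for gamma_1
G2 = [[0.0, 1.0], [1.0, 0.0]]    # assumed matrix standing in for gamma_2
I2 = matmul(G1, G2)              # pseudoscalar gamma_1 gamma_2, which squares to -1

def rotor(theta):
    """R = cos(theta/2) - I2 sin(theta/2), equation (13.80)."""
    return combine(math.cos(theta / 2), ONE, -math.sin(theta / 2), I2)

def rotate(a, theta):
    """Sandwich product R a Rbar, where the reverse rotor is rotor(-theta)."""
    return matmul(matmul(rotor(theta), a), rotor(-theta))

theta = 0.6
print(rotate(G1, theta))   # compare with gamma_1 cos(theta) + gamma_2 sin(theta)
print(rotate(I2, theta))   # the pseudoscalar itself is left unchanged
```

In this sketch the vectors rotate through the full angle $\theta$ while the rotor involves only $\theta/2$, which is why `rotor(2 * math.pi)` comes out as minus the unit matrix.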


13.12 Quaternions 
A quaternion can be regarded as a kind of souped-up complex number,

$$ q = a + \imath b_1 + \jmath b_2 + k b_3 , \tag{13.88} $$

where a and $b_a$ (a = 1, 2, 3) are real numbers, and the three imaginary numbers $\imath$, $\jmath$, $k$ are defined to satisfy¹

$$ \imath^2 = \jmath^2 = k^2 = -\imath\jmath k = -1 . \tag{13.89} $$

Remark the dotless $\imath$ (and $\jmath$), to distinguish these quaternionic imaginaries from other possible imaginaries.
A consequence of equations (13.89) is that each pair of imaginary numbers anticommutes:

$$ \imath\jmath = -\jmath\imath = -k , \quad \jmath k = -k\jmath = -\imath , \quad k\imath = -\imath k = -\jmath . \tag{13.90} $$

¹ The choice $\imath\jmath k = 1$ in the definition (13.89) is the opposite of the conventional definition ijk = −1 famously carved by
William Rowan Hamilton in the stone of Brougham Bridge while walking with his wife along the Royal Canal to Dublin on
16 October 1843 (O’Donnell, 1983). To map to Hamilton’s definition, you can take $\imath = -i$, $\jmath = -j$, $k = -k$, or alternatively
$\imath = j$, $\jmath = i$, $k = k$, or $\imath = k$, $\jmath = j$, $k = i$. The adopted choice $\imath\jmath k = 1$ has the merit that it avoids a treacherous minus sign
in the isomorphism (13.105) between 3-dimensional pseudovectors and quaternions. The present choice also conforms to the
convention used by OpenGL and other computer graphics programs.
It is convenient to abbreviate the three imaginaries by $\imath_a$ with a = 1, 2, 3,

$$ \{\imath, \jmath, k\} \equiv \{\imath_1, \imath_2, \imath_3\} . \tag{13.91} $$

The quaternion (13.88) can then be expressed compactly as a sum of its scalar a and vector (actually
pseudovector, as will become apparent below from the isomorphism (13.105)) $\boldsymbol{b} = \imath_a b_a$ parts

$$ q = a + \boldsymbol{b} = a + \imath_a b_a , \tag{13.92} $$

implicitly summed over a = 1, 2, 3. A fundamentally useful formula, which follows from the defining
equations (13.89), is

$$ \boldsymbol{a}\boldsymbol{b} = (\imath_a a_a)(\imath_b b_b) = -\boldsymbol{a}\cdot\boldsymbol{b} - \boldsymbol{a}\times\boldsymbol{b} = -a_a b_a - \imath_a \varepsilon_{abc} a_b b_c , \tag{13.93} $$

where $\boldsymbol{a}\cdot\boldsymbol{b}$ and $\boldsymbol{a}\times\boldsymbol{b}$ denote the usual 3D scalar and vector products, and $\varepsilon_{abc}$ is the usual totally antisymmetric
matrix, with $\varepsilon_{123} = 1$. The product of two quaternions $p \equiv a + \boldsymbol{b}$ and $q \equiv c + \boldsymbol{d}$ can thus be written

$$ \begin{aligned} pq &= (a + \boldsymbol{b})(c + \boldsymbol{d}) = (a + \imath_a b_a)(c + \imath_b d_b) \\ &= ac - \boldsymbol{b}\cdot\boldsymbol{d} + a\boldsymbol{d} + c\boldsymbol{b} - \boldsymbol{b}\times\boldsymbol{d} = ac - b_a d_a + \imath_a (a d_a + c b_a - \varepsilon_{abc} b_b d_c) . \end{aligned} \tag{13.94} $$

The quaternionic conjugate $\bar{q}$ of a quaternion $q \equiv a + \boldsymbol{b}$ is (the overbar symbol for quaternionic
conjugation distinguishes it from the asterisk symbol * for complex conjugation)

$$ \bar{q} = a - \boldsymbol{b} = a - \imath_a b_a . \tag{13.95} $$

The quaternionic conjugate of a product is the reversed product of quaternionic conjugates

$$ \overline{pq} = \bar{q}\,\bar{p} \tag{13.96} $$

just like reversion in the geometric algebra, equation (13.17). The choice of the same symbol, an overbar,
to represent both reversion and quaternionic conjugation is not coincidental. The magnitude |q| of the
quaternion $q \equiv a + \boldsymbol{b}$ is

$$ |q| = (\bar{q}q)^{1/2} = (q\bar{q})^{1/2} = (a^2 + \boldsymbol{b}\cdot\boldsymbol{b})^{1/2} = (a^2 + b_a b_a)^{1/2} . \tag{13.97} $$

The magnitude of a quaternion is also called its modulus. A quaternion that has unit modulus, $\bar{q}q = 1$, is
called unimodular. The inverse $q^{-1}$ of the quaternion, satisfying $q q^{-1} = q^{-1} q = 1$, is

$$ q^{-1} = \bar{q}/(\bar{q}q) = (a - \boldsymbol{b})/(a^2 + \boldsymbol{b}\cdot\boldsymbol{b}) = (a - \imath_a b_a)/(a^2 + b_b b_b) . \tag{13.98} $$
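
The quaternion rules above translate directly into a few lines of code. The sketch below is a minimal illustration, not a library: the tuple layout `(a, b1, b2, b3)` and the function names are assumptions, and the product implements equation (13.94) with the sign convention of (13.89), in which $\imath\jmath = -k$.

```python
def qmul(p, q):
    """Product of quaternions p = (a, b1, b2, b3) and q = (c, d1, d2, d3), equation (13.94)."""
    a, b = p[0], p[1:]
    c, d = q[0], q[1:]
    cross = (b[1] * d[2] - b[2] * d[1],
             b[2] * d[0] - b[0] * d[2],
             b[0] * d[1] - b[1] * d[0])
    scalar = a * c - sum(x * y for x, y in zip(b, d))
    # The cross product enters with a minus sign in this convention.
    vector = tuple(a * d[i] + c * b[i] - cross[i] for i in range(3))
    return (scalar,) + vector

def qconj(q):
    """Quaternionic conjugate, equation (13.95)."""
    return (q[0], -q[1], -q[2], -q[3])

def qinv(q):
    """Inverse conjugate(q) / (conjugate(q) q), equation (13.98)."""
    norm2 = sum(x * x for x in q)
    return tuple(x / norm2 for x in qconj(q))

I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

print(qmul(I, J))            # (0, 0, 0, -1), that is, -k
print(qmul(qmul(I, J), K))   # (1, 0, 0, 0)
```

Swapping the sign of the `cross` term would give Hamilton's original convention ijk = −1 mentioned in the footnote.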


13.13 3D rotations and quaternions 


As before, in $N \le 5$ dimensions, the rotor group consists of even, unimodular multivectors of the geometric
subalgebra. In three dimensions, the even grade multivectors are linear combinations of the basis set

$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{I_3\gamma_1 , \ I_3\gamma_2 , \ I_3\gamma_3}_{\text{3 bivectors (pseudovectors)}} \tag{13.99} $$

forming a linear space of dimension 4. The three bivectors are pseudovectors, equation (13.25). The squares
of the pseudovector basis elements are all minus one,

$$ (I_3\gamma_1)^2 = (I_3\gamma_2)^2 = (I_3\gamma_3)^2 = -1 , \tag{13.100} $$

and they anticommute with each other,

$$ \begin{aligned} (I_3\gamma_1)(I_3\gamma_2) &= -(I_3\gamma_2)(I_3\gamma_1) = -I_3\gamma_3 , \\ (I_3\gamma_2)(I_3\gamma_3) &= -(I_3\gamma_3)(I_3\gamma_2) = -I_3\gamma_1 , \\ (I_3\gamma_3)(I_3\gamma_1) &= -(I_3\gamma_1)(I_3\gamma_3) = -I_3\gamma_2 . \end{aligned} \tag{13.101} $$

The rotor R that produces a rotation by angle $\theta$ right-handedly about unit direction $n_a = \{n_1, n_2, n_3\}$,
satisfying $n_a n_a = 1$, is, according to equation (13.49),

$$ R = e^{-\boldsymbol{\theta}/2} = e^{-\boldsymbol{n}\theta/2} = \cos\frac{\theta}{2} - \boldsymbol{n} \sin\frac{\theta}{2} , \tag{13.102} $$

where $\boldsymbol{\theta}$ is the bivector

$$ \boldsymbol{\theta} = \boldsymbol{n}\theta = I_3 \gamma_a n_a \theta \tag{13.103} $$

of magnitude $(\boldsymbol{\theta}\bar{\boldsymbol{\theta}})^{1/2} = \theta$ and unit direction $\boldsymbol{n} \equiv I_3 \gamma_a n_a$ (satisfying $\boldsymbol{n}\bar{\boldsymbol{n}} = 1$). The pseudoscalar $I_3$ is a
commuting imaginary, commuting with all members of the 3D geometric algebra, both odd and even, and
satisfying

$$ I_3^2 = -1 . \tag{13.104} $$

Comparison of equations (13.100) and (13.101) to equations (13.89) and (13.90) shows that the mapping

$$ I_3 \gamma_a \leftrightarrow \imath_a \quad (a = 1, 2, 3) \tag{13.105} $$

defines an isomorphism between the space of even multivectors in 3 dimensions and the non-commutative
division algebra of quaternions

$$ a + I_3 \gamma_a b_a \leftrightarrow a + \imath_a b_a . \tag{13.106} $$

With the equivalence (13.106), the rotor R given by equation (13.102) can be interpreted as a quaternion,
with $\boldsymbol{\theta}$ the quaternion

$$ \boldsymbol{\theta} = \boldsymbol{n}\theta = \imath_a n_a \theta . \tag{13.107} $$

The associated reverse rotor $\bar{R}$ is

$$ \bar{R} = e^{\boldsymbol{\theta}/2} = e^{\boldsymbol{n}\theta/2} = \cos\frac{\theta}{2} + \boldsymbol{n} \sin\frac{\theta}{2} , \tag{13.108} $$

the quaternionic conjugate of R.

The group of rotors is isomorphic to the group of unimodular quaternions, quaternions $q = a + \imath_1 b_1 +
\imath_2 b_2 + \imath_3 b_3$ satisfying $\bar{q}q = a^2 + b_1^2 + b_2^2 + b_3^2 = 1$. Unimodular quaternions evidently define a unit 3-sphere
in the 4-dimensional space of coordinates $\{a, b_1, b_2, b_3\}$. From this it is apparent that the rotor group in 3
dimensions has the geometry of a 3-sphere $S^3$.
Exercise 13.8. 3D rotation matrices. This exercise is a precursor to Exercise 14.9. The principal message
of the exercise is that rotating using matrices is more complicated than rotating using quaternions. Let
$\boldsymbol{b} \equiv \gamma_a b_a$ be a 3D vector, a multivector of grade 1 in the 3D geometric algebra. Use the quaternionic
composition rule (13.93) to show that the vector $\boldsymbol{b}$ transforms under a right-handed rotation by angle $\theta$
about unit direction $\boldsymbol{n} = \gamma_a n_a$ as

$$ R : \boldsymbol{b} \to R \boldsymbol{b} \bar{R} = \boldsymbol{b} + 2 \sin\frac{\theta}{2} \, \boldsymbol{n} \times \left( \cos\frac{\theta}{2} \, \boldsymbol{b} + \sin\frac{\theta}{2} \, \boldsymbol{n} \times \boldsymbol{b} \right) . \tag{13.109} $$

Here the cross-product $\boldsymbol{n} \times \boldsymbol{b}$ denotes the usual vector product, which is dual to the bivector product
$\boldsymbol{n} \wedge \boldsymbol{b}$, equation (13.26). Suppose that the quaternionic components of the rotor R are {w, x, y, z}, that is,
$R = e^{-\imath_a n_a \theta/2} = w + \imath x + \jmath y + k z$. Show that the transformation (13.109) is (note that the 3 × 3 rotation
matrix is written to the left of the vector, in accordance with the physics convention that rotations accumulate
to the left):

$$ R : \begin{pmatrix} b_1\gamma_1 \\ b_2\gamma_2 \\ b_3\gamma_3 \end{pmatrix} \to \begin{pmatrix} w^2 + x^2 - y^2 - z^2 & 2(xy - wz) & 2(zx + wy) \\ 2(xy + wz) & w^2 - x^2 + y^2 - z^2 & 2(yz - wx) \\ 2(zx - wy) & 2(yz + wx) & w^2 - x^2 - y^2 + z^2 \end{pmatrix} \begin{pmatrix} b_1\gamma_1 \\ b_2\gamma_2 \\ b_3\gamma_3 \end{pmatrix} . \tag{13.110} $$

Confirm that the 3 × 3 rotation matrix on the right hand side of the transformation (13.110) is an
orthogonal matrix (its inverse is its transpose) provided that the rotor is unimodular, $R\bar{R} = 1$, so that
$w^2 + x^2 + y^2 + z^2 = 1$. As a simple example, show that the transformation (13.110) in the case of a
right-handed rotation by angle $\theta$ about the 3-axis (the 1–2 plane), where $w = \cos(\theta/2)$ and $z = -\sin(\theta/2)$,
is

$$ R : \begin{pmatrix} b_1\gamma_1 \\ b_2\gamma_2 \\ b_3\gamma_3 \end{pmatrix} \to \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} b_1\gamma_1 \\ b_2\gamma_2 \\ b_3\gamma_3 \end{pmatrix} . \tag{13.111} $$
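
The matrix in equation (13.110) can be evaluated numerically as a sanity check on the exercise. The sketch below is illustrative, with hypothetical function names; it takes the rotor components {w, x, y, z} from $R = \cos(\theta/2) - \imath_a n_a \sin(\theta/2)$ and builds the 3 × 3 matrix entry by entry.

```python
import math

def rotor_components(theta, n):
    """Components (w, x, y, z) of R = cos(theta/2) - i_a n_a sin(theta/2) for a unit axis n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (c, -n[0] * s, -n[1] * s, -n[2] * s)

def rotation_matrix(w, x, y, z):
    """The 3x3 matrix of equation (13.110), as a list of rows."""
    return [
        [w*w + x*x - y*y - z*z, 2 * (x*y - w*z), 2 * (z*x + w*y)],
        [2 * (x*y + w*z), w*w - x*x + y*y - z*z, 2 * (y*z - w*x)],
        [2 * (z*x - w*y), 2 * (y*z + w*x), w*w - x*x - y*y + z*z],
    ]

def times_transpose(M):
    """M times its transpose, which is the unit matrix when M is orthogonal."""
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)] for i in range(3)]

theta = 0.7
M = rotation_matrix(*rotor_components(theta, (0.0, 0.0, 1.0)))
for row in M:
    print(row)   # compare with the matrix of equation (13.111)
```

For a general unit axis, `times_transpose` applied to the result should return the unit matrix to rounding error, which is the orthogonality claim of the exercise.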


13.14 Pauli matrices 


The multiplication rules of the basis vectors $\gamma_a$ of the 3D geometric algebra are identical to those of the
Pauli matrices $\sigma_a$ used in the theory of non-relativistic spin-$\frac{1}{2}$ particles.

The Pauli matrices form a vector of 2 × 2 complex (with respect to a scalar quantum-mechanical
imaginary i) matrices whose three components are each traceless ($\mathrm{Tr}\, \sigma_a = 0$), Hermitian ($\sigma_a^\dagger = \sigma_a$), and
unitary ($\sigma_a^{-1} = \sigma_a^\dagger$):

$$ \sigma_1 \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \sigma_2 \equiv \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad \sigma_3 \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{13.112} $$

The Pauli matrices anticommute with each other

$$ \sigma_1\sigma_2 = -\sigma_2\sigma_1 = i\sigma_3 , \quad \sigma_2\sigma_3 = -\sigma_3\sigma_2 = i\sigma_1 , \quad \sigma_3\sigma_1 = -\sigma_1\sigma_3 = i\sigma_2 . \tag{13.113} $$

The particular choice (13.112) of Pauli matrices is conventional but not unique: any three traceless, Hermitian,
unitary, anticommuting 2 × 2 complex matrices will do. The product of the 3 Pauli matrices is i times the
unit matrix,

$$ \sigma_1\sigma_2\sigma_3 = i \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{13.114} $$

If the scalar 1 in the geometric algebra is identified with the unit 2 × 2 matrix, and the pseudoscalar $I_3$
is identified with the imaginary i times the unit matrix, then the 3D geometric algebra is isomorphic to the
algebra generated by the Pauli matrices, the Pauli algebra, through the mapping

$$ 1 \leftrightarrow \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \quad \gamma_a \leftrightarrow \sigma_a , \quad I_3 \leftrightarrow i \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{13.115} $$

The 3D pseudoscalar $I_3$ commutes with all elements of the 3D geometric algebra.


Concept question 13.9. Properties of Pauli matrices. The Pauli matrices are traceless, Hermitian, 
unitary, and anticommuting. What do these properties correspond to in the geometric algebra? Are all these 
properties necessary for the Pauli algebra to be isomorphic to the 3D geometric algebra? Are the properties 
sufficient? 


In 3 dimensions, the rotation group is the group of even, unimodular multivectors of the geometric algebra.
The isomorphism (13.115) establishes that the rotation group is isomorphic to the group of complex 2 × 2
matrices of the form

$$ a + i\sigma_a b_a , \tag{13.116} $$

with a, $b_a$ (a = 1, 2, 3) real, and with the unimodular condition requiring that $a^2 + b_a b_a = 1$. It is straightforward
to check (Exercise 13.10) that the group of such matrices constitutes the group of unitary complex 2 × 2
matrices of unit determinant, the special unitary group SU(2). The isomorphisms

$$ a + I_3\gamma_a b_a \leftrightarrow a + \imath_a b_a \leftrightarrow a + i\sigma_a b_a \tag{13.117} $$

have thus established isomorphisms between the group of 3D rotors, the group of unimodular quaternions,
and the special unitary group of complex 2 × 2 matrices

$$ \text{3D rotors} \cong \text{unimodular quaternions} \cong SU(2) . \tag{13.118} $$

An isomorphism that maps a group into a set of matrices, such that group multiplication corresponds to
ordinary matrix multiplication, is called a representation of the group. The representation of the rotation
group as 2 × 2 complex matrices may be termed the Pauli representation. The Pauli representation is the
lowest dimensional representation of the 3D rotation group. In the Pauli representation, the rotor (13.102)
corresponding to a right-handed rotation by angle $\theta$ about unit axis $n_a$ is the matrix

$$ R = \cos\frac{\theta}{2} - i n_a \sigma_a \sin\frac{\theta}{2} . \tag{13.119} $$


Exercise 13.10. Translate a rotor into an element of SU(2). Show that the rotor $R = e^{-\imath_a n_a \theta/2}$,
equation (13.102), corresponding to a right-handed rotation by angle $\theta$ about unit axis $n_a$ is equivalent to
the special unitary 2 × 2 matrix

$$ R \leftrightarrow \begin{pmatrix} \cos\frac{\theta}{2} - i n_3 \sin\frac{\theta}{2} & -(n_2 + i n_1) \sin\frac{\theta}{2} \\ (n_2 - i n_1) \sin\frac{\theta}{2} & \cos\frac{\theta}{2} + i n_3 \sin\frac{\theta}{2} \end{pmatrix} . \tag{13.120} $$

Show that the reverse rotor $\bar{R}$ is equivalent to the Hermitian conjugate $R^\dagger$ of the corresponding 2 × 2 matrix.
Show that the determinant of the matrix equals $R\bar{R}$, which is 1.
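
The claims of this exercise can be spot-checked numerically. The following sketch uses only nested lists and Python complex numbers; the helper names are hypothetical. It builds the rotor once from the Pauli matrices via equation (13.119) and once from the explicit entries of (13.120), so that the two can be compared.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    """Hermitian conjugate (conjugate transpose) of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry within tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

def pauli_rotor(theta, n):
    """R = cos(theta/2) - i n_a sigma_a sin(theta/2), equation (13.119)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[(c if i == j else 0) - 1j * s * sum(n[a] * SIGMA[a][i][j] for a in range(3))
             for j in range(2)] for i in range(2)]

def explicit_rotor(theta, n):
    """The same rotor written out entry by entry, equation (13.120)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    n1, n2, n3 = n
    return [[c - 1j * n3 * s, -(n2 + 1j * n1) * s],
            [(n2 - 1j * n1) * s, c + 1j * n3 * s]]

n = (1 / 3, 2 / 3, 2 / 3)                       # an arbitrary unit axis
R = pauli_rotor(1.1, n)
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
print(det)                                      # should be 1 up to rounding
print(close(matmul(R, dagger(R)), [[1, 0], [0, 1]]))
```

Evaluating `pauli_rotor(2 * math.pi, n)` gives minus the unit matrix, the matrix form of the sign change (13.76).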


13.15 Pauli spinors as quaternions, or scaled rotors 


Any Pauli spinor $\varphi$ can be expressed uniquely in the form of a 2 × 2 matrix $\mathbf{q}$, the Pauli representation of
a quaternion q, acting on the spin-up basis element $\epsilon_\uparrow$ (the precise translation between Pauli spinors and
quaternions is left as Exercises 13.11 and 13.12):

$$ \varphi = \mathbf{q}\, \epsilon_\uparrow . \tag{13.121} $$

In this section (and in the Exercises) the 2 × 2 matrix $\mathbf{q}$ is written in boldface to distinguish it from the
quaternion q that it represents, but the distinction is not fundamental. A quaternion can always be decomposed
into a product $q = \lambda R$ of a real scalar $\lambda$ and a rotor, or unimodular quaternion, R. The real scalar $\lambda$ can be
taken without loss of generality to be positive, since any minus sign can be absorbed into a rotation by $2\pi$
of the rotor R. Thus a Pauli spinor $\varphi$ can also be expressed as a scaled rotor $\lambda R$ acting on the spin-up basis
element $\epsilon_\uparrow$,

$$ \varphi = \lambda R\, \epsilon_\uparrow . \tag{13.122} $$

One is used to thinking of a Pauli spinor as an intrinsically quantum-mechanical object. The mapping
(13.121) or (13.122) between Pauli spinors and quaternions or scaled rotors shows that Pauli spinors also
have a classical interpretation: they encode a real amplitude $\lambda$, and a rotation R. This provides a mathematical
basis for the idea that, through their spin, fundamental particles “know” about the rotational structure
of space.

The isomorphism between the vector spaces of Pauli spinors and quaternions does not extend to multiplication;
that is, the product of two Pauli spinors $\varphi_1$ and $\varphi_2$ equivalent to the complex 2 × 2 matrices $\mathbf{q}_1$ and $\mathbf{q}_2$
does not equal the Pauli spinor equivalent to the product $\mathbf{q}_1 \mathbf{q}_2$. The problem is that the Pauli representation
of a Pauli spinor $\varphi$ is a column vector, and two column vectors cannot be multiplied. The question of how
to multiply spinors is deferred to Chapter 38 on the super geometric algebra.

Meanwhile, it is possible to multiply a row spinor and a column spinor. The spinor $\bar{\varphi}$ reverse to the
spinor (13.121) is defined to be the row spinor

$$ \bar{\varphi} \equiv \epsilon_\uparrow^\top \bar{\mathbf{q}} , \tag{13.123} $$

where $\bar{\mathbf{q}}$ is the matrix representation of the reverse $\bar{q}$ of the quaternion q, and $\epsilon_\uparrow^\top$ is the transpose of the
column spinor $\epsilon_\uparrow$, which is the row spinor

$$ \epsilon_\uparrow^\top = \begin{pmatrix} 1 & 0 \end{pmatrix} . \tag{13.124} $$

The scalar product $\bar{\varphi}\varphi$ is real and positive, equation (13.133). It is legitimate to multiply a row spinor $\bar{\varphi}$ by
a column spinor $\chi$, yielding a complex number. The product $\bar{\varphi}\chi$ is a scalar under spatial rotations,

$$ R : \bar{\varphi}\chi \to \bar{\varphi} \bar{R} R \chi = \bar{\varphi}\chi , \tag{13.125} $$

and therefore provides a viable definition of a scalar product of Pauli spinors. The problem of defining a
scalar product of Pauli spinors is resumed in §38.6.


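Exercise 13.11. Translate a Pauli spinor into a quaternion. Given any Pauli spinor

$$ \varphi = \varphi^\uparrow \epsilon_\uparrow + \varphi^\downarrow \epsilon_\downarrow = \begin{pmatrix} \varphi^\uparrow \\ \varphi^\downarrow \end{pmatrix} , \tag{13.126} $$

show that the corresponding real quaternion q, and the equivalent 2 × 2 complex matrix $\mathbf{q}$ in the Pauli
representation (13.112), such that $\varphi = \mathbf{q}\, \epsilon_\uparrow$, are

$$ q = \{ \mathrm{Re}\, \varphi^\uparrow , \ \mathrm{Im}\, \varphi^\downarrow , \ -\mathrm{Re}\, \varphi^\downarrow , \ \mathrm{Im}\, \varphi^\uparrow \} \leftrightarrow \mathbf{q} = \begin{pmatrix} \varphi^\uparrow & -\varphi^{\downarrow *} \\ \varphi^\downarrow & \varphi^{\uparrow *} \end{pmatrix} . \tag{13.127} $$

Show that the reverse quaternion $\bar{q}$ and the equivalent 2 × 2 matrix $\bar{\mathbf{q}}$ in the Pauli representation are

$$ \bar{q} = \{ \mathrm{Re}\, \varphi^\uparrow , \ -\mathrm{Im}\, \varphi^\downarrow , \ \mathrm{Re}\, \varphi^\downarrow , \ -\mathrm{Im}\, \varphi^\uparrow \} \leftrightarrow \bar{\mathbf{q}} = \begin{pmatrix} \varphi^{\uparrow *} & \varphi^{\downarrow *} \\ -\varphi^\downarrow & \varphi^\uparrow \end{pmatrix} . \tag{13.128} $$

Conclude that the reverse matrix $\bar{\mathbf{q}}$ equals its Hermitian conjugate, $\bar{\mathbf{q}} = \mathbf{q}^\dagger$, and that the reverse Pauli spinor
$\bar{\varphi}$ defined by equation (13.123) is

$$ \bar{\varphi} \equiv \epsilon_\uparrow^\top \bar{\mathbf{q}} = \epsilon_\uparrow^\top \mathbf{q}^\dagger = \begin{pmatrix} \varphi^{\uparrow *} & \varphi^{\downarrow *} \end{pmatrix} = \varphi^\dagger . \tag{13.129} $$


Exercise 13.12. Translate a quaternion into a Pauli spinor. Show that the quaternion $q \equiv w + \imath x +
\jmath y + k z$ is equivalent in the Pauli representation (13.112) to the 2 × 2 matrix $\mathbf{q}$

$$ q = \{w, x, y, z\} \leftrightarrow \mathbf{q} = \begin{pmatrix} w + iz & ix + y \\ ix - y & w - iz \end{pmatrix} . \tag{13.130} $$

Conclude that the Pauli spinor $\varphi \equiv \mathbf{q}\, \epsilon_\uparrow$ corresponding to the quaternion q is

$$ \varphi \equiv \mathbf{q}\, \epsilon_\uparrow = \begin{pmatrix} w + iz \\ ix - y \end{pmatrix} , \tag{13.131} $$

and that the reverse spinor $\bar{\varphi}$ defined by equation (13.123) is

$$ \bar{\varphi} \equiv \epsilon_\uparrow^\top \bar{\mathbf{q}} = \begin{pmatrix} w - iz & -ix - y \end{pmatrix} . \tag{13.132} $$

Hence conclude that $\bar{\varphi}\varphi$ is the real positive scalar magnitude squared $\lambda^2 = \bar{q}q$ of the quaternion q,

$$ \bar{\varphi}\varphi = \varphi^\dagger\varphi = \bar{q}q = \lambda^2 , \tag{13.133} $$

with

$$ \lambda^2 = w^2 + x^2 + y^2 + z^2 . \tag{13.134} $$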
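
The two translations in Exercises 13.11 and 13.12 are inverse to each other, which a short sketch makes concrete. The function names below are hypothetical, and Python complex numbers play the role of the quantum-mechanical imaginary i; the component ordering follows equations (13.127), (13.130), and (13.131).

```python
def spinor_to_quaternion(up, down):
    """Quaternion {w, x, y, z} of a Pauli spinor with components (up, down), equation (13.127)."""
    return (up.real, down.imag, -down.real, up.imag)

def quaternion_to_matrix(w, x, y, z):
    """Pauli representation of the quaternion {w, x, y, z}, equation (13.130)."""
    return [[complex(w, z), complex(y, x)],
            [complex(-y, x), complex(w, -z)]]

def quaternion_to_spinor(w, x, y, z):
    """Spinor obtained by acting on the spin-up basis element: the first column, equation (13.131)."""
    m = quaternion_to_matrix(w, x, y, z)
    return (m[0][0], m[1][0])

up, down = 0.3 + 0.4j, -1.2 + 0.5j
q = spinor_to_quaternion(up, down)
print(q)
print(quaternion_to_spinor(*q))                                  # recovers (up, down)
print(sum(c * c for c in q), abs(up) ** 2 + abs(down) ** 2)      # both equal lambda^2
```

The last line illustrates equation (13.133): the squared modulus of the quaternion coincides with the usual norm of the spinor.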


Exercise 13.13. Can a Pauli spinor be rotated into its complex conjugate? Can a Pauli spinor $\varphi$
be rotated into its complex conjugate $\varphi^*$?

Solution. Yes. The question is, does there exist a rotor R such that $R\varphi = \varphi^*$? If q and $q^*$ are the quaternions
equivalent to $\varphi$ and $\varphi^*$, then

$$ R = q^* q^{-1} . \tag{13.135} $$

More generally, a Pauli spinor may be rotated into any other Pauli spinor of the same modulus.


13.16 Spin axis 


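In the Pauli representation, the spinor basis elements $\epsilon_\uparrow$ and $\epsilon_\downarrow$ are eigenvectors of the Pauli operator $\sigma_3$ with
eigenvalues ±1,

$$ \sigma_3 \epsilon_\uparrow = +\epsilon_\uparrow , \quad \sigma_3 \epsilon_\downarrow = -\epsilon_\downarrow . \tag{13.136} $$

The spin axis of a Pauli spinor $\varphi$ is defined to be the direction along which the Pauli spinor is pure up.
In the Pauli representation, the spin axis of the spin-up basis spinor $\epsilon_\uparrow$ is the positive 3-axis, while the spin
axis of the spin-down basis spinor $\epsilon_\downarrow$ is the negative 3-axis. The spin axis of a Pauli spinor $\varphi = \lambda R\, \epsilon_\uparrow$ is the
unit direction $\{n_1, n_2, n_3\}$ of the rotated 3-axis, given by

$$ \sigma_a n_a = R \sigma_3 \bar{R} . \tag{13.137} $$

Equation (13.137) is confirmed by the fact that $\sigma_a n_a$ has eigenvalue +1 acting on $\varphi$:

$$ \sigma_a n_a \varphi = (R \sigma_3 \bar{R})(\lambda R\, \epsilon_\uparrow) = \lambda R \sigma_3 \epsilon_\uparrow = \lambda R\, \epsilon_\uparrow = \varphi . \tag{13.138} $$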
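
Equation (13.137) gives a practical recipe for reading off the spin axis from a rotor. The sketch below is illustrative and the helper names are hypothetical: it forms the matrix product of R, $\sigma_3$, and the Hermitian conjugate of R (which represents the reverse rotor, Exercise 13.10), then extracts $\{n_1, n_2, n_3\}$ from the entries of the resulting matrix $\sigma_a n_a$.

```python
import math

SIGMA_3 = [[1, 0], [0, -1]]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    """Hermitian conjugate of a 2x2 matrix; represents the reverse of a rotor."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def pauli_rotor(theta, n):
    """Matrix of equation (13.120) for a right-handed rotation by theta about unit axis n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    n1, n2, n3 = n
    return [[complex(c, -n3 * s), complex(-n2 * s, -n1 * s)],
            [complex(n2 * s, -n1 * s), complex(c, n3 * s)]]

def spin_axis(R):
    """Unit direction {n1, n2, n3} with sigma_a n_a = R sigma_3 Rbar, equation (13.137)."""
    N = matmul(matmul(R, SIGMA_3), dagger(R))
    # sigma_a n_a has entries [[n3, n1 - i n2], [n1 + i n2, -n3]].
    return (N[1][0].real, N[1][0].imag, N[0][0].real)

theta = 0.9
R = pauli_rotor(theta, (1.0, 0.0, 0.0))     # rotate about the 1-axis
print(spin_axis(R))                         # the 3-axis tipped to (0, -sin(theta), cos(theta))
```

The spinor itself is the first column of R (taking $\lambda = 1$), and applying the matrix $\sigma_a n_a$ to it returns the same column, as in equation (13.138).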
Exercise 13.14. Orthonormal eigenvectors of the spin operator. Show that, in the Pauli representation,
the orthonormal eigenvectors $\epsilon_{\uparrow n}$ and $\epsilon_{\downarrow n}$ of the spin operator $\sigma_a n_a$ projected along the unit direction
$\{n_1, n_2, n_3\}$ are

$$ \epsilon_{\uparrow n} = \frac{1}{\sqrt{2(1 + n_3)}} \begin{pmatrix} 1 + n_3 \\ n_1 + i n_2 \end{pmatrix} , \quad \epsilon_{\downarrow n} = \frac{1}{\sqrt{2(1 - n_3)}} \begin{pmatrix} -1 + n_3 \\ n_1 + i n_2 \end{pmatrix} . \tag{13.139} $$
14 


The spacetime algebra 


The spacetime algebra is the geometric algebra in Minkowski space. This Chapter is restricted to the case 
of 4-dimensional Minkowski space, but the formalism generalizes to any number of dimensions where some 
of the dimensions are timelike and the others are spacelike. Happily, the elegant formalism of the geometric 
algebra carries through to the spacetime algebra. See Exercise 39.5 for the general case of K space dimensions 
and M time dimensions. 


14.1 Spacetime algebra 


Let $\gamma_m$ (m = 0, 1, 2, 3) denote an orthonormal basis of spacetime, with $\gamma_0$ representing the time axis, and
$\gamma_a$ (a = 1, 2, 3) the spatial axes. Geometric multiplication in the spacetime algebra is defined by

$$ \gamma_m \gamma_n = \gamma_m \cdot \gamma_n + \gamma_m \wedge \gamma_n , \tag{14.1} $$

just as in the geometric algebra, equation (13.5). The key difference between the spacetime basis $\gamma_m$ and
Euclidean bases is that scalar products of the basis vectors $\gamma_m$ form the Minkowski metric $\eta_{mn}$,

$$ \gamma_m \cdot \gamma_n = \eta_{mn} , \tag{14.2} $$

whereas scalar products of Euclidean basis elements $\gamma_a$ formed the unit matrix, $\gamma_a \cdot \gamma_b = \delta_{ab}$, equation (13.6).
In less abbreviated form, equations (14.1) state that the geometric product of each basis element with itself
is

$$ -\gamma_0^2 = \gamma_1^2 = \gamma_2^2 = \gamma_3^2 = 1 , \tag{14.3} $$

while geometric products of different basis elements $\gamma_m$ anticommute

$$ \gamma_m \gamma_n = -\gamma_n \gamma_m = \gamma_m \wedge \gamma_n \quad (m \ne n) . \tag{14.4} $$

In the Dirac theory of relativistic spin-$\frac{1}{2}$ particles, §14.7, the Dirac γ-matrices are required to satisfy

$$ \{\gamma_m , \gamma_n\} = 2 \eta_{mn} \tag{14.5} $$

where {} denotes the anticommutator, $\{\gamma_m, \gamma_n\} \equiv \gamma_m\gamma_n + \gamma_n\gamma_m$. The multiplication rules (14.5) for the
Dirac γ-matrices are the same as those for geometric multiplication in the spacetime algebra, equations (14.3)
and (14.4).
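A 4-vector $\boldsymbol{a}$, a multivector of grade 1 in the geometric algebra of spacetime, is

$$ \boldsymbol{a} = a^m \gamma_m = a^0\gamma_0 + a^1\gamma_1 + a^2\gamma_2 + a^3\gamma_3 . \tag{14.6} $$

Such a 4-vector $\boldsymbol{a}$ would be denoted $\not{a}$ in the Feynman slash notation. The product of two 4-vectors $\boldsymbol{a}$ and
$\boldsymbol{b}$ is

$$ \boldsymbol{a}\boldsymbol{b} = \boldsymbol{a}\cdot\boldsymbol{b} + \boldsymbol{a}\wedge\boldsymbol{b} = a^m b^n \gamma_m \cdot \gamma_n + a^m b^n \gamma_m \wedge \gamma_n = a^m b^n \eta_{mn} + \tfrac{1}{2} a^m b^n [\gamma_m, \gamma_n] . \tag{14.7} $$

It is convenient to denote three of the six bivectors of the spacetime algebra by $\sigma_a$,

$$ \sigma_a \equiv \gamma_0 \gamma_a \quad (a = 1, 2, 3) . \tag{14.8} $$

The symbol $\sigma_a$ is used because the algebra of bivectors $\sigma_a$ is isomorphic to the algebra of Pauli matrices $\sigma_a$.
The pseudoscalar, the highest grade basis element of the spacetime algebra, is denoted I

$$ \gamma_0\gamma_1\gamma_2\gamma_3 = \sigma_1\sigma_2\sigma_3 \equiv I . \tag{14.9} $$

The pseudoscalar I satisfies

$$ I^2 = -1 , \quad I\gamma_m = -\gamma_m I , \quad I\sigma_a = \sigma_a I . \tag{14.10} $$

The basis elements of the 4-dimensional spacetime algebra are then

$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{\gamma_m}_{\text{4 vectors}} , \quad \underbrace{\sigma_a , \ I\sigma_a}_{\text{6 bivectors}} , \quad \underbrace{I\gamma_m}_{\text{4 pseudovectors}} , \quad \underbrace{I}_{\text{1 pseudoscalar}} \tag{14.11} $$

forming a linear space of dimension $1 + 4 + 6 + 4 + 1 = 16 = 2^4$. The reverse is defined in the usual
way, equation (13.13), leaving unchanged multivectors of grade 0 or 1, modulo 4, and changing the sign of
multivectors of grade 2 or 3, modulo 4:

$$ \bar{1} = 1 , \quad \bar{\gamma}_m = \gamma_m , \quad \bar{\sigma}_a = -\sigma_a , \quad \overline{I\sigma_a} = -I\sigma_a , \quad \overline{I\gamma_m} = -I\gamma_m , \quad \bar{I} = I . \tag{14.12} $$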


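In the 3D geometric algebra a bivector was also a rotor, satisfying $R\bar{R} = 1$, but in the 4D spacetime algebra
only the spatial bivectors $I\sigma_a$ are rotors, satisfying $I\sigma_a \overline{I\sigma_a} = 1$. The boost bivectors satisfy $\sigma_a \bar{\sigma}_a = -1$ not
1, so are not rotors. Nevertheless, if $\theta\sigma_a$ is a boost bivector, then its exponential $R = e^{-\theta\sigma_a/2}$ is a rotor,

$$ R = e^{-\theta\sigma_a/2} = 1 - (\theta/2)\sigma_a + \frac{(\theta/2)^2}{2!} - \frac{(\theta/2)^3}{3!}\sigma_a + \ldots = \cosh(\theta/2) - \sigma_a \sinh(\theta/2) , \tag{14.13} $$

since its inverse is indeed its reverse $\bar{R} = \overline{e^{-\theta\sigma_a/2}} = e^{\theta\sigma_a/2} = \cosh(\theta/2) + \sigma_a \sinh(\theta/2)$.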
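
The series in equation (14.13) can be summed numerically to confirm the closed form. The sketch below rests on an assumption made for illustration only: the boost bivector $\sigma_1$ is modelled by the corresponding real Pauli matrix, which shares its defining property of squaring to one. The function names are hypothetical.

```python
import math

ONE = [[1.0, 0.0], [0.0, 1.0]]
SIGMA_1 = [[0.0, 1.0], [1.0, 0.0]]   # assumed matrix standing in for the boost bivector sigma_1

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def combine(c1, A, c2, B):
    """Linear combination c1*A + c2*B of two 2x2 matrices."""
    return [[c1 * A[i][j] + c2 * B[i][j] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry within tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

def boost_rotor(theta, sigma=SIGMA_1):
    """Closed form cosh(theta/2) - sigma sinh(theta/2) of equation (14.13)."""
    return combine(math.cosh(theta / 2), ONE, -math.sinh(theta / 2), sigma)

def boost_rotor_series(theta, sigma=SIGMA_1, terms=30):
    """Partial sum of the Taylor series of exp(-theta sigma / 2)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    term = ONE
    for n in range(terms):
        total = combine(1.0, total, 1.0, term)
        # Next term: multiply by (-theta/2) sigma and divide by n + 1.
        term = combine(-theta / (2 * (n + 1)), matmul(term, sigma), 0.0, ONE)
    return total

theta = 1.3
print(boost_rotor(theta))
print(boost_rotor_series(theta))   # agrees with the closed form
```

Because the reverse flips the sign of $\sigma_a$, the reverse rotor in this sketch is simply `boost_rotor(-theta)`, and its product with `boost_rotor(theta)` is the unit matrix.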
The mapping

$$ \gamma_a^{(3)} \leftrightarrow \sigma_a \quad (a = 1, 2, 3) \tag{14.14} $$

(the superscript (3) distinguishes the 3D basis vectors from the 4D spacetime basis vectors) defines an
isomorphism between the 8-dimensional geometric algebra (13.3) of 3 spatial dimensions and the 8-dimensional
even spacetime subalgebra. Among other things, the isomorphism (14.14) implies the equivalence of the 3D
spatial pseudoscalar $I_3$ and the 4D spacetime pseudoscalar I,

$$ I_3 \leftrightarrow I , \tag{14.15} $$

since $I_3 = \gamma_1^{(3)} \gamma_2^{(3)} \gamma_3^{(3)}$ and $I = \sigma_1\sigma_2\sigma_3$.


14.2 Complex quaternions 
A complex quaternion (also called a biquaternion by W. R. Hamilton) is a quaternion

$$ q = a + \boldsymbol{b} = a + \imath_a b_a , \tag{14.16} $$

in which the four coefficients a, $b_a$ (a = 1, 2, 3) are each complex numbers

$$ a = a_R + I a_I , \quad b_a = b_{a,R} + I b_{a,I} . \tag{14.17} $$

The imaginary I is taken to commute with each of the quaternionic imaginaries $\imath_a$. The choice of symbol I
is deliberate: in the isomorphism (14.33) between the even spacetime algebra and complex quaternions, the
commuting imaginary I is isomorphic to the spacetime pseudoscalar I.

All of the equations in §13.12 on real quaternions remain valid without change, including the multiplication,
conjugation, and inversion formulae (13.93)–(13.98). The quaternionic conjugate $\bar{q}$ of a complex quaternion
$q \equiv a + \boldsymbol{b}$ is conjugated with respect to the quaternionic imaginaries $\imath_a$, but the complex coefficients a and
$b_a$ are not conjugated with respect to the complex imaginary I,

$$ \bar{q} \equiv \overline{a + \boldsymbol{b}} = a - \boldsymbol{b} = a - \imath_a b_a . \tag{14.18} $$

The modulus |q| of a complex quaternion $q \equiv a + \boldsymbol{b}$,

$$ |q| \equiv (\bar{q}q)^{1/2} = (q\bar{q})^{1/2} = (a^2 + \boldsymbol{b}\cdot\boldsymbol{b})^{1/2} = (a^2 + b_a b_a)^{1/2} , \tag{14.19} $$

is a complex number, not a real number. The name modulus to denote |q| is preferred over magnitude, to
avoid confusion with the magnitude of a complex number. A quaternion is said to be unimodular if its
modulus is 1,

$$ \bar{q}q = 1 . \tag{14.20} $$

The unimodular condition (14.20) is a complex condition, stating that the real and imaginary (with respect
to I) parts of $\bar{q}q$ are respectively 1 and 0.

The complex conjugate $q^\star$ of the complex quaternion is (the star symbol $\star$ is used for complex conjugation
with respect to the pseudoscalar I, to distinguish it from the asterisk symbol * for complex conjugation with
respect to the scalar quantum-mechanical imaginary i)

$$ q^\star = a^\star + \boldsymbol{b}^\star = a^\star + \imath_a b_a^\star , \tag{14.21} $$

in which the complex coefficients a and $b_a$ are conjugated with respect to the imaginary I, but the quaternionic
imaginaries $\imath_a$ are not conjugated.

A non-zero complex quaternion can have zero modulus (unlike a real quaternion), in which case it is null.
The null condition

$$ \bar{q}q = a^2 + b_a b_a = 0 \tag{14.22} $$

is a complex condition. The product of two null complex quaternions is a null quaternion. Under multiplication,
null quaternions form a 6-dimensional subsemigroup (not a subgroup, because null quaternions do not
have inverses) of the 8-dimensional semigroup of complex quaternions.


Exercise 14.1. Null complex quaternions. Show that any non-trivial null complex quaternion q can be
written uniquely in the form

$$ q = p(1 + I\boldsymbol{n}) = p(1 + I \imath_a n_a) , \tag{14.23} $$

where p is a real quaternion, and $\boldsymbol{n} = \imath_a n_a$ is a real unimodular vector quaternion, with real components
$\{n_1, n_2, n_3\}$ satisfying $n_a n_a = 1$. Equivalently,

$$ q = (1 + I\boldsymbol{n}')p = (1 + I \imath_a n'_a)p , \tag{14.24} $$

where $\boldsymbol{n}'$ is the real unimodular vector quaternion

$$ \boldsymbol{n}' = \frac{p \boldsymbol{n} \bar{p}}{|p|^2} \tag{14.25} $$

with real components $\{n'_1, n'_2, n'_3\}$ satisfying $n'_a n'_a = 1$.
Solution. Write the null quaternion q as

$$ q = p + I r \tag{14.26} $$

where p and r are real quaternions, both of which must be non-zero if q is non-trivial. Then equation (14.23)
is true with

$$ \boldsymbol{n} = \imath_a n_a = \frac{\bar{p} r}{|p|^2} . \tag{14.27} $$

The null condition is $\bar{q}q = 0$. The vanishing of the real part, $\mathrm{Re}(\bar{q}q) = \bar{p}p - \bar{r}r = 0$, shows that $|p|^2 = |r|^2$.
The vanishing of the imaginary (I) part, $\mathrm{Im}(\bar{q}q) = \bar{p}r + \bar{r}p = \bar{p}r + \overline{\bar{p}r} = 0$, shows that $\bar{p}r$ must be a
pure quaternionic imaginary, since the quaternionic conjugate of $\bar{p}r$ is minus itself, so $\bar{p}r/|p|^2$ must be of the
form $\boldsymbol{n} = \imath_a n_a$. Its squared modulus $\bar{\boldsymbol{n}}\boldsymbol{n} = n_a n_a = \bar{r}p\bar{p}r/|p|^4 = 1$ is unity, so $\boldsymbol{n}$ is a unimodular 3-vector
quaternion. It follows immediately from the manner of construction that the expression (14.23) is unique, as
long as q is non-trivial.
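
The construction in this solution can be tried out numerically. In the sketch below, which is illustrative only, a complex quaternion is a tuple of four Python complex numbers, with Python's `1j` standing in for the commuting imaginary I; the function names are hypothetical, and the product uses the sign convention of equation (13.94).

```python
def qmul(p, q):
    """Quaternion product of equation (13.94); components may be real or complex."""
    a, b = p[0], p[1:]
    c, d = q[0], q[1:]
    cross = (b[1] * d[2] - b[2] * d[1],
             b[2] * d[0] - b[0] * d[2],
             b[0] * d[1] - b[1] * d[0])
    scalar = a * c - sum(x * y for x, y in zip(b, d))
    return (scalar,) + tuple(a * d[i] + c * b[i] - cross[i] for i in range(3))

def qconj(q):
    """Quaternionic conjugate: flips the vector part, leaves the coefficients unconjugated."""
    return (q[0], -q[1], -q[2], -q[3])

def modulus_squared(q):
    """conjugate(q) q = a^2 + b_a b_a, a complex number in general, equation (14.19)."""
    return sum(x * x for x in q)

def null_quaternion(p, n):
    """q = p (1 + I n) for a real quaternion p and a real unit vector n, equation (14.23)."""
    return qmul(p, (1, 1j * n[0], 1j * n[1], 1j * n[2]))

def recover_direction(q):
    """n = conjugate(p) r / |p|^2 with q = p + I r, equation (14.27)."""
    p = tuple(x.real for x in q)
    r = tuple(x.imag for x in q)
    norm2 = modulus_squared(p)
    return tuple(x / norm2 for x in qmul(qconj(p), r))

p = (1.0, 2.0, -1.0, 0.5)
n = (2 / 3, 1 / 3, 2 / 3)
q = null_quaternion(p, n)
print(modulus_squared(q))      # zero up to rounding: q is null
print(recover_direction(q))    # scalar part 0, vector part equal to n
```

The recovered quaternion has vanishing scalar part, in line with the argument that the product of the conjugate of p with r is a pure quaternionic imaginary.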


Exercise 14.2. Nilpotent complex quaternions. An object whose square is zero is said to be nilpotent.
Show that a complex quaternion of the form

$$ \boldsymbol{q} = \imath_a q_a \quad \text{with} \quad \boldsymbol{q}\cdot\boldsymbol{q} = q_a q_a = 0 \tag{14.28} $$

is nilpotent,

$$ \boldsymbol{q}^2 = 0 . \tag{14.29} $$

Prove that a nilpotent complex quaternion must take the form (14.28). The set of nilpotent complex
quaternions forms a 4-dimensional subspace of complex quaternions, since the complex condition $q_a q_a = 0$
eliminates 2 of the 6 degrees of freedom in the quaternionic components $q_a$. The product of two nilpotent complex
quaternions is not necessarily nilpotent, so the nilpotent set does not form a semigroup. The set of nilpotent
complex quaternions consists of the subset of null complex quaternions that are purely quaternionic.


14.3 Lorentz transformations and complex quaternions 


Lorentz transformations are rotations of spacetime. The rotor group of spacetime rotations in 3+1 dimensions
is, as usual, the Lie group generated by the Lie algebra of bivectors. The rotor group in 3+1 dimensions is
called Spin(3, 1).

The basis elements of the even spacetime algebra are

$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{\sigma_a , \ I\sigma_a}_{\text{6 bivectors}} , \quad \underbrace{I}_{\text{1 pseudoscalar}} \tag{14.30} $$

forming a linear space of dimension 1 + 6 + 1 = 8 over the real numbers. However, it is more elegant to
treat the even spacetime algebra as a linear space of dimension 8/2 = 4 over complex numbers of the form
$\lambda = \lambda_R + I\lambda_I$. The pseudoscalar I qualifies as an imaginary because $I^2 = -1$, and because it commutes with
all elements of the even spacetime algebra. It is convenient to take the basis elements of the even spacetime
algebra over the complex numbers to be

$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{I\sigma_a}_{\text{3 bivectors}} \tag{14.31} $$

forming a linear space of dimension 1 + 3 = 4. The reason for choosing $I\sigma_a$ rather than $\sigma_a$ as the elements
of the basis (14.31) is that the basis $\{1, I\sigma_a\}$ is equivalent to the basis (13.99) of the even algebra of
3-dimensional Euclidean space through the isomorphism (14.14) and (14.15). This basis in turn is equivalent to
the quaternionic basis $\{1, \imath_a\}$ through the isomorphism (13.105):

$$ I\sigma_a \leftrightarrow I_3 \gamma_a^{(3)} \leftrightarrow \imath_a \quad (a = 1, 2, 3) . \tag{14.32} $$

In other words, the even spacetime algebra is isomorphic to the algebra of quaternions with complex
coefficients:

$$ a + I\sigma_a b_a \leftrightarrow a + \imath_a b_a \tag{14.33} $$

where $a = a_R + I a_I$ is a complex number, and $b_a = b_{a,R} + I b_{a,I}$ is a triple of complex numbers.
The isomorphism (14.33) between even elements of the spacetime algebra and complex quaternions implies 
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that the group Spin(3, 1) of Lorentz rotors, which are unimodular elements of the even spacetime algebra, is 
isomorphic to the group of unimodular complex quaternions 


spacetime rotors ¥ unimodular complex quaternions . (14.34) 


In §13.13 it was found that the group of 3D spatial rotors is isomorphic to the group of unimodular real 
quaternions. Thus Lorentz transformations are mathematically equivalent to complexified spatial rotations. 

The Lorentz rotor that produces a rotation by complex angle $\theta$ about the unimodular complex direction
$n_a$ is, according to equation (13.49),

$$ R = e^{-\boldsymbol{\theta}/2} = e^{-n\theta/2} = \cos\frac{\theta}{2} - n \sin\frac{\theta}{2} , \tag{14.35} $$

generalizing the 3D rotor (13.102). Here $\boldsymbol{\theta}$ is a bivector

$$ \boldsymbol{\theta} = n\theta = I\sigma_a n_a \theta , \tag{14.36} $$

whose modulus is the complex angle $(\boldsymbol{\theta}\bar{\boldsymbol{\theta}})^{1/2} = \theta = \theta_R + I\theta_I$, and whose direction is the unimodular complex
bivector $n = n_R + I n_I$. The unimodular condition $n\bar{n} = 1$ on $n$ is equivalent to the complex condition
$n_a n_a = 1$ on the complex components $n_a = \{n_1, n_2, n_3\}$. The real and imaginary parts of the unimodular
condition imply the two conditions

$$ n_{a,R}\, n_{a,R} - n_{a,I}\, n_{a,I} = 1 , \quad 2\, n_{a,R}\, n_{a,I} = 0 . \tag{14.37} $$

The complex angle $\theta$ has 2 degrees of freedom, while the complex unimodular bivector $n$ has 4 degrees of
freedom, so the Lorentz rotor $R$ has 6 degrees of freedom, which is the correct number of degrees of freedom
of the group of Lorentz transformations.
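
The construction of a rotor from a complex angle and a complex unimodular direction can be checked numerically. The following is a minimal sketch, assuming a complex quaternion is stored as a tuple `(w, x, y, z)` of Python complex numbers, with Python's `1j` standing in for the pseudoscalar $I$. The function names and the sample direction are illustrative choices, not taken from the text.

```python
import cmath
import math

def rotor(theta, n):
    """R = cos(theta/2) - n sin(theta/2), equation (14.35), stored as (w, x, y, z)."""
    c = cmath.cos(theta / 2)
    s = cmath.sin(theta / 2)
    return (c, -s * n[0], -s * n[1], -s * n[2])

def modulus_squared(q):
    """q times its quaternionic conjugate: the complex scalar w^2 + x^2 + y^2 + z^2."""
    return sum(c * c for c in q)

# A complex unimodular direction n = n_R + I n_I with n_R.n_R - n_I.n_I = 1 and
# n_R.n_I = 0, as required by (14.37). The value of beta is arbitrary.
beta = 0.4
n = (math.cosh(beta), 1j * math.sinh(beta), 0.0)

theta = 0.7 + 0.3j          # complex angle theta_R + I theta_I
R = rotor(theta, n)

# Real and imaginary parts of cos(theta/2), written out as in (14.40).
half = theta / 2
cos_half = (math.cos(half.real) * math.cosh(half.imag)
            - 1j * math.sin(half.real) * math.sinh(half.imag))
```

With these values `modulus_squared(R)` should equal 1 to rounding error, and `R[0]` should agree with `cos_half`.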

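With the equivalence (14.32), the Lorentz rotor $R$ given by equation (14.35) can be reinterpreted as a
complex quaternion, with $\boldsymbol{\theta}$ the complex quaternion

$$ \boldsymbol{\theta} = n\theta = \imath_a n_a \theta , \tag{14.38} $$

whose complex modulus is $\theta = |\boldsymbol{\theta}| = (\boldsymbol{\theta}\bar{\boldsymbol{\theta}})^{1/2}$ and whose complex unimodular direction is $n = \imath_a n_a$. The
associated reverse rotor $\bar{R}$ is

$$ \bar{R} = e^{\boldsymbol{\theta}/2} = e^{n\theta/2} = \cos\frac{\theta}{2} + n \sin\frac{\theta}{2} , \tag{14.39} $$

the quaternionic conjugate of $R$. Note that $\theta$ and $n$ in equation (14.39) are not conjugated with respect to
the imaginary $I$. The sine and cosine of the complex angle $\theta$ appearing in equations (14.35) and (14.39) are
related to its real and imaginary parts in the usual way,

$$ \cos\frac{\theta}{2} = \cos\frac{\theta_R}{2} \cosh\frac{\theta_I}{2} - I \sin\frac{\theta_R}{2} \sinh\frac{\theta_I}{2} , $$

$$ \sin\frac{\theta}{2} = \sin\frac{\theta_R}{2} \cosh\frac{\theta_I}{2} + I \cos\frac{\theta_R}{2} \sinh\frac{\theta_I}{2} . \tag{14.40} $$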


Figure 14.1 Lorentz boost of a vector a by rapidity $\theta$ in the $\gamma_0$-$\gamma_1$ plane. See Exercise 14.3.


For the case of a pure spatial rotation, the angle $\theta = \theta_R$ and axis $n = n_R$ in the rotor (14.35) are both
real. The rotor corresponding to a pure spatial rotation by angle $\theta_R$ right-handedly about unit real axis
$n_R = I\sigma_a n_{a,R} = \imath_a n_{a,R}$ is the real quaternion

$$ R = e^{-n_R \theta_R/2} = \cos\frac{\theta_R}{2} - n_R \sin\frac{\theta_R}{2} . \tag{14.41} $$


A Lorentz boost is a change of velocity in some direction, without any spatial rotation, and represents
a rotation of spacetime about some time-space plane. For example, a Lorentz boost along the $\gamma_1$-axis (the
$x$-axis) is a rotation of spacetime in the $\gamma_0$-$\gamma_1$ plane (the $t$-$x$ plane). In the case of a pure Lorentz boost,
the angle $\theta = I\theta_I$ is pure imaginary, but the axis $n = n_R$ remains pure real (alternatively, the angle is pure
real and the axis is pure imaginary). The rotor corresponding to a boost by rapidity $\theta_I$, or equivalently by
velocity $v = \tanh\theta_I$, in unit real direction $n_R = I\sigma_a n_{a,R} = \imath_a n_{a,R}$ is the complex quaternion

$$ R = e^{-I n_R \theta_I/2} = \cosh\frac{\theta_I}{2} - I n_R \sinh\frac{\theta_I}{2} . \tag{14.42} $$


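Exercise 14.3. Lorentz boost. A Lorentz boost by rapidity $\theta = \operatorname{atanh} v$ along the $\gamma_1$-axis ($x$-axis) (that
is, a rotation in the $\gamma_0$-$\gamma_1$ plane) is given by the Lorentz rotor

$$ R = e^{\gamma_0 \wedge \gamma_1\, \theta/2} = \cosh\frac{\theta}{2} + \gamma_0 \wedge \gamma_1 \sinh\frac{\theta}{2} . \tag{14.43} $$

Confirm that the Lorentz boost transforms the axes $\gamma_m$ as

$$ R: \ \gamma_0 \to R \gamma_0 \bar{R} = \gamma_0 \cosh\theta + \gamma_1 \sinh\theta , \tag{14.44a} $$
$$ R: \ \gamma_1 \to R \gamma_1 \bar{R} = \gamma_1 \cosh\theta + \gamma_0 \sinh\theta , \tag{14.44b} $$
$$ R: \ \gamma_a \to R \gamma_a \bar{R} = \gamma_a \quad (a \neq 0, 1) . \tag{14.44c} $$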


The boost is illustrated in Figure 14.1.

Exercise 14.4. Factor a Lorentz rotor into a boost and a rotation. Factor a general Lorentz rotor
$R = e^{-n\theta/2}$ into the product $LU$ of a pure spatial rotation $U$ followed by a pure Lorentz boost $L$. Do the
two factors commute?

Solution. Expand the rotor $R$ as

$$ R = p + Iq \tag{14.45} $$

where $p$ and $q$ are real quaternions. Then $R$ can be expressed as the composition of a pure spatial rotation
$U$ followed by a pure Lorentz boost $L$

$$ R = LU \tag{14.46} $$

in which

$$ L = |p| + I \frac{q\bar{p}}{|p|} , \quad U = \frac{p}{|p|} , \tag{14.47} $$

where $|p| = (p\bar{p})^{1/2}$ is the (real) absolute value of the real quaternion $p$. It is straightforward to check that
$U$ and $L$ satisfy the requirements to be pure spatial and boost rotors. The spatial rotor $U$ is by construction
unimodular, $U\bar{U} = 1$, and it follows that the boost rotor $L = R\bar{U}$ is also unimodular, since $R$ is unimodular.
The spatial rotor $U$ is a real quaternion, and therefore satisfies the form (14.41) of a pure spatial rotation.
The real part $|p|$ of the boost rotor $L$ is pure real, while the imaginary part $q\bar{p}/|p|$ is a pure quaternionic
imaginary, since unimodularity $R\bar{R} = 1$ implies that $\mathrm{Im}(R\bar{R}) = q\bar{p} + p\bar{q} = q\bar{p} + \overline{q\bar{p}} = 0$, i.e. the quaternionic
conjugate of $q\bar{p}$ is minus itself. Thus $L$ satisfies the form (14.42) of a pure Lorentz boost.

The factors $U$ and $L$ commute if the boost and rotation axes are in the same direction, but not in general.
The expression for the rotor $R$ as the composition of a Lorentz boost followed by a spatial rotation, the
opposite order to (14.46), is

$$ R = UL' \tag{14.48} $$

where $U$ is the same spatial rotor as before, but the boost rotor $L'$ is

$$ L' = |p| + I \frac{\bar{p}q}{|p|} = \bar{U} L U \tag{14.49} $$

whose real part $|p|$ is the same as for $L$, but whose imaginary part $\bar{p}q/|p|$ differs in direction, though not
magnitude, from that of $L$.
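
The factorization $R = LU$ with $U = p/|p|$ and $L = |p| + I q\bar{p}/|p|$ can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reference implementation: complex quaternions are tuples `(w, x, y, z)` of Python complex numbers with `1j` playing the role of $I$, and the quaternionic imaginaries are assumed to multiply as $\imath_1\imath_2 = -\imath_3$ (and cyclic), which is what the identification of $I\sigma_a$ with $\tfrac12\varepsilon_{abc}\gamma_b\gamma_c$ in (14.103) implies. The names `qmul`, `qbar` and `factor_rotor` are hypothetical.

```python
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 and cyclic."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw - (py * qz - pz * qy),
            pw * qy + py * qw - (pz * qx - px * qz),
            pw * qz + pz * qw - (px * qy - py * qx))

def qbar(q):
    """Quaternionic conjugate (reverse): flip the sign of the three imaginary parts."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def factor_rotor(R):
    """Split R = p + I q into a boost L and a rotation U with R = L U."""
    p = tuple(complex(c).real for c in R)
    q = tuple(complex(c).imag for c in R)
    norm_p = math.sqrt(sum(c * c for c in p))
    U = tuple(c / norm_p for c in p)
    qp = qmul(q, qbar(p))            # pure quaternionic imaginary when R is unimodular
    L = (norm_p + 1j * qp[0] / norm_p,) + tuple(1j * c / norm_p for c in qp[1:])
    return L, U

# Example: a rotation about the 3-axis followed by a boost along the 1-axis.
half_angle, half_rapidity = 0.4, 0.25
U0 = (math.cos(half_angle), 0.0, 0.0, -math.sin(half_angle))
L0 = (math.cosh(half_rapidity), -1j * math.sinh(half_rapidity), 0.0, 0.0)
R = qmul(L0, U0)
L, U = factor_rotor(R)
```

In this example the recovered factors `L` and `U` should reproduce `L0` and `U0`, and `qmul(L, U)` differs from `qmul(U, L)` because the boost and rotation axes are not aligned.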


Exercise 14.5. Topology of the group of Lorentz rotors. Show that the geometry of the group of
Lorentz rotors is the product of the geometries of the spatial rotation group and the boost group, which is
a 3-sphere times Euclidean 3-space, $S^3 \times \mathbb{R}^3$.


14.4 Spatial inversion (P) and Time reversal (T) 


Spatial inversion, or P for parity, is the operation of reflecting a (single) spatial direction, $\gamma_a \to -\gamma_a$. Spatial
inversion leaves the scalar product of orthonormal vectors unchanged. A rotation in N spatial dimensions
can be represented by a matrix in the orthogonal group O(N) of matrices satisfying the condition that
their inverses are their transposes, $O^{-1} = O^{\top}$. Since transposing a matrix leaves its determinant unchanged,
orthogonal matrices have squared determinant equal to 1. The orthogonal group O(N) thus splits into two
disconnected pieces, proper and improper rotations represented by orthogonal matrices of determinant
respectively +1 and -1. The subgroup of proper rotations is designated SO(N), the S signifying
Special, meaning matrices of determinant 1.

Inversion of one spatial direction can be represented by a diagonal orthogonal matrix with one of its 
diagonal elements equal to —1 and the remainder all 1. Thus spatial inversion is a discrete transformation 
of the geometric algebra, which splits the geometric algebra into two disconnected parts that cannot be 
transformed into each other by any continuous rotation. 

Inversion may be accomplished by reflecting through any odd number of spatial axes. In spacetimes with
an odd number of spatial dimensions, as here (where there are 3 spatial dimensions), spatial inversion may
be accomplished by reflecting all spatial vector basis elements $\gamma_a \to -\gamma_a$, while keeping the time vector
basis element $\gamma_0$ unchanged. This results in $\sigma_a \to -\sigma_a$ and $I \to -I$. The equivalence $I\sigma_a \leftrightarrow \imath_a$ means that
the quaternionic imaginaries $\imath_a$ are unchanged. Thus, if multivectors in the spacetime algebra are written as
linear combinations of products of $\gamma_0$, $\imath_a$, and $I$, then spatial inversion P corresponds to the transformation

$$ P: \ \gamma_0 \to \gamma_0 , \quad \imath_a \to \imath_a , \quad I \to -I . \tag{14.50} $$

In other words spatial inversion may be accomplished by the rule, take the complex conjugate (with respect
to $I$) of a multivector.

Time reversal, or T, is the operation of reversing the time direction while keeping all spatial directions un- 
changed. Time reversal, like spatial inversion, leaves the scalar product of orthonormal vectors unchanged. 
Time reversal cannot be accomplished by any continuous Lorentz transformation starting from the unit 
element, nor can it be accomplished by spatial inversion accompanied by any continuous Lorentz transfor- 
mation starting from the unit element. Thus the Lorentz group contains 4 disconnected components that 
cannot be transformed into each other by any continuous Lorentz transformation starting from the unit 
element. The normal and reversed time components of the Lorentz group are sometimes called respectively 
orthochronous and antichronous. 

Time reversal may be accomplished by reflecting the time vector basis element $\gamma_0 \to -\gamma_0$, while keeping
the spatial vector basis elements $\gamma_a$ unchanged. As with spatial inversion, this results in $\sigma_a \to -\sigma_a$ and
$I \to -I$, which keeps $I\sigma_a$ hence $\imath_a$ unchanged. If multivectors in the spacetime algebra are written as linear
combinations of products of $\gamma_0$, $\imath_a$, and $I$, then time inversion T corresponds to the transformation

$$ T: \ \gamma_0 \to -\gamma_0 , \quad \imath_a \to \imath_a , \quad I \to -I . \tag{14.51} $$

For any multivector, time inversion corresponds to the instruction to flip $\gamma_0$ and take the complex conjugate
(with respect to $I$).
The combined operation PT of inverting both space and time directions corresponds to

$$ PT: \ \gamma_0 \to -\gamma_0 , \quad \imath_a \to \imath_a , \quad I \to I . \tag{14.52} $$

For any multivector, spacetime inversion corresponds to the instruction to flip $\gamma_0$, while keeping $\imath_a$ and $I$
unchanged.


14.5 How to implement Lorentz transformations on a computer 


The advantages of quaternions for implementing spatial rotations are well-known to 3D game programmers. 
Compared to standard rotation matrices, quaternions offer increased speed and require less storage, and 
their algebraic properties simplify interpolation and splining. Complex quaternions retain similar advantages 
for implementing Lorentz transformations. They are fast, compact, and straightforward to interpolate or 
spline (Exercises 14.6 and 14.8). Moreover, since complex quaternions contain real quaternions, Lorentz 
transformations can be implemented simply as an extension of spatial rotations in 3D programs that use 
quaternions to implement spatial rotations. 

Lorentz rotors, 4-vectors, spacetime bivectors, and spinors (spin-$\tfrac12$ objects) can all be implemented as
complex quaternions. A complex quaternion

$$ q = w + \imath_1 x + \imath_2 y + \imath_3 z \tag{14.53} $$

with complex coefficients $w$, $x$, $y$, $z$ (so $w = w_R + I w_I$, etc.) can be stored as the 8-component object

$$ q = \begin{Bmatrix} w_R & x_R & y_R & z_R \\ w_I & x_I & y_I & z_I \end{Bmatrix} . \tag{14.54} $$

Actually, OpenGL and other computer software store the scalar ($w$) component of a quaternion in the last
(fourth) place, but here the scalar components are put in the zeroth position to conform to standard physics
convention. The quaternion conjugate $\bar{q}$ of the quaternion (14.54) is

$$ \bar{q} = \begin{Bmatrix} w_R & -x_R & -y_R & -z_R \\ w_I & -x_I & -y_I & -z_I \end{Bmatrix} , \tag{14.55} $$

while its complex conjugate $q^*$ (with respect to $I$) is

$$ q^* = \begin{Bmatrix} w_R & x_R & y_R & z_R \\ -w_I & -x_I & -y_I & -z_I \end{Bmatrix} . \tag{14.56} $$


A Lorentz rotor $R$ corresponds to a complex quaternion of unit modulus. The unimodular condition $R\bar{R} = 1$,
a complex condition, removes 2 degrees of freedom from the 8 degrees of freedom of complex quaternions,
leaving the Lorentz group with 6 degrees of freedom, which is as it should be. Spatial rotations correspond
to real unimodular quaternions, and account for 3 of the 6 degrees of freedom of Lorentz transformations. A
spatial rotation by angle $\theta$ right-handedly about the 1-axis (the $x$-axis) is the real Lorentz rotor

$$ R = e^{-\imath_1 \theta/2} = \cos(\theta/2) - \imath_1 \sin(\theta/2) , \tag{14.57} $$

or, stored as a complex quaternion,

$$ R = \begin{Bmatrix} \cos(\theta/2) & -\sin(\theta/2) & 0 & 0 \\ 0 & 0 & 0 & 0 \end{Bmatrix} . \tag{14.58} $$

Note that this is the physics convention, where a right-handed rotation corresponds to $R = e^{-\imath_a n_a \theta/2}$
and rotations accumulate to the left. The convention in OpenGL and other graphics software is that
$R = e^{\imath_a n_a \theta/2}$ and rotations accumulate to the right. To change to OpenGL convention, omit the minus sign
in equation (14.58).

Lorentz boosts account for the remaining 3 of the 6 degrees of freedom of Lorentz
transformations. A Lorentz boost by velocity $v$, or equivalently by rapidity $\theta = \operatorname{atanh}(v)$, along the 1-axis
(the $x$-axis) is the complex Lorentz rotor

$$ R = e^{-I\imath_1 \theta/2} = \cosh(\theta/2) - I\imath_1 \sinh(\theta/2) , \tag{14.59} $$

or, stored as a complex quaternion,

$$ R = \begin{Bmatrix} \cosh(\theta/2) & 0 & 0 & 0 \\ 0 & -\sinh(\theta/2) & 0 & 0 \end{Bmatrix} . \tag{14.60} $$

Again, this is the physics convention. To change to OpenGL convention, omit the minus sign in equation (14.60).

The rule for composing Lorentz transformations is simple: a Lorentz transformation $R$ followed
by a Lorentz transformation $S$ is just the product $SR$ of the corresponding complex quaternions. This is
the physics convention, where rotations accumulate to the left. In the OpenGL convention, where rotations
accumulate to the right, $R$ followed by $S$ is $RS$.
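
As an illustration of the composition rule, the sketch below stores rotors as tuples `(w, x, y, z)` of Python complex numbers (with `1j` standing in for $I$) and composes two boosts along the same axis. It assumes the physics convention of the text and a quaternion product with $\imath_1\imath_2 = -\imath_3$ (and cyclic), consistent with $\imath_a = I\sigma_a$. The helper names are hypothetical.

```python
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 and cyclic."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw - (py * qz - pz * qy),
            pw * qy + py * qw - (pz * qx - px * qz),
            pw * qz + pz * qw - (px * qy - py * qx))

def qbar(q):
    """Quaternionic conjugate, which is the inverse of a unimodular rotor."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotation(theta, n):
    """Right-handed rotation by angle theta about the real unit axis n, as in (14.57)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (c, -s * n[0], -s * n[1], -s * n[2])

def boost(rapidity, n):
    """Boost by the given rapidity along the real unit direction n, as in (14.59)."""
    c, s = math.cosh(rapidity / 2), math.sinh(rapidity / 2)
    return (c, -1j * s * n[0], -1j * s * n[1], -1j * s * n[2])

x_axis = (1.0, 0.0, 0.0)
v1, v2 = 0.5, 0.6
R = boost(math.atanh(v1), x_axis)
S = boost(math.atanh(v2), x_axis)
SR = qmul(S, R)                       # R followed by S (physics convention)

# Rapidities add for collinear boosts, so the scalar part is cosh of half the total.
v_total = math.tanh(2 * math.acosh(SR[0].real))
```

Here `v_total` should reproduce the relativistic velocity addition $(v_1 + v_2)/(1 + v_1 v_2)$, and `qmul(SR, qbar(SR))` should return the unit quaternion.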

The inverse of a Lorentz rotor $R$ is its quaternionic conjugate $\bar{R}$.

Any even multivector $q$ is equivalent to a complex quaternion by the isomorphism (14.33). According to
the usual transformation law (13.56) for multivectors, the rule for Lorentz transforming an even multivector
$q$ is

$$ R: \ q \to R q \bar{R} \quad \text{(even multivector)} . \tag{14.61} $$

The transformation (14.61) instructs to multiply three complex quaternions $R$, $q$, and $\bar{R}$, a one-line expression
in a C++ program. In OpenGL convention, the transformation rule is $q \to \bar{R} q R$.
As an example of an even multivector, the electromagnetic field $F$ is a bivector in the spacetime algebra,

$$ F = \tfrac{1}{2} F^{mn} \gamma_m \wedge \gamma_n , \tag{14.62} $$

the factor of $\tfrac12$ compensating for the double-counting over indices $m$ and $n$ (the $\tfrac12$ could be omitted if the
counting were over distinct bivector indices only). The imaginary and real parts of $F$ constitute the electric
and magnetic bivectors $E = E_a \imath_a$ and $B = B_a \imath_a$

$$ F = -I(E + IB) . \tag{14.63} $$

Under the parity transformation P (14.50), the electric field $E$ changes sign, whereas the magnetic field $B$
does not, which is as it should be:

$$ P: \ E \to -E , \quad B \to B . \tag{14.64} $$

In view of the isomorphism (14.33), the electromagnetic field bivector $F$ can be written as the complex
quaternion

$$ F = \begin{Bmatrix} 0 & B_1 & B_2 & B_3 \\ 0 & -E_1 & -E_2 & -E_3 \end{Bmatrix} . \tag{14.65} $$

According to the rule (14.61), the electromagnetic field bivector $F$ Lorentz transforms as $F \to R F \bar{R}$, which
is a powerful and elegant way to Lorentz transform the electromagnetic field.
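
The rule $F \to RF\bar{R}$ can be tried out on a simple case. The following sketch is hedged in the usual ways: complex quaternions are tuples `(w, x, y, z)` of Python complex numbers with `1j` in place of $I$, the product assumes $\imath_1\imath_2 = -\imath_3$ (and cyclic), the field is packed with real parts $B_a$ and imaginary parts $-E_a$ as in (14.65), and all function names are hypothetical.

```python
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 and cyclic."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw - (py * qz - pz * qy),
            pw * qy + py * qw - (pz * qx - px * qz),
            pw * qz + pz * qw - (px * qy - py * qx))

def qbar(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def boost_x(rapidity):
    """Boost along the 1-axis, stored as in (14.60)."""
    return (math.cosh(rapidity / 2), -1j * math.sinh(rapidity / 2), 0.0, 0.0)

def pack_field(E, B):
    """F = B - I E as a complex quaternion with zero scalar part."""
    return (0j, B[0] - 1j * E[0], B[1] - 1j * E[1], B[2] - 1j * E[2])

def unpack_field(F):
    E = tuple(-c.imag for c in F[1:])
    B = tuple(c.real for c in F[1:])
    return E, B

def transform_even(R, q):
    """Rule (14.61) for even multivectors: q -> R q Rbar."""
    return qmul(qmul(R, q), qbar(R))

v = 0.6
R = boost_x(math.atanh(v))
E, B = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0)      # pure electric field along the 2-axis
E_new, B_new = unpack_field(transform_even(R, pack_field(E, B)))
```

For $v = 0.6$ the Lorentz factor is 1.25, so under the stated assumptions the boosted field has $E_2 = 1.25$ and $B_3 = 0.75$, while the invariants $E^2 - B^2$ and $E \cdot B$ keep their original values 1 and 0.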

A 4-vector $a = \gamma_m a^m$ is a multivector of grade 1 in the spacetime algebra. A general odd multivector in
the spacetime algebra is the sum of a vector (grade 1) part $a$ and a pseudovector (grade 3) part $Ib = I\gamma_m b^m$.
The odd multivector can be written as the product of the time basis vector $\gamma_0$ and an even multivector $q$

$$ a + Ib = \gamma_0 q = \gamma_0 \left( a^0 + I\imath_a a^a - I b^0 + \imath_a b^a \right) . \tag{14.66} $$

By the isomorphism (14.33), the even multivector $q$ is equivalent to the complex quaternion

$$ q = \begin{Bmatrix} a^0 & b^1 & b^2 & b^3 \\ -b^0 & a^1 & a^2 & a^3 \end{Bmatrix} . \tag{14.67} $$

According to the usual transformation law (13.56) for multivectors, the rule for Lorentz transforming the
odd multivector $\gamma_0 q$ is

$$ R: \ \gamma_0 q \to R \gamma_0 q \bar{R} = \gamma_0 R^* q \bar{R} . \tag{14.68} $$

In the last expression of (14.68), the factor $\gamma_0$ has been brought to the left, to be consistent with the
convention (14.66) that an odd multivector is $\gamma_0$ on the left times an even multivector on the right. Notice
that commuting $\gamma_0$ through $R$ converts the latter to its complex conjugate $R^*$ (with respect to $I$), which is
true because $\gamma_0$ commutes with the quaternionic imaginaries $\imath_a$, but anticommutes with the pseudoscalar $I$.
Thus if the components of an odd multivector are stored as a complex quaternion (14.67), then that complex
quaternion $q$ Lorentz transforms as

$$ R: \ q \to R^* q \bar{R} \quad \text{(odd multivector)} . \tag{14.69} $$

In OpenGL convention, $q \to \bar{R}^* q R$. The rule (14.69) again instructs to multiply three complex quaternions
$R^*$, $q$, and $\bar{R}$, a one-line expression in a C++ program. The transformation rule (14.69) for an odd multivector
encoded as a complex quaternion differs from that (14.61) for an even multivector in that the first factor $R$
is complex conjugated (with respect to $I$).

A vector a differs from a pseudovector Ib in that the vector a changes sign under a parity transforma- 
tion P whereas the pseudovector Ib does not. However, the behaviour of a pseudovector under a normal 
Lorentz transformation (which preserves parity) is identical to that of a vector. Thus in practical situations 
two 4-vectors a and b can be encoded into a single complex quaternion (14.67), and Lorentz transformed 
simultaneously, enabling two transformations to be done for the price of one. 
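
A minimal sketch of the rule $q \to R^* q \bar{R}$, with two 4-vectors packed into one complex quaternion as in (14.67), might look as follows. The assumptions are the same as elsewhere in these sketches: tuples `(w, x, y, z)` of Python complex numbers with `1j` for $I$, a product with $\imath_1\imath_2 = -\imath_3$ (and cyclic), and hypothetical helper names.

```python
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 and cyclic."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw - (py * qz - pz * qy),
            pw * qy + py * qw - (pz * qx - px * qz),
            pw * qz + pz * qw - (px * qy - py * qx))

def qbar(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def qstar(q):
    """Complex conjugate with respect to I."""
    return tuple(complex(c).conjugate() for c in q)

def pack_vectors(a, b=(0, 0, 0, 0)):
    """Store 4-vectors a and b in one complex quaternion, layout (14.67)."""
    return (a[0] - 1j * b[0], b[1] + 1j * a[1], b[2] + 1j * a[2], b[3] + 1j * a[3])

def unpack_vectors(q):
    a = (q[0].real, q[1].imag, q[2].imag, q[3].imag)
    b = (-q[0].imag, q[1].real, q[2].real, q[3].real)
    return a, b

def transform_odd(R, q):
    """Rule (14.69) for odd multivectors: q -> R* q Rbar."""
    return qmul(qmul(qstar(R), q), qbar(R))

v = 0.6
half = math.atanh(v) / 2
R = (math.cosh(half), -1j * math.sinh(half), 0.0, 0.0)   # boost along the 1-axis, (14.60)

a = (1.0, 0.0, 0.0, 0.0)      # unit vector along the time axis
b = (0.0, 0.0, 1.0, 0.0)      # unit vector along the 2-axis
a_new, b_new = unpack_vectors(transform_odd(R, pack_vectors(a, b)))
```

With $v = 0.6$ the first vector should come out as $\{1.25, 0.75, 0, 0\}$, in agreement with (14.44a), and the second vector, being perpendicular to the boost, should be unchanged. Both are transformed by the same three-quaternion product.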

Finally, a Dirac spinor $\psi$ is equivalent to a complex quaternion $q$ (§14.9). It Lorentz transforms as

$$ R: \ q \to R q \quad \text{(spinor)} . \tag{14.70} $$

In OpenGL convention, where rotations accumulate to the right instead of left, $q \to qR$.


Exercise 14.6. Interpolate a Lorentz transformation. Argue that the interpolating Lorentz rotor $R(t)$
that corresponds to uniform rotation and acceleration between initial and final Lorentz rotors $R_0$ and $R_1$ as
the parameter $t$ varies uniformly from 0 to 1 is

$$ R(t) = R_0 \exp\left[ t \ln(R_1/R_0) \right] . \tag{14.71} $$


Exercise 14.7. Exponential and logarithm of a complex quaternion. What are the (1) exponential
and (2) logarithm of a complex quaternion in terms of its components? Address the issue of the multi-valued
character of the logarithm.
Solution.
1. Exponential of a complex quaternion. Decompose the complex quaternion $p$ into the sum of a
complex number $w$ and a complex bivector $n\theta$ of complex modulus $\theta$ and unimodular complex direction
$n$ (satisfying $n\bar{n} = 1$). Then

$$ e^p = e^{w + n\theta} = e^w \left( \cos\theta + n \sin\theta \right) . \tag{14.72} $$

2. Logarithm of a complex quaternion. Essentially, reverse the procedure for exponentiation. Denote
the logarithm of the complex quaternion $q$ by $\ln q = p = w + n\theta$. The non-quaternionic part of the
logarithm is the complex number $w$ given by the (complex) logarithm of the (complex) modulus of $q$,

$$ w = \tfrac{1}{2} \ln(q\bar{q}) . \tag{14.73} $$

The complex quaternion $q$ scaled to unit modulus is then

$$ \frac{q}{(q\bar{q})^{1/2}} = \cos\theta + n \sin\theta , \tag{14.74} $$

whose non-quaternionic part $\cos\theta$ defines the (complex) angle $\theta$, and whose quaternionic part $n\sin\theta$,
when divided by $\sin\theta$, yields the unimodular complex quaternion $n$. The complex logarithm $w$ is as usual
ambiguous by additive multiples of $2\pi I$, while the complex argument $\theta$ of the cos and sin is ambiguous
by additive multiples of $2\pi$. But in addition there is (a) an ambiguity of a choice of sign between $n$
and $\sin\theta$, and (b) an ambiguity of a choice of sign between $e^w$ and the sign of $\cos\theta + n\sin\theta$. The first
ambiguity may be resolved by fixing the real part of $\theta$ to lie in the interval $[0, \pi)$. The second ambiguity
may be resolved by fixing the real part of $e^w$ to be positive, achieved by setting the imaginary part of
$w$ to lie in the interval $(-\pi/2, \pi/2]$. For rotors, which are unimodular by definition, $e^w = 1$ and $w = 0$.
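
The exponential and logarithm above, together with the interpolation formula (14.71), can be sketched as follows. This is an illustration under assumptions rather than a robust implementation: complex quaternions are tuples `(w, x, y, z)` of Python complex numbers, the product assumes $\imath_1\imath_2 = -\imath_3$ (and cyclic), the quotient $R_1/R_0$ is read as $\bar{R}_0 R_1$ so that $R(1) = R_1$, branch choices are those of Python's `cmath`, and no care is taken near $\sin\theta = 0$ with $\theta \neq 0$. Function names are hypothetical.

```python
import cmath
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 and cyclic."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw - (py * qz - pz * qy),
            pw * qy + py * qw - (pz * qx - px * qz),
            pw * qz + pz * qw - (px * qy - py * qx))

def qbar(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def qexp(p):
    """e^p = e^w (cos theta + n sin theta), equation (14.72)."""
    w, v = p[0], p[1:]
    theta = cmath.sqrt(sum(c * c for c in v))
    sinc = cmath.sin(theta) / theta if abs(theta) > 1e-12 else 1.0
    scale = cmath.exp(w)
    return (scale * cmath.cos(theta),) + tuple(scale * sinc * c for c in v)

def qlog(q):
    """ln q = w + n theta, following (14.73) and (14.74)."""
    w = 0.5 * cmath.log(sum(c * c for c in q))
    u = tuple(c * cmath.exp(-w) for c in q)      # q scaled to unit modulus
    theta = cmath.acos(u[0])
    s = cmath.sin(theta)
    ratio = theta / s if abs(s) > 1e-12 else 1.0
    return (w,) + tuple(ratio * c for c in u[1:])

def interpolate(R0, R1, t):
    """R(t) = R0 exp[t ln(R0bar R1)], one reading of (14.71)."""
    p = qlog(qmul(qbar(R0), R1))
    return qmul(R0, qexp(tuple(t * c for c in p)))

def boost_x(rapidity):
    return (math.cosh(rapidity / 2), -1j * math.sinh(rapidity / 2), 0j, 0j)

identity = (1.0, 0.0, 0.0, 0.0)
halfway = interpolate(identity, boost_x(0.8), 0.5)
```

Interpolating halfway from the identity to a boost of rapidity 0.8 should give a boost of rapidity 0.4, and `interpolate(R0, R1, 1.0)` should return `R1` for generic rotors.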


Exercise 14.8. Spline a Lorentz transformation. A spline is a polynomial that interpolates between two
points with given values and derivatives at the two points. Confirm that the cubic spline of a real function
$f(x)$ with given initial and final values $f_0$ and $f_1$ and given initial and final derivatives $f_0'$ and $f_1'$ at $x = 0$
and $x = 1$ is

$$ f(x) = f_0 + f_0' x + \left[ 3(f_1 - f_0) - 2f_0' - f_1' \right] x^2 + \left[ 2(f_0 - f_1) + f_0' + f_1' \right] x^3 . \tag{14.75} $$

The case in which the derivatives at the endpoints are set to zero, $f_0' = f_1' = 0$, is called the “natural” spline.
Argue that a Lorentz rotor can be splined by splining the quaternionic components of the logarithm of the
Lorentz rotor.
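
The cubic (14.75) is easy to check numerically. The sketch below evaluates the spline and its derivative for arbitrary sample endpoint data. The function names and sample values are illustrative only. To spline a rotor, the same function would be applied separately to each component of the rotor's logarithm.

```python
def spline_coefficients(f0, f1, d0, d1):
    """Quadratic and cubic coefficients of the spline (14.75)."""
    c2 = 3 * (f1 - f0) - 2 * d0 - d1
    c3 = 2 * (f0 - f1) + d0 + d1
    return c2, c3

def cubic_spline(f0, f1, d0, d1, x):
    """Value of the cubic with f(0)=f0, f(1)=f1, f'(0)=d0, f'(1)=d1."""
    c2, c3 = spline_coefficients(f0, f1, d0, d1)
    return f0 + d0 * x + c2 * x**2 + c3 * x**3

def cubic_spline_derivative(f0, f1, d0, d1, x):
    """Derivative of the same cubic."""
    c2, c3 = spline_coefficients(f0, f1, d0, d1)
    return d0 + 2 * c2 * x + 3 * c3 * x**2

# Arbitrary endpoint data for a spot check.
f0, f1, d0, d1 = 1.0, 3.0, -0.5, 2.0
```

Evaluating at $x = 0$ and $x = 1$ should return the prescribed values and derivatives.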


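Exercise 14.9. The wrong way to implement a Lorentz transformation. The principal purpose of
this exercise is to persuade you that Lorentz transforming a 4-vector by the rule (14.69) is a much better
idea than Lorentz transforming by multiplying by an explicit 4 x 4 matrix. Suppose that the Lorentz rotor
$R$ is the complex quaternion

$$ R = \begin{Bmatrix} w_R & x_R & y_R & z_R \\ w_I & x_I & y_I & z_I \end{Bmatrix} . \tag{14.76} $$

Show that the Lorentz transformation (14.69) transforms the 4-vector $a^m \gamma_m = \{a^0\gamma_0, a^1\gamma_1, a^2\gamma_2, a^3\gamma_3\}$ as
(note that the 4 x 4 rotation matrix is written to the left of the 4-vector in accordance with the physics
convention that rotations accumulate to the left):

$$
R: \ \begin{pmatrix} a^0\gamma_0 \\ a^1\gamma_1 \\ a^2\gamma_2 \\ a^3\gamma_3 \end{pmatrix} \to
\begin{pmatrix}
|w|^2 + |x|^2 + |y|^2 + |z|^2 & 2(-w_R x_I + w_I x_R + y_R z_I - y_I z_R) & 2(-w_R y_I + w_I y_R + z_R x_I - z_I x_R) & 2(-w_R z_I + w_I z_R + x_R y_I - x_I y_R) \\
2(-w_R x_I + w_I x_R - y_R z_I + y_I z_R) & |w|^2 + |x|^2 - |y|^2 - |z|^2 & 2(x_R y_R + x_I y_I - w_R z_R - w_I z_I) & 2(z_R x_R + z_I x_I + w_R y_R + w_I y_I) \\
2(-w_R y_I + w_I y_R - z_R x_I + z_I x_R) & 2(x_R y_R + x_I y_I + w_R z_R + w_I z_I) & |w|^2 - |x|^2 + |y|^2 - |z|^2 & 2(y_R z_R + y_I z_I - w_R x_R - w_I x_I) \\
2(-w_R z_I + w_I z_R - x_R y_I + x_I y_R) & 2(z_R x_R + z_I x_I - w_R y_R - w_I y_I) & 2(y_R z_R + y_I z_I + w_R x_R + w_I x_I) & |w|^2 - |x|^2 - |y|^2 + |z|^2
\end{pmatrix}
\begin{pmatrix} a^0\gamma_0 \\ a^1\gamma_1 \\ a^2\gamma_2 \\ a^3\gamma_3 \end{pmatrix} , \tag{14.77}
$$

where $|\ |$ signifies the absolute value of a complex number, as in $|w|^2 = w_R^2 + w_I^2$. As a simple example, show
that the transformation (14.77) in the case of a Lorentz boost by velocity $v$ along the 1-axis, where the rotor
$R$ takes the form (14.43), is

$$
R: \ \begin{pmatrix} a^0\gamma_0 \\ a^1\gamma_1 \\ a^2\gamma_2 \\ a^3\gamma_3 \end{pmatrix} \to
\begin{pmatrix} \gamma & \gamma v & 0 & 0 \\ \gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a^0\gamma_0 \\ a^1\gamma_1 \\ a^2\gamma_2 \\ a^3\gamma_3 \end{pmatrix} , \tag{14.78}
$$

with $\gamma$ the familiar Lorentz gamma factor

$$ \gamma = \cosh\theta = \frac{1}{\sqrt{1 - v^2}} , \quad \gamma v = \sinh\theta = \frac{v}{\sqrt{1 - v^2}} . \tag{14.79} $$
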
Exercise 14.10. Transform a 4-vector into a desired frame. Find Lorentz boosts that transform
respectively (1) a timelike 4-vector $a^m$ to point along the 0-axis, and (2) a null 4-vector $a^m$ to point along the
0-1 null axis. Find a spatial rotation that transforms (3) a 4-vector $\{a^0, a^a\}$ so that its spatial part points
along the 1-axis, leaving the time component $a^0$ unchanged.

Solution.
1. Lorentz boost of a timelike 4-vector. Let $a = \pm\sqrt{-a_m a^m}$ be the magnitude of the timelike 4-vector
$a^m$, with sign chosen to be that of $a^0$. The Lorentz boost

$$ \{w_R, x_I, y_I, z_I\} = \frac{1}{\sqrt{2a(a^0 + a)}} \{a^0 + a, \ a^1, \ a^2, \ a^3\} \tag{14.80} $$

transforms $a^m$ to $\{a, 0, 0, 0\}$.
2. Lorentz boost of a null 4-vector. Choose $a$ to be a non-zero real number with sign equal to that of
$a^0$. The Lorentz boost

$$ \{w_R, x_I, y_I, z_I\} = \frac{1}{\sqrt{2a(a^0 \pm a^1)}} \{a^0 + a, \ a^1 \mp a, \ a^2, \ a^3\} \tag{14.81} $$

transforms $a^m$ to $\{a, \pm a, 0, 0\}$.
3. Spatial rotation of a 4-vector. Let $a = \sqrt{a_a a^a}$ be the magnitude of the spatial part of the 4-vector
$a^m = \{a^0, a^a\}$. The spatial rotation


1 
{WR, TR, YR, ZR} = aT a (14.82) 


transforms $a^m$ to $\{a^0, a, 0, 0\}$, leaving the time component $a^0$ unchanged.
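
Parts 1 and 2 of this solution can be checked numerically. The sketch below assumes the boost components take the form $\{a^0 + a, a^1, a^2, a^3\}/\sqrt{2a(a^0 + a)}$ for the timelike case and $\{a^0 + a, a^1 \mp a, a^2, a^3\}/\sqrt{2a(a^0 \pm a^1)}$ for the null case, and applies them with the rule (14.69). Complex quaternions are tuples `(w, x, y, z)` of Python complex numbers, the product assumes $\imath_1\imath_2 = -\imath_3$ (and cyclic), and the function names are hypothetical.

```python
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 and cyclic."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw - (py * qz - pz * qy),
            pw * qy + py * qw - (pz * qx - px * qz),
            pw * qz + pz * qw - (px * qy - py * qx))

def transform_vector(R, a):
    """Apply q -> R* q Rbar, rule (14.69), to the 4-vector a = (a0, a1, a2, a3)."""
    q = (complex(a[0]), 1j * a[1], 1j * a[2], 1j * a[3])
    Rstar = tuple(complex(c).conjugate() for c in R)
    Rbar = (R[0], -R[1], -R[2], -R[3])
    out = qmul(qmul(Rstar, q), Rbar)
    return (out[0].real, out[1].imag, out[2].imag, out[3].imag)

def boost_to_rest(a):
    """Boost taking a timelike a to (mag, 0, 0, 0); returns the rotor and mag."""
    a0, a1, a2, a3 = a
    mag = math.copysign(math.sqrt(a0 * a0 - a1 * a1 - a2 * a2 - a3 * a3), a0)
    norm = math.sqrt(2 * mag * (a0 + mag))
    return (complex((a0 + mag) / norm), 1j * a1 / norm, 1j * a2 / norm, 1j * a3 / norm), mag

def boost_null_to_axis(a, mag, sign=1):
    """Boost taking a null a to (mag, sign*mag, 0, 0); mag has the sign of a0."""
    a0, a1, a2, a3 = a
    norm = math.sqrt(2 * mag * (a0 + sign * a1))
    return (complex((a0 + mag) / norm), 1j * (a1 - sign * mag) / norm,
            1j * a2 / norm, 1j * a3 / norm)

timelike = (2.0, 0.5, -0.3, 1.0)
R1, mag = boost_to_rest(timelike)
rest = transform_vector(R1, timelike)

null = (3.0, 1.0, 2.0, 2.0)               # 3^2 = 1^2 + 2^2 + 2^2
R2 = boost_null_to_axis(null, 1.7)
on_axis = transform_vector(R2, null)
```

Under these assumptions `rest` should be $\{a, 0, 0, 0\}$ with $a = \sqrt{2.66}$, and `on_axis` should be $\{1.7, 1.7, 0, 0\}$.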


14.6 Killing vector fields of Minkowski space 


The geometry of Minkowski space is unchanged under two continuous groups of symmetries, the 4-dimensional 
group of translations, and the 6-dimensional group of Lorentz transformations. A symmetry transformation is 
a transformation of the coordinates that, with a suitable choice of coordinates, leaves the metric unchanged. 
Independent of the choice of coordinates, a symmetry transformation is a transformation that leaves the 
proper spacetime distance between any two points unchanged. 

Any infinitesimal symmetry transformation defines a Killing vector $\xi^\mu$, §7.32, which shifts the coordinates
by an infinitesimal amount,

$$ x^\mu \to x^\mu + \epsilon\, \xi^\mu , \tag{14.83} $$

with $\epsilon$ an infinitesimal real number. The infinitesimal transformation defines a flow field, called a Killing
vector field, in the spacetime. The basic Killing vector fields of Minkowski space have been met earlier in 
this book. The Killing field associated with a translation is a set of parallel straight lines (timelike, null, or 
spacelike) in Minkowski space. The Killing field associated with a spatial rotation is a set of nested spacelike 
circles about a spatial axis, Figure 1.13. The Killing field associated with a pure Lorentz boost is a set of 
nested timelike, null, and spacelike hyperbolae, Figure 1.14. 

The most general Killing vector field of Minkowski space is a linear combination of translational and Lorentz
Killing vectors with constant coefficients. The Killing field associated with a pure Lorentz transformation
(no translational component) always has at least one fixed point, the origin, which is unchanged by the
Lorentz transformation. The addition of a translational component corresponds to uniform translational
motion (possibly superluminal) of the origin of the Lorentz transformation. In some cases the composition 
of a translation and a Lorentz transformation simplifies to a Lorentz transformation. For example, a Lorentz 
transformation (either a spatial rotation or a Lorentz boost) in a given 2-dimensional plane, coupled with a 
translation in the same plane, always has a fixed point, and is equivalent to another Lorentz transformation 
in the same plane with origin at the fixed point. 

The remainder of this section considers the Killing field of a pure Lorentz transformation (no translational
component). The Killing vector associated with a Lorentz transformation is its generator, which is a bivector,
or equivalently complex quaternion, $\boldsymbol{\theta} = \boldsymbol{\theta}_R + I\boldsymbol{\theta}_I$. The real part $\boldsymbol{\theta}_R$ of the bivector is the generator of a
spatial rotation, while the imaginary part $\boldsymbol{\theta}_I$ is the generator of a Lorentz boost. The decomposition of the
bivector into real and imaginary parts is analogous to the decomposition of the electromagnetic field into
magnetic and electric parts, equation (14.63). The complex modulus squared $|\boldsymbol{\theta}|^2$ of the bivector,

$$ |\boldsymbol{\theta}|^2 = \boldsymbol{\theta}\bar{\boldsymbol{\theta}} = -\boldsymbol{\theta}^2 = \theta_R^2 - \theta_I^2 - 2I \boldsymbol{\theta}_R \cdot \boldsymbol{\theta}_I , \tag{14.84} $$

is invariant under Lorentz transformations. By a suitable Lorentz transformation, the bivector $\boldsymbol{\theta}$ may be
adjusted arbitrarily, subject only to the condition that its complex modulus is fixed, that is, $\theta_R^2 - \theta_I^2$ and
$\boldsymbol{\theta}_R \cdot \boldsymbol{\theta}_I$ are constant.

If the bivector is non-null, $|\boldsymbol{\theta}| \neq 0$, then by a suitable Lorentz transformation the real (magnetic) $\boldsymbol{\theta}_R$
and imaginary (electric) $\boldsymbol{\theta}_I$ parts can be taken to be parallel, directed along a common unimodular spatial
direction, $n$, say. So transformed, the bivector $\boldsymbol{\theta}$ is the complex quaternion

$$ \boldsymbol{\theta} = n\theta = n(\theta_R + I\theta_I) . \tag{14.85} $$

The bivector (14.85) generates a uniform proper spatial rotation about the $n$ axis, coupled with a uniform
proper acceleration along the $n$ axis. A Killing trajectory $x(\lambda) = x^m(\lambda)\gamma_m$, parametrized by affine parameter
$\lambda$ along the trajectory, is obtained by Lorentz transforming an initial 4-vector $x_0 \equiv x(0)$ by a rotor
$R = e^{-\lambda\boldsymbol{\theta}/2}$, equation (14.35),

$$ x(\lambda) = R x_0 \bar{R} . \tag{14.86} $$


Define Killing coordinates $\alpha$ and $\phi$ by

$$ \alpha \equiv \lambda\theta_I , \quad \phi \equiv \lambda\theta_R . \tag{14.87} $$

If the unimodular direction $n$ is taken to be the $x$-direction, then Minkowski coordinates $x^m = \{t, x, y, z\}$
along a Killing trajectory (14.86) starting at $x_0^m = \{0, l, r, 0\}$ are

$$ \{t, x, y, z\} = \{ l\sinh\alpha , \ l\cosh\alpha , \ r\cos\phi , \ r\sin\phi \} . \tag{14.88} $$

The Killing trajectory (14.88) is arranged, without loss of generality, such that it is initially at rest in the
parallel $x$-direction, and moving with some initial velocity $v_\perp$ in the perpendicular $z$-direction,

$$ v_\perp \equiv \left. \frac{dz}{dt} \right|_{\rm init} = \frac{r\, d\phi}{l\, d\alpha} . \tag{14.89} $$

Figure 14.2 3D spacetime diagram of a sample of (blue) timelike Killing trajectories in Minkowski space. The two
outermost of the trajectories shown lie on the light cylinder, and are lightlike. Motion in the z spatial direction is
suppressed. The trajectories accelerate with uniform proper linear acceleration in the x-direction, and with uniform
rotation in the y-z plane. The trajectories shown are for the case of a Killing vector with equal acceleration and
rotational components, $|\theta_I| = |\theta_R|$ (corresponding to the motion of charges in equal electric and magnetic fields,
$|E| = |B|$, Exercise 14.11). The crossing (purple) lines are spacelike lines of constant affine parameter $\lambda$.
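
The Killing trajectory can be generated directly from the rotor, which gives a numerical check of the coordinates $\{l\sinh\alpha, l\cosh\alpha, r\cos\phi, r\sin\phi\}$ with $\alpha = \lambda\theta_I$ and $\phi = \lambda\theta_R$. The sketch below assumes the generator lies along the 1-axis, so that $R = \cos h - \imath_1 \sin h$ with $h = \lambda(\theta_R + I\theta_I)/2$, and applies the odd-multivector rule (14.69). Complex quaternions are tuples `(w, x, y, z)` of Python complex numbers, the product assumes $\imath_1\imath_2 = -\imath_3$ (and cyclic), and all names and sample values are illustrative.

```python
import cmath
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 and cyclic."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw - (py * qz - pz * qy),
            pw * qy + py * qw - (pz * qx - px * qz),
            pw * qz + pz * qw - (px * qy - py * qx))

def killing_rotor(lam, theta_R, theta_I):
    """R = exp(-lam * i1 * (theta_R + I theta_I) / 2) for a generator along the 1-axis."""
    half = 0.5 * lam * complex(theta_R, theta_I)
    return (cmath.cos(half), -cmath.sin(half), 0j, 0j)

def killing_point(lam, theta_R, theta_I, l, r):
    """Position (t, x, y, z) at parameter lam, starting from x_0 = (0, l, r, 0)."""
    R = killing_rotor(lam, theta_R, theta_I)
    Rstar = tuple(c.conjugate() for c in R)
    Rbar = (R[0], -R[1], -R[2], -R[3])
    q0 = (0j, 1j * l, 1j * r, 0j)              # vector packed as in (14.67)
    q = qmul(qmul(Rstar, q0), Rbar)            # rule (14.69)
    return (q[0].real, q[1].imag, q[2].imag, q[3].imag)

lam, theta_R, theta_I, l, r = 0.7, 1.1, 0.9, 2.0, 1.5
t, x, y, z = killing_point(lam, theta_R, theta_I, l, r)
alpha, phi = lam * theta_I, lam * theta_R
expected = (l * math.sinh(alpha), l * math.cosh(alpha), r * math.cos(phi), r * math.sin(phi))
```

Under these assumptions `(t, x, y, z)` should agree with `expected`, and the interval $-t^2 + x^2 + y^2 + z^2$ should stay equal to $l^2 + r^2$ along the trajectory.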


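A trajectory is timelike provided that

$$ |v_\perp| < 1 . \tag{14.90} $$

Null trajectories, with $|v_\perp| = 1$, define the light cylinder. Killing trajectories outside the light cylinder are
spacelike. The metric with respect to Killing coordinates $\{\alpha, \phi\}$ and comoving coordinates $\{l, r\}$ is

$$ ds^2 = -l^2 d\alpha^2 + dl^2 + dr^2 + r^2 d\phi^2 . \tag{14.91} $$

The proper time along a Killing trajectory $dl = dr = 0$ is

$$ d\tau = \sqrt{l^2 d\alpha^2 - r^2 d\phi^2} = l|\theta_I| \sqrt{1 - v_\perp^2}\, d\lambda = r|\theta_R| \sqrt{v_\perp^{-2} - 1}\, d\lambda . \tag{14.92} $$

The condition that $\lambda$ be an affine parameter, $d\lambda = d\tau/m$, implies that the lengths $l$ and $r$ are related to $\theta_I$
and $\theta_R$ by

$$ l = \frac{m\gamma_\perp}{|\theta_I|} , \quad r = \frac{m\gamma_\perp |v_\perp|}{|\theta_R|} , \tag{14.93} $$

where $\gamma_\perp \equiv 1/\sqrt{1 - v_\perp^2}$ is the Lorentz gamma-factor corresponding to the velocity $v_\perp$. The 4-momentum
$p \equiv dx/d\lambda$ and 4-acceleration $\kappa \equiv dp/d\lambda$ along the Killing trajectory are

$$ p = R p_0 \bar{R} , \quad \kappa = R \kappa_0 \bar{R} , \tag{14.94} $$

with initial values

$$ p_0 \equiv \left. \frac{dx}{d\lambda} \right|_{\lambda=0} = -\tfrac{1}{2} [\boldsymbol{\theta}, x_0] , \quad \kappa_0 \equiv \left. \frac{dp}{d\lambda} \right|_{\lambda=0} = -\tfrac{1}{2} [\boldsymbol{\theta}, p_0] . \tag{14.95} $$

Figure 14.2 illustrates a sample of Killing trajectories for the case of equal boost and rotational components,
$|\theta_I| = |\theta_R|$.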

The above was for the case where the generating bivector $\boldsymbol{\theta}$ of the symmetry transformation was non-null.
Alternatively, the generating bivector may be null, $\boldsymbol{\theta}\bar{\boldsymbol{\theta}} = 0$. In this case the real and imaginary parts of the
bivector are orthogonal, $\boldsymbol{\theta}_R \cdot \boldsymbol{\theta}_I = 0$, and their magnitudes are equal, $|\theta_R| = |\theta_I|$, equation (14.84). A null
bivector is also nilpotent, $\boldsymbol{\theta}^2 = 0$, so the rotor $R$ obtained by exponentiating $\boldsymbol{\theta}$ is linear in $\boldsymbol{\theta}$,

$$ R = e^{-\lambda\boldsymbol{\theta}/2} = 1 - \lambda\boldsymbol{\theta}/2 . \tag{14.96} $$

A Killing trajectory $x$ starting from an initial 4-vector $x_0$ is

$$ x = R x_0 \bar{R} = x_0 - \frac{\lambda}{2} [\boldsymbol{\theta}, x_0] , \tag{14.97} $$

which is a straight line passing through $x_0$. It can be checked that the line may be spacelike or null, but
never timelike. It is not clear whether this is a useful result.


Exercise 14.11. Motion of a charged particle in uniform parallel electric and magnetic fields.
Calculate the trajectory in Minkowski space of a particle of mass $m$ and charge $q$ in an electromagnetic field
where the electric and magnetic fields are uniform and parallel, $E = En$ and $B = Bn$ (Landau and Lifshitz,
1975, §22, Problem 1).

Solution. As long as the electromagnetic field $F = B - IE$, equation (14.63), is non-null, $|F| \neq 0$, the
electric and magnetic fields can be made parallel by a suitable Lorentz transformation. The electric and
magnetic fields are unchanged by a complex (with respect to $I$) Lorentz transformation along the common
direction $n$, that is, by a combination of a spatial rotation about $n$ and a Lorentz boost along $n$. Thus the
symmetry of Minkowski space under Lorentz transformations along $n$ is preserved by the introduction of
uniform electric and magnetic fields along $n$. The trajectories of charged particles are Killing trajectories of
Lorentz transformations along the direction $n$. The equation of motion (4.44),

$$ \frac{dp}{d\lambda} = \frac{q}{2} [F, p] , \tag{14.98} $$

implies that the Killing bivector is $\boldsymbol{\theta} = -qF$, or equivalently

$$ \theta_I = qE , \quad \theta_R = -qB . \tag{14.99} $$


14.7 Dirac matrices 


The multiplication rules (14.1) for the basis vectors $\gamma_m$ of the spacetime algebra are identical to the
rules (14.5) governing the Clifford algebra of the Dirac $\gamma$-matrices used in the Dirac theory of relativistic
spin-$\tfrac12$ particles.

The Dirac $\gamma$-matrices are conventionally represented by 4 x 4 complex (with respect to the quantum-mechanical
imaginary $i$) unitary matrices. The matrices act on 4-component complex (with respect to $i$)
Dirac spinors, which are spin-$\tfrac12$ particles, §14.8. Four complex components are precisely what is needed to
represent a complex quaternion, or Dirac spinor, §14.9.

An essential feature of a successful theory of relativistic spinors is the existence of an inner product
of spinors, necessary to allow a scalar probability to be defined. The systematic construction of a scalar
product of spinors is deferred to Chapter 39, §39.5. Meanwhile, in the traditional Dirac approach, a spinor
$\psi$ is represented as a column vector with 4 complex (with respect to $i$) components, while its Hermitian
conjugate $\psi^\dagger$ is a row vector with 4 complex components that are the complex conjugates (with respect to $i$)
of the components of $\psi$. The product $\psi^\dagger\psi$ is a real number, but is not satisfactory as a scalar product since it
is not Lorentz invariant. Rather, $\psi^\dagger\psi$ proves to be the time component $n^0$ of a 4-vector number current $n^k$,
which the Dirac equation then shows to be covariantly conserved, $D_k n^k = 0$, equation (41.20). The number
current $n^k$ is interpreted as a conserved probability current. The requirement that the Dirac number current
density $n^0$ be positive imposes the condition, equation (39.101), that taking the Hermitian conjugate of any
of the basis vectors $\gamma_m$ be equivalent to raising its index,

$$ \gamma_m^\dagger = \gamma^m . \tag{14.100} $$

Condition (14.100) is the same as requiring that the basis vectors be unitary matrices, $\gamma_m^{-1} = \gamma_m^\dagger$.

The high-energy physics community commonly adopts the $+---$ metric signature, which is opposite to
the convention adopted here. With the high-energy $+---$ signature, the traditional Dirac representation
of unitary $\gamma$-matrices satisfying the scalar product condition (14.5) is


$$ \gamma_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \quad
   \gamma_a = \begin{pmatrix} 0 & \sigma_a \\ -\sigma_a & 0 \end{pmatrix} , \tag{14.101} $$


where 1 denotes the unit $2 \times 2$ matrix, and $\sigma_a$ denote the three $2 \times 2$ Pauli matrices (13.112). The choice of
$\gamma_0$ as a diagonal matrix is motivated by Dirac's discovery that eigenvectors of the time basis vector $\gamma_0$ with
eigenvalues of opposite sign define particles and antiparticles in their rest frames (see §14.8).

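With the $-+++$ metric signature adopted here, the Dirac representation of the $\gamma$-matrices can be
taken to be


$$ \gamma_0 = i \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \quad
   \gamma_a = \begin{pmatrix} 0 & \sigma_a \\ \sigma_a & 0 \end{pmatrix} . \tag{14.102} $$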


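The representation (14.102) has the advantage that the resulting chiral basis vectors are all real,
equations (39.15). In the representation (14.102), the bivectors $\sigma_a$ and $I\sigma_a$, and the pseudoscalar $I$ of the
spacetime algebra are


$$ \gamma_0 \gamma_a = \sigma_a = i \begin{pmatrix} 0 & \sigma_a \\ -\sigma_a & 0 \end{pmatrix} , \quad
   \tfrac{1}{2} \varepsilon_{abc} \gamma_b \gamma_c = I\sigma_a = i \begin{pmatrix} \sigma_a & 0 \\ 0 & \sigma_a \end{pmatrix} , \quad
   I = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} . \tag{14.103} $$

The Hermitian conjugates of the bivector and pseudoscalar basis elements are

$$ \sigma_a^\dagger = \sigma_a , \quad (I\sigma_a)^\dagger = -I\sigma_a , \quad I^\dagger = -I . \tag{14.104} $$

The conventional chiral matrix $\gamma_5$ of Dirac theory is defined to be $-i$ times the pseudoscalar,

$$ \gamma_5 \equiv -i \gamma_0 \gamma_1 \gamma_2 \gamma_3 = -iI = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} . \tag{14.105} $$

The chiral matrix $\gamma_5$ is Hermitian ($\gamma_5^\dagger = \gamma_5$) and unitary ($\gamma_5^{-1} = \gamma_5^\dagger$), so its square is the unit matrix,

$$ \gamma_5^2 = 1 . \tag{14.106} $$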
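
The matrix statements of this section can be confirmed numerically. The following is a minimal sketch, assuming Python with NumPy; the helper name `dirac_gammas` and the variable names are illustrative choices, not notation from the text. It builds the representation (14.102) and checks the scalar product condition, condition (14.100), and the properties of $I$ and $\gamma_5$.

```python
import numpy as np

# 2x2 building blocks: unit matrix, zero matrix, and the Pauli matrices.
one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def dirac_gammas():
    """Return [gamma_0, gamma_1, gamma_2, gamma_3] in the representation (14.102)."""
    gamma0 = 1j * np.block([[one, zero], [zero, -one]])
    return [gamma0] + [np.block([[zero, s], [s, zero]]) for s in pauli]

gamma = dirac_gammas()
eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # metric with signature -+++
unit = np.eye(4)

# Scalar product condition: (gamma_m gamma_n + gamma_n gamma_m)/2 = eta_mn.
clifford_ok = all(
    np.allclose((gamma[m] @ gamma[n] + gamma[n] @ gamma[m]) / 2, eta[m, n] * unit)
    for m in range(4) for n in range(4)
)

# Condition (14.100): Hermitian conjugation is the same as raising the index.
raised = [eta[m, m] * gamma[m] for m in range(4)]   # gamma^m for a diagonal metric
hermitian_ok = all(np.allclose(gamma[m].conj().T, raised[m]) for m in range(4))

# Pseudoscalar I and chiral matrix gamma_5 = -i I, equation (14.105).
I = gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
gamma5 = -1j * I

print(clifford_ok, hermitian_ok)
print(np.allclose(gamma5 @ gamma5, unit))
```

The check of (14.100) uses the fact that for a diagonal metric raising an index only multiplies by $\eta^{mm} = \pm 1$.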


14.8 Dirac spinors 


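A Dirac spinor is a spin-$\tfrac{1}{2}$ particle in Dirac's theory of relativistic spin-$\tfrac{1}{2}$ particles. A Dirac spinor $\psi$ is a
complex (with respect to the quantum-mechanical imaginary $i$) linear combination of 4 basis spinors $\epsilon_a$ with
indices $a$ running over $\{ \Uparrow\uparrow, \Uparrow\downarrow, \Downarrow\uparrow, \Downarrow\downarrow \}$, a total of 8 degrees of freedom in all,


$$ \psi = \psi^a \epsilon_a . \tag{14.107} $$


The basis spinors $\epsilon_a$ are basis elements of a super spacetime algebra, Chapter 39. In the Dirac
representation (14.102), the four basis spinors are the column spinors


$$ \epsilon_{\Uparrow\uparrow} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad
   \epsilon_{\Uparrow\downarrow} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad
   \epsilon_{\Downarrow\uparrow} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad
   \epsilon_{\Downarrow\downarrow} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} . \tag{14.108} $$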


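The Dirac $\gamma$-matrices operate by pre-multiplication on Dirac spinors $\psi$, yielding other Dirac spinors. The
basis spinors are eigenvectors of the time basis vector $\gamma_0$ and of the bivector $I\sigma_3$, with $\epsilon_\Uparrow$ and $\epsilon_\Downarrow$ denoting
eigenvectors of $\gamma_0$, and $\epsilon_\uparrow$ and $\epsilon_\downarrow$ eigenvectors of $I\sigma_3$,


$$ \gamma_0 \epsilon_\Uparrow = i \epsilon_\Uparrow , \quad
   \gamma_0 \epsilon_\Downarrow = -i \epsilon_\Downarrow , \quad
   I\sigma_3 \epsilon_\uparrow = i \epsilon_\uparrow , \quad
   I\sigma_3 \epsilon_\downarrow = -i \epsilon_\downarrow . \tag{14.109} $$


The bivector $I\sigma_3$ is the generator of a spatial rotation about the 3-axis ($z$-axis), equation (14.32).
Simultaneous eigenvectors of $\gamma_0$ and $I\sigma_3$ exist because $\gamma_0$ and $I\sigma_3$ commute.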
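
The eigenvalue statements (14.109) can be checked directly. The sketch below assumes Python with NumPy and writes $\gamma_0$, $\gamma_3$ and $I$ out from equations (14.102) and (14.103); the variable names are illustrative only.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)   # third Pauli matrix

# gamma_0 and gamma_3 in the representation (14.102).
gamma0 = 1j * np.block([[one, zero], [zero, -one]])
gamma3 = np.block([[zero, sigma3], [sigma3, zero]])
# Pseudoscalar I, copied from equation (14.103).
I = np.block([[zero, -one], [one, zero]])

# Spatial bivector I sigma_3, where sigma_3 = gamma_0 gamma_3 is the boost bivector.
I_sigma3 = I @ gamma0 @ gamma3

# Basis spinors in the order (time-up spin-up, time-up spin-down,
# time-down spin-up, time-down spin-down), equation (14.108).
basis = np.eye(4, dtype=complex)

# Both operators are diagonal here, so the diagonal entries are the eigenvalues.
time_eigs = [basis[a] @ gamma0 @ basis[a] for a in range(4)]
spin_eigs = [basis[a] @ I_sigma3 @ basis[a] for a in range(4)]
commutator = gamma0 @ I_sigma3 - I_sigma3 @ gamma0

print(time_eigs)
print(spin_eigs)
print(np.allclose(commutator, 0))
```

The first two basis spinors carry eigenvalue $+i$ of $\gamma_0$ (time-up) and the last two $-i$ (time-down), while the eigenvalue of $I\sigma_3$ alternates with the spin label.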

A pure spin-up Dirac spinor $\epsilon_\uparrow$ can be rotated into a pure spin-down spinor $\epsilon_\downarrow$, or vice versa, by a spatial
rotation about the 1-axis or 2-axis. By contrast, a pure time-up spinor $\epsilon_\Uparrow$ cannot be rotated into a pure
time-down spinor $\epsilon_\Downarrow$, or vice versa, by any Lorentz transformation. Consider for example trying to rotate
the pure time-up spin-up $\epsilon_{\Uparrow\uparrow}$ spinor into any combination of pure time-down $\epsilon_\Downarrow$ spinors. According to the
expression (14.122), the Dirac spinor $\psi$ obtained by Lorentz transforming the $\epsilon_{\Uparrow\uparrow}$ spinor is pure $\epsilon_\Downarrow$ only if the
corresponding complex quaternion $q$ is pure imaginary (with respect to $I$). But a pure imaginary quaternion
has negative squared modulus $\bar{q} q$, so cannot be equivalent to any unimodular rotor.

Thus the pure time-up and pure time-down spinors $\epsilon_\Uparrow$ and $\epsilon_\Downarrow$ are distinct spinors that cannot be
transformed into each other by any Lorentz transformation. The two spinors represent distinct species, particles
and antiparticles (see §14.10).

Although a pure time-up spinor cannot be transformed into a pure time-down spinor or vice versa by any
Lorentz transformation, the time-up and time-down spinors $\epsilon_\Uparrow$ and $\epsilon_\Downarrow$ do mix under Lorentz transformations.
The manner in which Dirac spinors transform is described in §14.9.

The choice of time-axis $\gamma_0$ and spin-axis $\gamma_3$ with respect to which the eigenvectors are defined can of course
be adjusted arbitrarily by a Lorentz boost and a spatial rotation. The eigenvectors of a particular time-axis
$\gamma_0$ correspond to either particles or antiparticles that are at rest in that frame. The eigenvectors associated
with a particular spin-axis $\gamma_3$ correspond to particles or antiparticles that are either pure spin-up or pure
spin-down in that frame.


14.9 Dirac spinors as complex quaternions 


In §13.15 it was found that a spin-$\tfrac{1}{2}$ object in 3D space, a Pauli spinor, is isomorphic to a real quaternion, or
equivalently scaled 3D rotor, equation (13.122). In the relativistic theory, the corresponding spin-$\tfrac{1}{2}$ object,
a Dirac spinor $\psi$, is isomorphic (14.113) to a complex quaternion. The 4 complex degrees of freedom of the
Dirac spinor $\psi$ are equivalent to the 8 degrees of freedom of a complex quaternion. A physically interesting
complication arises in the relativistic case because a non-trivial Dirac spinor can be null, isomorphic to a
null complex quaternion, whereas any non-trivial Pauli spinor is necessarily non-null. The cases of non-null
(massive) and null (massless) Dirac spinors are considered respectively in §14.10 and §14.11. If the Dirac
spinor is non-null, then it is equivalent to a scaled rotor, equation (14.140), but if the Dirac spinor is null,
then it is not simply a scaled rotor. The present section establishes an isomorphism (14.113) between Dirac
spinors and complex quaternions that is valid in general, regardless of whether the Dirac spinor is null or
not.

If $a$ is a spacetime multivector, equivalent to an element of the Clifford algebra of Dirac $\gamma$-matrices, then
under rotation by Lorentz rotor $R$, the multivector $a$ operating on the Dirac spinor $\psi$ transforms as


$$ R : \ a \psi \rightarrow (R a \bar{R})(R \psi) = R a \psi . \tag{14.110} $$

This shows that a Dirac spinor $\psi$ Lorentz transforms, by construction, as

$$ R : \ \psi \rightarrow R \psi . \tag{14.111} $$


The rule (14.111) is precisely the transformation rule (13.75) for spacetime rotors under Lorentz
transformations: under a rotation by rotor $R$, a rotor $S$ transforms as $S \rightarrow RS$. More generally, the transformation
law (14.111) holds for any linear combination of Dirac spinors $\psi$. The isomorphism (14.34) between spacetime
rotors and unimodular quaternions, coupled with linearity, shows that the vector space of Dirac spinors is
isomorphic to the vector space of complex quaternions. Specifically, any Dirac spinor $\psi$ can be expressed
uniquely in the form of a $4 \times 4$ matrix $\mathbf{q}$, the Dirac representation of a complex quaternion $q$, acting on the
time-up spin-up column vector $\epsilon_{\Uparrow\uparrow}$ (the precise translation between Dirac spinors and complex quaternions
is left as Exercises 14.12 and 14.13):


$$ \psi = \mathbf{q} \, \epsilon_{\Uparrow\uparrow} . \tag{14.112} $$


In this section (including the Exercises) the $4 \times 4$ matrix $\mathbf{q}$ is written in boldface to distinguish it from the
quaternion $q$ that it represents; but the distinction is not fundamental. The mapping (14.112) establishes an
isomorphism between the vector spaces of Dirac spinors and quaternions


$$ \psi \leftrightarrow q . \tag{14.113} $$


The isomorphism means that there is a one-to-one correspondence between Dirac spinors $\psi$ and complex
quaternions $q$, and that they transform in the same way under Lorentz transformations.

The isomorphism between the vector spaces of Dirac spinors and complex quaternions does not extend to
multiplication; that is, the product of two Dirac spinors $\psi_1$ and $\psi_2$ equivalent to the complex $4 \times 4$ matrices
$\mathbf{q}_1$ and $\mathbf{q}_2$ does not equal the Dirac spinor equivalent to the product $\mathbf{q}_1 \mathbf{q}_2$. The problem is that the Dirac
representation of a Dirac spinor $\psi$ is a column vector, and two column vectors cannot be multiplied. The
question of how to multiply Dirac spinors is resumed in Chapter 39 on the super spacetime algebra.
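
The correspondence between complex quaternions and Dirac spinors can be made concrete in a few lines. The sketch below assumes Python with NumPy and anticipates the explicit forms worked out in Exercises 14.12, 14.13 and 14.16; the function names are hypothetical, and the quaternionic imaginaries are represented by $i$ times the Pauli matrices, which is how they appear in the representation (14.102).

```python
import numpy as np

one = np.eye(2, dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def real_quaternion_matrix(w, x, y, z):
    """2x2 matrix of a real quaternion with components (w, x, y, z)."""
    return w * one + 1j * (x * pauli[0] + y * pauli[1] + z * pauli[2])

def quaternion_matrix(q_real, q_imag):
    """4x4 matrix of q = q_R + I q_I, in the block form of equation (14.132)."""
    A = real_quaternion_matrix(*q_real)
    B = real_quaternion_matrix(*q_imag)
    return np.block([[A, -B], [B, A]])

def spinor_from_quaternion(q_real, q_imag):
    """psi = q acting on the time-up spin-up basis spinor, i.e. the first column."""
    return quaternion_matrix(q_real, q_imag)[:, 0]

def quaternion_from_spinor(psi):
    """Inverse map, read off from the component pattern of equation (14.117)."""
    q_real = (psi[0].real, psi[1].imag, -psi[1].real, psi[0].imag)
    q_imag = (psi[2].real, psi[3].imag, -psi[3].real, psi[2].imag)
    return q_real, q_imag

# Arbitrary test components (w, x, y, z) of the real and imaginary parts.
q_real, q_imag = (1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)
psi = spinor_from_quaternion(q_real, q_imag)
print(psi)
print(quaternion_from_spinor(psi))
```

The round trip recovers the original eight real components, which illustrates that the map is one-to-one on the vector spaces; it says nothing about products of spinors, in line with the caveat above.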


14.9.1 Reverse Dirac spinor 


An essential feature of any viable theory of spinors is the existence of a scalar product of spinors. The scalar
product must be a complex (with respect to the quantum mechanical imaginary $i$) number that is invariant
under Lorentz transformations. Now the product $\bar{q} q$ of the reverse of a quaternion with itself is a
Lorentz-invariant complex (with respect to $I$) number. This suggests defining a row Dirac spinor $\bar\psi$ reverse to the
column Dirac spinor $\psi$ defined by equation (14.112) by


$$ \bar\psi \equiv \epsilon_{\Uparrow\uparrow}^{\top} \, \bar{\mathbf{q}} , \tag{14.114} $$


where $\bar{\mathbf{q}}$ is the matrix representation of the reverse $\bar{q}$ of the complex quaternion $q$, and $\epsilon_a^{\top}$ denotes the basis of
row Dirac spinors obtained by transposing the basis of column Dirac spinors defined by equations (14.108),


$$ \epsilon_{\Uparrow\uparrow}^{\top} = ( 1 \ 0 \ 0 \ 0 ) , \quad
   \epsilon_{\Uparrow\downarrow}^{\top} = ( 0 \ 1 \ 0 \ 0 ) , \quad
   \epsilon_{\Downarrow\uparrow}^{\top} = ( 0 \ 0 \ 1 \ 0 ) , \quad
   \epsilon_{\Downarrow\downarrow}^{\top} = ( 0 \ 0 \ 0 \ 1 ) . \tag{14.115} $$

The reverse Dirac spinor $\bar\psi$ is also called the Dirac adjoint spinor. It is related to the Hermitian conjugate
Dirac spinor $\psi^\dagger$ by equation (14.130), and is the same as the Dirac row conjugate spinor $\psi\cdot$ discussed in
Chapter 39, equation (39.99).
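
The difference between the reverse spinor and the Hermitian conjugate spinor shows up clearly under a boost. The sketch below assumes Python with NumPy, uses the relation $\bar\psi = -i \psi^\dagger \gamma_0$ quoted in Exercise 14.15, and takes an arbitrary test spinor; the helper names `adjoint` and `boost_along_3` are illustrative only.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)

# gamma_0 and gamma_3 in the representation (14.102).
gamma0 = 1j * np.block([[one, zero], [zero, -one]])
gamma3 = np.block([[zero, sigma3], [sigma3, zero]])

def adjoint(psi):
    """Row spinor reverse to psi, via psi-bar = -i psi^dagger gamma_0."""
    return -1j * psi.conj() @ gamma0

def boost_along_3(theta):
    """Rotor for a boost of rapidity theta along the 3-axis:
    cosh(theta/2) + sinh(theta/2) gamma_0 gamma_3."""
    return np.cosh(theta / 2) * np.eye(4) + np.sinh(theta / 2) * (gamma0 @ gamma3)

# Arbitrary test spinor (not taken from the text).
psi = np.array([1.0, 0.5j, 0.2, 0.0], dtype=complex)
psi_boosted = boost_along_3(0.8) @ psi

scalar_before = adjoint(psi) @ psi                    # psi-bar psi
scalar_after = adjoint(psi_boosted) @ psi_boosted
density_before = psi.conj() @ psi                     # psi^dagger psi
density_after = psi_boosted.conj() @ psi_boosted

print(scalar_before, scalar_after)     # unchanged by the boost
print(density_before, density_after)   # changed by the boost
```

The product $\bar\psi \psi$ is the same before and after the boost, whereas $\psi^\dagger \psi$ grows, as expected for the time component of a 4-vector.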

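As found in equation (14.125a), $\bar\psi \psi$ is a Lorentz-invariant real number. More generally, the product $\bar\chi \psi$
of a row spinor $\bar\chi$ with a column spinor $\psi$ is a Lorentz-invariant complex (with respect to $i$) number, and
therefore provides a viable definition of a scalar product of Dirac spinors. The problem of defining a scalar
product of Dirac spinors is resumed in §39.5.1.


Exercise 14.12. Translate a Dirac spinor into a complex quaternion. Given any Dirac spinor in the
Dirac representation (14.102),


$$ \psi = \begin{pmatrix} \psi^{\Uparrow\uparrow} \\ \psi^{\Uparrow\downarrow} \\ \psi^{\Downarrow\uparrow} \\ \psi^{\Downarrow\downarrow} \end{pmatrix} , \tag{14.116} $$


show that the corresponding complex quaternion $q$, and the equivalent $4 \times 4$ matrix $\mathbf{q}$ such that $\psi = \mathbf{q} \, \epsilon_{\Uparrow\uparrow}$, are
(the complex conjugates $\psi^*$ of the components $\psi$ of the spinor are with respect to the quantum-mechanical
imaginary $i$)

$$ q = \begin{Bmatrix}
   \mathrm{Re}\, \psi^{\Uparrow\uparrow} & \mathrm{Im}\, \psi^{\Uparrow\downarrow} & -\mathrm{Re}\, \psi^{\Uparrow\downarrow} & \mathrm{Im}\, \psi^{\Uparrow\uparrow} \\
   \mathrm{Re}\, \psi^{\Downarrow\uparrow} & \mathrm{Im}\, \psi^{\Downarrow\downarrow} & -\mathrm{Re}\, \psi^{\Downarrow\downarrow} & \mathrm{Im}\, \psi^{\Downarrow\uparrow}
   \end{Bmatrix}
   \leftrightarrow
   \mathbf{q} = \begin{pmatrix}
   \psi^{\Uparrow\uparrow} & -\psi^{\Uparrow\downarrow *} & -\psi^{\Downarrow\uparrow} & \psi^{\Downarrow\downarrow *} \\
   \psi^{\Uparrow\downarrow} & \psi^{\Uparrow\uparrow *} & -\psi^{\Downarrow\downarrow} & -\psi^{\Downarrow\uparrow *} \\
   \psi^{\Downarrow\uparrow} & -\psi^{\Downarrow\downarrow *} & \psi^{\Uparrow\uparrow} & -\psi^{\Uparrow\downarrow *} \\
   \psi^{\Downarrow\downarrow} & \psi^{\Downarrow\uparrow *} & \psi^{\Uparrow\downarrow} & \psi^{\Uparrow\uparrow *}
   \end{pmatrix} . \tag{14.117} $$

Show that the reverse complex quaternion $\bar{q}$ and the equivalent $4 \times 4$ matrix $\bar{\mathbf{q}}$ in the Dirac
representation (14.102), are

$$ \bar{q} = \begin{Bmatrix}
   \mathrm{Re}\, \psi^{\Uparrow\uparrow} & -\mathrm{Im}\, \psi^{\Uparrow\downarrow} & \mathrm{Re}\, \psi^{\Uparrow\downarrow} & -\mathrm{Im}\, \psi^{\Uparrow\uparrow} \\
   \mathrm{Re}\, \psi^{\Downarrow\uparrow} & -\mathrm{Im}\, \psi^{\Downarrow\downarrow} & \mathrm{Re}\, \psi^{\Downarrow\downarrow} & -\mathrm{Im}\, \psi^{\Downarrow\uparrow}
   \end{Bmatrix}
   \leftrightarrow
   \bar{\mathbf{q}} = \begin{pmatrix}
   \psi^{\Uparrow\uparrow *} & \psi^{\Uparrow\downarrow *} & -\psi^{\Downarrow\uparrow *} & -\psi^{\Downarrow\downarrow *} \\
   -\psi^{\Uparrow\downarrow} & \psi^{\Uparrow\uparrow} & \psi^{\Downarrow\downarrow} & -\psi^{\Downarrow\uparrow} \\
   \psi^{\Downarrow\uparrow *} & \psi^{\Downarrow\downarrow *} & \psi^{\Uparrow\uparrow *} & \psi^{\Uparrow\downarrow *} \\
   -\psi^{\Downarrow\downarrow} & \psi^{\Downarrow\uparrow} & -\psi^{\Uparrow\downarrow} & \psi^{\Uparrow\uparrow}
   \end{pmatrix} . \tag{14.118} $$


Conclude that the reverse spinor $\bar\psi$ defined by equation (14.114) is

$$ \bar\psi = \epsilon_{\Uparrow\uparrow}^{\top} \, \bar{\mathbf{q}} = \begin{pmatrix} \psi^{\Uparrow\uparrow *} & \psi^{\Uparrow\downarrow *} & -\psi^{\Downarrow\uparrow *} & -\psi^{\Downarrow\downarrow *} \end{pmatrix} . \tag{14.119} $$


Exercise 14.13. Translate a complex quaternion into a Dirac spinor. Show that the complex
quaternion $q = w + \imath x + \jmath y + k z$ is equivalent in the Dirac representation (14.102) to the $4 \times 4$ matrix $\mathbf{q}$


$$ q = \begin{Bmatrix} w_R & x_R & y_R & z_R \\ w_I & x_I & y_I & z_I \end{Bmatrix}
   \leftrightarrow
   \mathbf{q} = \begin{pmatrix}
   w_R + i z_R & i x_R + y_R & -w_I - i z_I & -i x_I - y_I \\
   i x_R - y_R & w_R - i z_R & -i x_I + y_I & -w_I + i z_I \\
   w_I + i z_I & i x_I + y_I & w_R + i z_R & i x_R + y_R \\
   i x_I - y_I & w_I - i z_I & i x_R - y_R & w_R - i z_R
   \end{pmatrix} . \tag{14.120} $$


Show that the reverse quaternion $\bar{q}$, the complex conjugate (with respect to $I$) quaternion $q^*$, and the reverse
complex conjugate (with respect to $I$) quaternion $\bar{q}^*$ are respectively equivalent to the $4 \times 4$ matrices


$$ \bar{q} \leftrightarrow \bar{\mathbf{q}} = -\gamma_0 \mathbf{q}^\dagger \gamma_0 , \tag{14.121a} $$
$$ q^* \leftrightarrow \mathbf{q}^* = -\gamma_0 \mathbf{q} \gamma_0 , \tag{14.121b} $$
$$ \bar{q}^* \leftrightarrow \mathbf{q}^\dagger = -\gamma_0 \bar{\mathbf{q}} \gamma_0 , \tag{14.121c} $$

where $\gamma_0$ is the Dirac $\gamma$-matrix given by equation (14.102). Conclude that the Dirac spinor $\psi = \mathbf{q} \, \epsilon_{\Uparrow\uparrow}$
corresponding to the complex quaternion $q$ is


$$ \psi = \mathbf{q} \, \epsilon_{\Uparrow\uparrow} = \begin{pmatrix} w_R + i z_R \\ i x_R - y_R \\ w_I + i z_I \\ i x_I - y_I \end{pmatrix} , \tag{14.122} $$

that the reverse spinor $\bar\psi$, equation (14.114), is

$$ \bar\psi = \epsilon_{\Uparrow\uparrow}^{\top} \, \bar{\mathbf{q}} = \begin{pmatrix} w_R - i z_R & -i x_R - y_R & -w_I + i z_I & i x_I + y_I \end{pmatrix} , \tag{14.123} $$

and that the Hermitian conjugate spinor $\psi^\dagger$ is

$$ \psi^\dagger = \epsilon_{\Uparrow\uparrow}^{\top} \, \mathbf{q}^\dagger = \begin{pmatrix} w_R - i z_R & -i x_R - y_R & w_I - i z_I & -i x_I - y_I \end{pmatrix} . \tag{14.124} $$

Hence conclude that $\bar\psi \psi$ and $\psi^\dagger \psi$ are

$$ \bar\psi \psi = \mathrm{Re}(\bar{q} q) = \bar{q}_R q_R - \bar{q}_I q_I , \tag{14.125a} $$
$$ \psi^\dagger \psi = \bar{q}_R q_R + \bar{q}_I q_I , \tag{14.125b} $$

with

$$ \bar{q}_R q_R = w_R^2 + x_R^2 + y_R^2 + z_R^2 , \tag{14.126a} $$
$$ \bar{q}_I q_I = w_I^2 + x_I^2 + y_I^2 + z_I^2 . \tag{14.126b} $$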

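Exercise 14.14. Pseudoscalar times a Dirac spinor. In §14.10 it will be found that multiplying a Dirac
spinor $\psi$ by the pseudoscalar $I$ converts it to an antispinor. In Chapter 39, equation (39.134), it will be seen
that $I\psi$ is the $PT$ conjugate of $\psi$, the spinor obtained by reversing all 4 axes of space and time. Show that
the product $Iq$ of the pseudoscalar $I$ with the complex quaternion $q = w + \imath x + \jmath y + k z$ is equivalent in the
Dirac representation (14.102) to the $4 \times 4$ matrix $I\mathbf{q}$


$$ Iq = \begin{Bmatrix} -w_I & -x_I & -y_I & -z_I \\ w_R & x_R & y_R & z_R \end{Bmatrix}
   \leftrightarrow
   I\mathbf{q} = \begin{pmatrix}
   -w_I - i z_I & -i x_I - y_I & -w_R - i z_R & -i x_R - y_R \\
   -i x_I + y_I & -w_I + i z_I & -i x_R + y_R & -w_R + i z_R \\
   w_R + i z_R & i x_R + y_R & -w_I - i z_I & -i x_I - y_I \\
   i x_R - y_R & w_R - i z_R & -i x_I + y_I & -w_I + i z_I
   \end{pmatrix} . \tag{14.127} $$

Conclude that the Dirac antispinor $I\psi = I\mathbf{q} \, \epsilon_{\Uparrow\uparrow}$ corresponding to the complex quaternion $Iq$ is

$$ I\psi = I\mathbf{q} \, \epsilon_{\Uparrow\uparrow} = \begin{pmatrix} -w_I - i z_I \\ -i x_I + y_I \\ w_R + i z_R \\ i x_R - y_R \end{pmatrix} . \tag{14.128} $$


Conclude that the pseudomagnitude $\bar\psi I \psi$ is


$$ \bar\psi I \psi = -\mathrm{Im}(\bar{q} q) = -2 ( w_R w_I + x_R x_I + y_R y_I + z_R z_I ) . \tag{14.129} $$


Exercise 14.15. Relation between $\bar\psi$ and $\psi^\dagger$. Show that $\bar\psi$ and $\psi^\dagger$ are related by


$$ \bar\psi = -i \psi^\dagger \gamma_0 , \quad \psi^\dagger = i \bar\psi \gamma^0 , \tag{14.130} $$

by showing from equations (14.123) and (14.124) that

$$ \psi^\dagger = -i \bar\psi \gamma_0 . \tag{14.131} $$

The same result follows from equation (14.121b). The Hermitian conjugate matrix is $\mathbf{q}^\dagger = -\gamma_0 \bar{\mathbf{q}} \gamma_0$, and
$\epsilon_{\Uparrow\uparrow}^{\top} \gamma_0 = i \epsilon_{\Uparrow\uparrow}^{\top}$, so $\psi^\dagger = \epsilon_{\Uparrow\uparrow}^{\top} \mathbf{q}^\dagger = -i \epsilon_{\Uparrow\uparrow}^{\top} \bar{\mathbf{q}} \gamma_0$.
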
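Exercise 14.16. Translate a Dirac spinor into a pair of Pauli spinors. Show that in terms of the
real and imaginary (with respect to $I$) parts of the complex quaternion $q$, the equivalent $4 \times 4$ matrix $\mathbf{q}$ is


$$ q = q_R + I q_I \leftrightarrow \mathbf{q} = \begin{pmatrix} \mathbf{q}_R & -\mathbf{q}_I \\ \mathbf{q}_I & \mathbf{q}_R \end{pmatrix} , \tag{14.132} $$

where $\mathbf{q}_R$ and $\mathbf{q}_I$ are the complex $2 \times 2$ special unitary matrices equivalent to the real quaternions $q_R$ and $q_I$,
equation (13.130). Show that the reverse quaternion $\bar{q}$, the complex conjugate (with respect to $I$) quaternion
$q^*$, and the reverse complex conjugate (with respect to $I$) quaternion $\bar{q}^*$ are respectively


$$ \bar{q} \leftrightarrow \bar{\mathbf{q}} = \begin{pmatrix} \mathbf{q}_R^\dagger & -\mathbf{q}_I^\dagger \\ \mathbf{q}_I^\dagger & \mathbf{q}_R^\dagger \end{pmatrix} , \tag{14.133a} $$
$$ q^* \leftrightarrow \mathbf{q}^* = \begin{pmatrix} \mathbf{q}_R & \mathbf{q}_I \\ -\mathbf{q}_I & \mathbf{q}_R \end{pmatrix} , \tag{14.133b} $$
$$ \bar{q}^* \leftrightarrow \mathbf{q}^\dagger = \begin{pmatrix} \mathbf{q}_R^\dagger & \mathbf{q}_I^\dagger \\ -\mathbf{q}_I^\dagger & \mathbf{q}_R^\dagger \end{pmatrix} . \tag{14.133c} $$

Conclude that the Dirac spinor $\psi = \mathbf{q} \, \epsilon_{\Uparrow\uparrow}$ corresponding to the complex quaternion $q$ is

$$ \psi = \mathbf{q} \, \epsilon_{\Uparrow\uparrow} = \begin{pmatrix} \psi_R \\ \psi_I \end{pmatrix} , \tag{14.134} $$


where $\psi_R$ and $\psi_I$ are the Pauli spinors corresponding to the real quaternions $q_R$ and $q_I$, equation (13.131).
Conclude further that the antispinor $I\psi$ is


$$ I\psi = I\mathbf{q} \, \epsilon_{\Uparrow\uparrow} = \begin{pmatrix} -\psi_I \\ \psi_R \end{pmatrix} , \tag{14.135} $$


that the reverse spinor $\bar\psi$, equation (14.114), is

$$ \bar\psi = \epsilon_{\Uparrow\uparrow}^{\top} \, \bar{\mathbf{q}} = \begin{pmatrix} \psi_R^\dagger & -\psi_I^\dagger \end{pmatrix} , \tag{14.136} $$

and that the Hermitian conjugate spinor $\psi^\dagger$ is


$$ \psi^\dagger = \epsilon_{\Uparrow\uparrow}^{\top} \, \mathbf{q}^\dagger = \begin{pmatrix} \psi_R^\dagger & \psi_I^\dagger \end{pmatrix} . \tag{14.137} $$

Hence conclude that $\bar\psi \psi$, $\bar\psi I \psi$, and $\psi^\dagger \psi$ are given by


$$ \bar\psi \psi = \psi_R^\dagger \psi_R - \psi_I^\dagger \psi_I , \tag{14.138a} $$
$$ \bar\psi I \psi = -( \psi_R^\dagger \psi_I + \psi_I^\dagger \psi_R ) , \tag{14.138b} $$
$$ \psi^\dagger \psi = \psi_R^\dagger \psi_R + \psi_I^\dagger \psi_I . \tag{14.138c} $$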


Exercise 14.17. Is the group of Lorentz rotors isomorphic to SU(4)? Previously, Exercise 13.10,
it was found that the group Spin(3) of spatial rotors in 3 dimensions is isomorphic to SU(2). Is the group
Spin(3,1) of Lorentz rotors isomorphic to the group SU(4) of complex $4 \times 4$ unitary matrices with unit
determinant?

Solution. No. The Dirac representation of the group Spin(3,1) of Lorentz rotors shares with SU(4) the
property that its matrices are complex $4 \times 4$ matrices with unit determinant. From the equivalence (14.120),
the determinant of the $4 \times 4$ complex matrix $\mathbf{q}$ equivalent to a complex quaternion $q$ is


$$ \det \mathbf{q} = (\bar{q} q)^* (\bar{q} q) . \tag{14.139} $$


Since a Lorentz rotor is unimodular, with $\bar{q} q = 1$, its Dirac representation has unit determinant. However,
the Dirac representation of a Lorentz rotor is not unitary (its inverse is not its Hermitian conjugate), despite
the fact that all the generators of the group, namely the 6 bivectors $\sigma_a$ and $I\sigma_a$, are unitary. Rather, the
inverse of a rotor $R$ is its reverse $\bar{R}$, related to its Hermitian conjugate by equation (14.121a). The condition
for the matrices of a group to be unitary is that the generators be skew-Hermitian (they equal minus their
Hermitian conjugates). The 3 spatial generators $I\sigma_a$ are indeed skew-Hermitian, but the 3 boost generators
$\sigma_a$ are Hermitian.


14.10 Non-null Dirac spinor 


A non-null, or massive, Dirac spinor $\psi$ is one that is isomorphic (14.113) to a non-null complex quaternion $q$.
A non-null complex quaternion can be factored as a non-zero complex (with respect to $I$) scalar $\lambda = \lambda_R + I \lambda_I$
times a unimodular complex (with respect to $I$) quaternion $R$, a Lorentz rotor. Thus a non-null Dirac spinor
can be expressed as, equation (14.112) (the boldface for $\mathbf{q}$, adopted in §14.9 to distinguish a quaternion $q$
from its matrix representation $\mathbf{q}$, is dropped henceforth, since the distinction is not fundamental),


$$ \psi = q \, \epsilon_{\Uparrow\uparrow} , \quad q = \lambda R . \tag{14.140} $$


The complex scalar $\lambda$ can be taken without loss of generality to lie in the right hemisphere of the complex
plane (positive real part), since a minus sign can be absorbed into a spatial rotation by $2\pi$ of the rotor
$R$. There is no further ambiguity in the decomposition (14.140) into scalar and rotor, because the squared
modulus $\overline{\lambda R} \, \lambda R = \lambda^2$ of the scaled rotor $\lambda R$ is the same for any decomposition (do not confuse reversion
with complex conjugation; the reverse of a scalar is itself, $\bar\lambda = \lambda$; the product $\lambda^2$ is a complex (with respect
to $I$) number).

The fact that a non-null Dirac spinor $\psi$ encodes a Lorentz rotor shows that a non-null Dirac spinor in
some sense “knows” about the Lorentz structure of spacetime. It is profound that the Lorentz structure of
spacetime is built in to a non-null Dirac particle.

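As discussed in §14.8, a pure time-up eigenvector $\epsilon_\Uparrow$ represents a particle in its own rest frame, while a pure
time-down eigenvector $\epsilon_\Downarrow$ represents an antiparticle in its own rest frame. The time-up spin-up eigenvector
$\epsilon_{\Uparrow\uparrow}$ is by definition (14.140) equivalent to the unit scaled rotor, $\lambda R = 1$, so in this case the scalar $\lambda$ is
pure real. Lorentz transforming the eigenvector multiplies it by a rotor, but leaves the scalar $\lambda$ unchanged,
therefore pure real. Conversely, if the time-up spin-up eigenvector $\epsilon_{\Uparrow\uparrow}$ is multiplied by the imaginary $I$, then
according to the expression (14.122) the resulting spinor can be Lorentz transformed into a pure $\epsilon_\Downarrow$ spinor,
corresponding to a pure antiparticle. Thus one may conclude that the real and imaginary parts (with respect
to $I$) of the complex scalar $\lambda = \lambda_R + I \lambda_I$ correspond respectively to particles and antiparticles.

The Lorentz-invariant decomposition of a non-null Dirac spinor $\psi$ into its particle $\psi_\Uparrow$ and antiparticle $\psi_\Downarrow$
parts is accomplished by


$$ \psi = \psi_\Uparrow + I \psi_\Downarrow , \quad
   \psi_\Uparrow \equiv \frac{\mathrm{Re}\, \lambda}{\lambda} \psi , \quad
   \psi_\Downarrow \equiv \frac{\mathrm{Im}\, \lambda}{\lambda} \psi , \quad
   \lambda = \sqrt{ \bar\psi \psi - I \bar\psi I \psi } . \tag{14.141} $$


The decomposition (14.141) is not the same as the decomposition (14.134) of the Dirac spinor into a pair of
Pauli spinors. The decomposition (14.141) into particle and antiparticle parts is Lorentz-invariant, whereas
the Pauli spinors of the decomposition (14.134) mix under Lorentz boosts. The Lorentz-invariant magnitude
$\bar\psi \psi$ of the Dirac spinor, equation (14.125a), is the difference between the probabilities $\lambda_R^2$ of particles and $\lambda_I^2$
of antiparticles,


$$ \bar\psi \psi = \bar\psi_\Uparrow \psi_\Uparrow - \bar\psi_\Downarrow \psi_\Downarrow , \quad
   \bar\psi_\Uparrow \psi_\Uparrow = \lambda_R^2 , \quad
   \bar\psi_\Downarrow \psi_\Downarrow = \lambda_I^2 . \tag{14.142} $$


Thus $\bar\psi \psi$ is positive for particles, negative for antiparticles. The Lorentz-invariant pseudomagnitude $\bar\psi I \psi$,
equation (14.129), is minus twice the product $\lambda_R \lambda_I$ of the amplitudes of particles and antiparticles,


$$ \bar\psi I \psi = - \bar\psi_\Uparrow \psi_\Downarrow - \bar\psi_\Downarrow \psi_\Uparrow , \quad
   \bar\psi_\Uparrow \psi_\Downarrow = \bar\psi_\Downarrow \psi_\Uparrow = \lambda_R \lambda_I . \tag{14.143} $$


The sum of the probabilities $\lambda_R^2$ of particles and $\lambda_I^2$ of antiparticles equals the number density in the rest
frame, which can be written in the manifestly Lorentz-invariant form


$$ \bar\psi_\Uparrow \psi_\Uparrow + \bar\psi_\Downarrow \psi_\Downarrow = \lambda_R^2 + \lambda_I^2
   = \sqrt{ ( \bar\psi \gamma^m \psi ) ( \bar\psi \gamma_m \psi ) } . \tag{14.144} $$


Since $\lambda_R$ and $\lambda_I$ are invariant under Lorentz transformations, all three terms $\bar\psi_\Uparrow \psi_\Uparrow$, $\bar\psi_\Downarrow \psi_\Downarrow$, and $\bar\psi_\Uparrow \psi_\Downarrow = \bar\psi_\Downarrow \psi_\Uparrow$
are Lorentz-invariant scalars.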
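
The decomposition (14.141) can be traced through on an example. The sketch below assumes Python with NumPy, builds a test spinor $\psi = \lambda R \, \epsilon_{\Uparrow\uparrow}$ from an arbitrarily chosen $\lambda = 2 + 0.5 I$ and a boost $R$, and then recovers $\lambda$ and the two parts from $\psi$ alone. A Python complex number with `1j` standing in for $I$ is used for the scalar $\lambda$; all names are illustrative.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)

gamma0 = 1j * np.block([[one, zero], [zero, -one]])   # representation (14.102)
gamma3 = np.block([[zero, sigma3], [sigma3, zero]])
I = np.block([[zero, -one], [one, zero]])             # pseudoscalar, equation (14.103)
unit = np.eye(4)

def adjoint(psi):
    """Row spinor reverse to psi, via psi-bar = -i psi^dagger gamma_0."""
    return -1j * psi.conj() @ gamma0

# Test spinor psi = lambda R eps (values chosen for illustration only).
lam_true = 2.0 + 0.5j                                  # 1j stands in for I
lam_matrix = lam_true.real * unit + lam_true.imag * I
theta = 0.6
R = np.cosh(theta / 2) * unit + np.sinh(theta / 2) * (gamma0 @ gamma3)
eps = np.array([1, 0, 0, 0], dtype=complex)
psi = lam_matrix @ R @ eps

# Lorentz-invariant magnitude and pseudomagnitude.
magnitude = (adjoint(psi) @ psi).real
pseudomagnitude = (adjoint(psi) @ I @ psi).real

# lambda = sqrt(psi-bar psi - I psi-bar I psi); the principal square root
# has positive real part, matching the convention adopted for lambda.
lam = np.sqrt(complex(magnitude, -pseudomagnitude))

# Particle and antiparticle parts, psi_up = (Re lambda / lambda) psi and so on.
lam_inverse = np.linalg.inv(lam.real * unit + lam.imag * I)
psi_up = lam.real * lam_inverse @ psi
psi_down = lam.imag * lam_inverse @ psi

print(lam)
print(np.allclose(psi_up + I @ psi_down, psi))
```

Changing `theta` leaves `magnitude`, `pseudomagnitude` and `lam` unchanged, which is the Lorentz invariance claimed for the decomposition.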


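Concept question 14.18. Is $\bar\psi \psi$ real or complex? If $\psi = \lambda \epsilon_{\Uparrow\uparrow}$ is a Dirac spinor corresponding to a
complex quaternion $\lambda = \lambda_R + I \lambda_I$ with no quaternionic part (so $\bar\lambda = \lambda$), should it not be that


$$ \bar\psi \psi = \epsilon_{\Uparrow\uparrow}^{\top} \bar\lambda \lambda \, \epsilon_{\Uparrow\uparrow} = \epsilon_{\Uparrow\uparrow}^{\top} \lambda^2 \epsilon_{\Uparrow\uparrow} = \lambda^2 , \tag{14.145} $$


which is a complex number, not a real number? Answer. No. Do not confuse the quantum-mechanical
imaginary $i$ with the pseudoscalar $I$. The combination $\epsilon_{\Uparrow\uparrow}^{\top} I \epsilon_{\Uparrow\uparrow}$ does not equal $I \epsilon_{\Uparrow\uparrow}^{\top} \epsilon_{\Uparrow\uparrow}$. The complex (with
respect to $I$) number $\lambda$, and its matrix representation $\boldsymbol{\lambda}$ are, equation (14.120),


$$ \lambda = \lambda_R + I \lambda_I \leftrightarrow \boldsymbol{\lambda} = \lambda_R \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \lambda_I \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} , \tag{14.146} $$

where the 1's in the matrices on the right hand side denote the $2 \times 2$ unit matrix. The product $\bar\psi \psi$ is

$$ \bar\psi \psi = \begin{pmatrix} \lambda_R & 0 & -\lambda_I & 0 \end{pmatrix} \begin{pmatrix} \lambda_R \\ 0 \\ \lambda_I \\ 0 \end{pmatrix} = \lambda_R^2 - \lambda_I^2 , \tag{14.147} $$


in agreement with equation (14.142), not equation (14.145).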


Concept question 14.19. Is $\bar\psi \gamma_m \psi$ a scalar or a 4-vector? Under a Lorentz transformation $\bar\psi \gamma_m \psi$
transforms as


$$ R : \ \bar\psi \gamma_m \psi \rightarrow \bar\psi \bar{R} \, R \gamma_m \bar{R} \, R \psi = \bar\psi \gamma_m \psi , \tag{14.148} $$


which appears to be a scalar. Yet $\bar\psi \gamma_m \psi$ also looks like it transforms as a 4-vector. Which is it? Answer.
The transformations of spinors $\psi = \psi^a \epsilon_a$ and vectors $a = a^m \gamma_m$ considered in this Chapter are active
transformations, §13.9, which rotate the basis spinors $\epsilon_a$ and vectors $\gamma_m$, while keeping coefficients $\psi^a$ and
$a^m$ fixed. Under active transformations the combination $\bar\psi a \psi$ is indeed a scalar, transforming as


$$ R : \ \bar\psi a \psi \rightarrow \bar\psi \bar{R} \, R a \bar{R} \, R \psi = \bar\psi a \psi . \tag{14.149} $$


In fact $\bar\psi a \psi$ is a scalar product by construction, as will be explored in greater depth in a later Chapter, §39.5,
so the fact that it transforms like a scalar should not be a surprise. However, as usual, one is free to make
choices as to whether a transformation is active (bodily rotates an object) or passive (rotates the frame while
leaving the object itself unchanged), §13.9. In most of this book, the convention is that transformations are
passive, meaning that a transformation rotates both the coefficients and basis elements of a spinor $\psi = \psi^a \epsilon_a$
or vector $a = a^m \gamma_m$, while leaving the spinor or vector itself unchanged. With the passive convention, $\bar\psi \gamma_m \psi$
indeed transforms as a covariant vector (while $\bar\psi a \psi = a^m \bar\psi \gamma_m \psi$ transforms as a scalar, the transformation of
the covariant vector $\bar\psi \gamma_m \psi$ cancelling against the transformation of the contravariant vector $a^m$). The advantage
of the passive convention is that the transformation properties of an object are evident from the indices
attached to it. However, the active convention of the present Chapter is needed in order to establish the
fundamentals of how spinors transform.


14.11 Null Dirac Spinor 


A null Dirac spinor is a Dirac spinor $\psi$ constructed from a null complex quaternion $q$ acting on the
rest-frame eigenvector $\epsilon_{\Uparrow\uparrow}$,


$$ \psi = q \, \epsilon_{\Uparrow\uparrow} , \quad \bar{q} q = 0 . \tag{14.150} $$

Physically, a null Dirac spinor represents a spin-$\tfrac{1}{2}$ particle moving at the speed of light. A non-trivial null
spinor must be moving at the speed of light because if it were not, then there would be a rest frame where
the rotor part of the spinor $\psi = \lambda R \, \epsilon_{\Uparrow\uparrow}$ would be unity, $R = 1$, and the spinor, being non-trivial, $\lambda \neq 0$,
would not be null. The null condition (14.150) is a complex constraint, which eliminates 2 of the 8 degrees of
freedom of a complex quaternion, so that a null spinor has 6 degrees of freedom. The null condition $\bar{q} q = 0$
is equivalent to the two conditions


$$ \bar\psi \psi = \bar\psi I \psi = 0 . \tag{14.151} $$


Any non-trivial null complex quaternion $q$ can be written uniquely as the product of a real quaternion $\lambda U$
and a null factor $(1 - In)$ (Exercise 14.1):


$$ q = \lambda U (1 - In) . \tag{14.152} $$


Here $\lambda$ is a positive real scalar, $U$ is a purely spatial (i.e. real, with no $I$ part) rotor, and $n = \imath_a n_a$, $a = 1, 2, 3$, is
a real unimodular vector quaternion, satisfying $n_a n_a = 1$ with real $n_a$. Physically, equation (14.152) contains
the instruction to boost to light speed in the direction $n$, then scale by the real scalar $\lambda$ and rotate spatially
by $U$. The minus sign in front of $In$ comes from the fact that a boost in direction $n$ is described by a
rotor $R = \cosh(\theta/2) - In \sinh(\theta/2)$, equation (14.42), which becomes proportional to $1 - In$ as the boost
tends to infinity, $\theta \rightarrow \infty$. The $1 + 3 + 2 = 6$ degrees of freedom from the real scalar $\lambda$, the spatial rotor $U$,
and the real unimodular vector $n$ in the expression (14.152) are precisely the number needed to specify a
null quaternion. The boost axis $n$ is Lorentz-invariant. For if the boost factor $1 - In$ is Lorentz transformed
by pre-multiplying by any complex quaternion $p + Ir$, then the result


$$ (p + Ir)(1 - In) = (p + rn)(1 - In) \tag{14.153} $$


is the same unchanged boost factor $1 - In$ pre-multiplied by the real quaternion $p + rn$, the latter being a
product of a real scalar and a pure spatial rotation. Equation (14.153) is true because $n^2 = -1$. The null
Dirac spinor $\psi$ corresponding to the null complex quaternion $q$, equation (14.152), is


$$ \psi = q \, \epsilon_{\Uparrow\uparrow} = \lambda U (1 - In) \, \epsilon_{\Uparrow\uparrow} . \tag{14.154} $$


The boost axis $n$ specifies the direction of the boost relative to the spin rest frame, where the spin is pure
up $\uparrow$. Because the boost axis $n$ is Lorentz-invariant, Lorentz transforming a given null Dirac spinor fills out
only 4 of the 6 degrees of freedom of null spinors.


Concept question 14.20. The boost axis of a null spinor is Lorentz-invariant. It may seem
counter-intuitive that the boost axis $n$ of a null spinor is Lorentz-invariant. Should not a spatial rotation rotate the
boost direction, the direction in which the null spinor moves? Answer. The direction $n$ specifies the direction
of the boost axis relative to the spin axis. A Lorentz transformation of a null spinor effectively rotates both
boost and spin directions simultaneously. For example, if the boost and spin axes are parallel in one frame,
then they are parallel in any frame.


Equation (14.153) shows that a Lorentz transformation of the null Dirac spinor $\psi$ (14.154) is equivalent
to a scaling and a spatial rotation of that spinor,


$$ (p + Ir) \psi = (p + Ir) \lambda U (1 - In) \, \epsilon_{\Uparrow\uparrow} = (p + r n') \psi , \tag{14.155} $$


where

$$ n' = U n \bar{U} . \tag{14.156} $$


The real quaternion $p + r n'$ on the right hand side of equation (14.155) is not necessarily unimodular (a
spatial rotor) even if the complex quaternion $p + Ir$ on the left hand side is unimodular (a Lorentz rotor). As
a simple example, a Lorentz boost $e^{-In\theta/2}$ by rapidity $\theta$ along the boost axis $n$, equation (14.42), multiplies
the null spinor $(1 - In) \, \epsilon_{\Uparrow\uparrow}$ by the real scalar $e^{\theta/2}$. Physically, when a null spinor is Lorentz transformed, it
gets blueshifted (multiplied by a real scalar).

The spinor reverse to the spinor (14.154) is


$$ \bar\psi = \epsilon_{\Uparrow\uparrow}^{\top} \, \bar{q} = \epsilon_{\Uparrow\uparrow}^{\top} (1 + In) \lambda \bar{U} . \tag{14.157} $$


The spinor is null, $\bar{q} q = 0$, because the boost factor is null, $(1 + In)(1 - In) = 0$.
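
The null factor and the blueshift under a boost can be verified with explicit matrices. The sketch below assumes Python with NumPy and takes the boost axis along the 3-direction, $n = \imath_3 = I\sigma_3$, with $\lambda = 1$ and $U = 1$; these choices and the variable names are for illustration only.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)

gamma0 = 1j * np.block([[one, zero], [zero, -one]])   # representation (14.102)
gamma3 = np.block([[zero, sigma3], [sigma3, zero]])
I = np.block([[zero, -one], [one, zero]])             # pseudoscalar, equation (14.103)
unit = np.eye(4)

def adjoint(psi):
    """Row spinor reverse to psi, via psi-bar = -i psi^dagger gamma_0."""
    return -1j * psi.conj() @ gamma0

# Boost axis n along the 3-direction: n = I sigma_3, with sigma_3 = gamma_0 gamma_3.
n = I @ gamma0 @ gamma3
In = I @ n

null_factor = unit - In        # 1 - I n
reverse_factor = unit + In     # its reverse, 1 + I n

eps = np.array([1, 0, 0, 0], dtype=complex)
psi = null_factor @ eps        # null spinor with lambda = 1, U = 1

# Boost by rapidity theta along n: exp(-I n theta/2) = cosh(theta/2) - I n sinh(theta/2).
theta = 1.5
boost = np.cosh(theta / 2) * unit - np.sinh(theta / 2) * In
psi_boosted = boost @ psi

print(np.allclose(reverse_factor @ null_factor, 0))          # null factor
print(adjoint(psi) @ psi, adjoint(psi) @ I @ psi)            # both vanish, (14.151)
print(np.allclose(psi_boosted, np.exp(theta / 2) * psi))     # blueshift by exp(theta/2)
```

The closed form of the exponential relies on $(In)^2 = 1$, which follows from $I^2 = -1$ and $n^2 = -1$.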


14.11.1 Weyl spinor 


A Weyl spinor is a null Dirac spinor in the special case where the boost axis $n$ in equation (14.154) aligns
with the spin axis. For a right-handed spinor, the boost and spin axes point in the same direction. For a
left-handed spinor, the boost and spin axes point in opposite directions. If the spin axis is taken along the
positive 3-direction ($z$-axis), as in the Dirac representation (14.102), then for a right-handed spinor, the
boost direction is $n = \imath_3$, while for a left-handed spinor, the boost direction is $n = -\imath_3$.

The bivector $\imath_3$ generates a spatial rotation about the 3-axis, yielding, in the Dirac representation, $i$ when
acting on the spin-up eigenvector, $\imath_3 \epsilon_\uparrow = i \epsilon_\uparrow$, equation (14.109). For right- and left-handed Weyl spinors,
the null boost factor $1 - In$ acting on the rest-frame spinor $\epsilon_{\Uparrow\uparrow}$ becomes


$$ (1 - In) \, \epsilon_{\Uparrow\uparrow} = (1 \mp I \imath_3) \, \epsilon_{\Uparrow\uparrow} = (1 \mp I i) \, \epsilon_{\Uparrow\uparrow} = (1 \pm \gamma_5) \, \epsilon_{\Uparrow\uparrow} , \tag{14.158} $$


where $\gamma_5 \equiv -iI$ is the chiral operator. A general right- or left-handed Weyl spinor may be written uniquely
as the right- or left-handed basis spinor defined by equation (14.158) pre-multiplied by a positive real scalar
$\lambda$ and a purely spatial rotor $U$,


$$ \psi_{\substack{R \\ L}} = \lambda U (1 \pm \gamma_5) \, \epsilon_{\Uparrow\uparrow} . \tag{14.159} $$


A Weyl spinor has definite chirality, positive for a right-handed spinor $\psi_R$, negative for a left-handed spinor
$\psi_L$,


$$ \gamma_5 \psi_{\substack{R \\ L}} = \pm \psi_{\substack{R \\ L}} . \tag{14.160} $$

The complex quaternionic components of the right- or left-handed basis Weyl spinors (14.158) are


$$ (1 \mp I \imath_3) \, \epsilon_{\Uparrow\uparrow} = (1 \pm \sigma_3) \, \epsilon_{\Uparrow\uparrow} \leftrightarrow
   \begin{Bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & \mp 1 \end{Bmatrix} , \tag{14.161} $$


the Dirac representation of the bivector $\sigma_3$ being given by equations (14.103), which translates to a complex
quaternion in accordance with the equivalence (14.117). If the components of the real quaternion in
equation (14.159) are $\lambda U = \{ w, x, y, z \}$, then the complex quaternionic components of the right- or left-handed
Weyl spinor are


$$ \psi_{\substack{R \\ L}} \leftrightarrow
   \begin{Bmatrix} w & x & y & z \\ \pm z & \pm y & \mp x & \mp w \end{Bmatrix} . \tag{14.162} $$


Concept question 14.21. What makes Weyl spinors special? What is special about choosing the 
boost axis n of a null spinor to align with the spin axis? Why not consider null spinors with arbitrary boost 
axis n? Answer. The property that the boost axis aligns with the spin axis is Lorentz invariant. If the boost 
aligns with the spin in one frame, then it does so in any Lorentz-transformed frame. This is the same thing 
as saying that chirality is a Lorentz invariant. In the Standard Model of Physics, §42.1, the fundamental 
fermions are natively massless right- or left-handed Weyl spinors. The fermions acquire their masses through 
interaction with a scalar Higgs field. Right- and left-handed fermions are distinctly different because only 
left-handed fermions (and right-handed antifermions) feel weak interactions. 
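
That chirality survives Lorentz transformations can be checked with explicit matrices. The sketch below assumes Python with NumPy; it forms the right- and left-handed basis spinors $(1 \pm \gamma_5) \epsilon_{\Uparrow\uparrow}$ of equation (14.158), applies an arbitrarily chosen Lorentz rotor (a boost along the 1-axis composed with a rotation in the 1-2 plane), and confirms that the eigenvalue of $\gamma_5$ is unchanged. The rotor parameters and names are illustrative.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# gamma-matrices in the representation (14.102).
gamma = [1j * np.block([[one, zero], [zero, -one]])]
gamma += [np.block([[zero, s], [s, zero]]) for s in pauli]
unit = np.eye(4)

I = gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]   # pseudoscalar
gamma5 = -1j * I                                # chiral matrix, equation (14.105)

eps = np.array([1, 0, 0, 0], dtype=complex)
right = (unit + gamma5) @ eps                   # right-handed basis Weyl spinor
left = (unit - gamma5) @ eps                    # left-handed basis Weyl spinor

# An arbitrary Lorentz rotor: boost along the 1-axis times a rotation in the 1-2 plane.
theta, phi = 0.9, 0.4
boost = np.cosh(theta / 2) * unit + np.sinh(theta / 2) * (gamma[0] @ gamma[1])
rotation = np.cos(phi / 2) * unit + np.sin(phi / 2) * (gamma[1] @ gamma[2])
rotor = boost @ rotation

right_moved = rotor @ right
left_moved = rotor @ left

print(np.allclose(gamma5 @ right_moved, right_moved))    # still chirality +1
print(np.allclose(gamma5 @ left_moved, -left_moved))     # still chirality -1
```

The check works because $\gamma_5$, being proportional to the pseudoscalar, commutes with every bivector and therefore with every rotor.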


The extension of the spacetime algebra to a super spacetime algebra, wherein the spacetime algebra of
multivectors is shown to be isomorphic to the algebra of outer products of spinors, is resumed in Chapter 39.


Geometric Differentiation and Integration


The problem of integrating over a curved hypersurface crops up routinely in general relativity, for example 
in developing the Lagrangian or Hamiltonian mechanics of a field, Chapter 16. The apparatus developed 
by mathematicians to allow integration over curved hypersurfaces is called differential forms, §15.6. The 
geometric algebra provides an elegant way to understand differential forms. 

In standard calculus, integration is inverse to differentiation. In the theory of differential forms, integration
is inverse to something called exterior differentiation, §15.9. The exterior derivative, conventionally written
$\mathrm{d}$ (distinguished here by latin font), is the (coordinate and tetrad) scalar derivative operator


$$ \mathrm{d} \equiv dx^\nu \wedge \frac{\partial}{\partial x^\nu} , \tag{15.1} $$


the wedge $\wedge$ signifying that the derivative is a curl. A more explicit definition of the exterior derivative is
given by equation (15.63). A closely related derivative is the covariant spacetime derivative $D$ defined by


$$ D \equiv e^\nu D_\nu = \gamma^n D_n , \tag{15.2} $$


where $D_\nu$ and $D_n$ are respectively the coordinate- and tetrad-frame covariant derivatives. The exterior
derivative $\mathrm{d}$ is isomorphic to the torsion-free covariant spacetime curl $D \wedge$ (see equation (15.67) for a more
precise statement of the isomorphism),


$$ \mathrm{d} \leftrightarrow D \wedge . \tag{15.3} $$


The first part of this Chapter shows how to take the covariant derivative of a multivector, and defines 
the covariant spacetime derivative D. The second part, starting from §15.6, shows how these ideas relate 
to differential forms and the exterior derivative, and derives the main result of the theory, the generalized 
Stokes’ theorem. 

If torsion is present, then the torsion-full covariant derivative differs from the torsion-free covariant
derivative, equation (2.68). In §15.1 to §15.4, the covariant derivative $D_n$ and the covariant spacetime derivative
$D$ signify either the torsion-full or the torsion-free derivative; all the results hold either way. In the theory of
differential forms, however, starting at §15.6, the covariant spacetime derivative is the torsion-free derivative
$D$ even when torsion is present.

In this Chapter $N$ denotes the dimension of the parent manifold in which the hypersurface of integration
is embedded. In the standard spacetime of general relativity, $N$ equals 4 and the signature is $-+++$, but
all the results extend to manifolds of arbitrary dimension and arbitrary signature.


15.1 Covariant derivative of a multivector 


The geometric algebra suggests an alternative approach to covariant differentiation in general relativity, in 
which the connection is treated as a vector of operators In, the covariant derivative D, being written 


Dn = On +n. (15.4) 


Acting on any object, the connection operator Î, generates a Lorentz transformation. 

In the spacetime algebra, a Lorentz transformation (13.48) by rotor $R$ transforms a multivector $a$ by
$a \rightarrow R a \bar{R}$. The generator of a Lorentz transformation is a bivector. The rotor corresponding to an
infinitesimal Lorentz transformation generated by a bivector $\Gamma$ is $R = e^{\Gamma/2} = 1 + \tfrac{1}{2}\Gamma$.
The resulting infinitesimal Lorentz transformation transforms the multivector $a$ by
$a \rightarrow a + \tfrac{1}{2}[\Gamma, a]$, where $[\Gamma, a] \equiv \Gamma a - a \Gamma$ is the commutator.
It follows that the action of the connection operator $\hat{\Gamma}_n$ on a multivector $a$ must take the form

$$ \hat{\Gamma}_n a = \tfrac{1}{2} [\Gamma_n, a] \tag{15.5} $$

for some set of bivectors $\Gamma_n$. Since rotation does not change the grade of a multivector,
$\tfrac{1}{2}[\Gamma_n, a]$ for each $n$ is a multivector with the same grade as $a$.
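
The grade-preserving property of the commutator with a bivector can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes an orthonormal frame with signature -+++, encodes basis blades as bitmasks, and uses helper names (`blade_mul`, `gp`, `half_commutator`, `generic`) that are purely illustrative.

```python
from itertools import combinations

SIG = (-1, 1, 1, 1)  # assumed orthonormal signature -+++


def blade_mul(a, b, sig=SIG):
    """Geometric product of two basis blades given as bitmasks -> (sign, bitmask)."""
    sign = 1
    t = a >> 1
    while t:  # parity of the swaps needed to bring the basis vectors into order
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for i, s in enumerate(sig):  # a repeated basis vector squares to its metric sign
        if ((a & b) >> i) & 1:
            sign *= s
    return sign, a ^ b


def gp(x, y):
    """Geometric product of multivectors stored as {bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            s, m = blade_mul(a, b)
            out[m] = out.get(m, 0) + s * ca * cb
    return out


def half_commutator(x, y):
    """(1/2)[x, y] = (xy - yx)/2, with vanishing coefficients dropped."""
    xy, yx = gp(x, y), gp(y, x)
    out = {m: (xy.get(m, 0) - yx.get(m, 0)) / 2 for m in set(xy) | set(yx)}
    return {m: c for m, c in out.items() if c != 0}


def grades(x):
    """Set of grades present in a multivector."""
    return {bin(m).count("1") for m in x}


def generic(p, offset):
    """A grade-p multivector with distinct integer coefficients on every basis blade."""
    return {sum(1 << i for i in idx): offset + k
            for k, idx in enumerate(combinations(range(4), p))}


Gamma = generic(2, 1)  # stands in for one of the bivectors Gamma_n
for p in range(5):
    print(p, sorted(grades(half_commutator(Gamma, generic(p, 2)))))
```

For each grade p the printed set of grades is either empty (the commutator vanishes, as it does for scalars and for the pseudoscalar) or contains p alone.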


Concept question 15.1. Commutator versus wedge product of multivectors. Is the commutator
$\tfrac{1}{2}[a, b]$ of two multivectors the same as their wedge product $a \wedge b$? Answer. No. In the first place, the wedge
product anticommutes only if both $a$ and $b$ have odd grade, equation (13.32). In the second place, the
commutator selects all grade components of the geometric product that anticommute, per equation (13.28).
The only case where $a \wedge b = \tfrac{1}{2}[a, b]$ is true is where either $a$ or $b$ is a vector (a multivector of grade 1), and
both $a$ and $b$ are odd.


To establish the relation between the bivectors $\Gamma_n$ and the usual tetrad connections $\Gamma_{kmn}$, consider the
covariant derivative of the vector $a = a^m \gamma_m$:

$$ D_n a = \partial_n a + \tfrac{1}{2}[\Gamma_n, a] = \gamma_m \partial_n a^m + \tfrac{1}{2}[\Gamma_n, \gamma_m] a^m . \tag{15.6} $$

Notice that the directed derivative $\partial_n$ in equation (15.6) is to be interpreted as acting only on the components
$a^m$ of the vector, not on the tetrad $\gamma_m$; rather, the variation of the tetrad under parallel transport is embodied
in the $\tfrac{1}{2}[\Gamma_n, \gamma_m]$ term. The expression (15.6) must agree with the expression (11.35) obtained in the earlier
treatment, namely

$$ D_n a = \gamma_m \partial_n a^m + \Gamma^k{}_{mn} \gamma_k a^m . \tag{15.7} $$

Comparison of equations (15.6) and (15.7) shows that

$$ \tfrac{1}{2}[\Gamma_n, \gamma_m] = \Gamma^k{}_{mn} \gamma_k . \tag{15.8} $$


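The N-tuple (not vector) of bivectors $\Gamma_n$ satisfying equation (15.8) is

$$ \Gamma_n \equiv \tfrac{1}{2} \Gamma_{kln} \gamma^k \wedge \gamma^l \tag{15.9} $$

(the factor of $\tfrac{1}{2}$ would disappear if the implicit summation were over distinct antisymmetric pairs $kl$ of
indices). Equation (15.9) can be proved with the help of the identity

$$ \tfrac{1}{2}[\gamma^k \wedge \gamma^l, \gamma_m] = \delta^l_m \gamma^k - \delta^k_m \gamma^l . \tag{15.10} $$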
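
The identity (15.10) can be verified by brute force over all index triples. The sketch below is illustrative only: it assumes an orthonormal tetrad with signature -+++, so that $\gamma^k = \eta^{kk}\gamma_k$, and the bitmask representation and function names are hypothetical conveniences rather than anything defined in the text.

```python
from itertools import product

SIG = (-1, 1, 1, 1)  # assumed orthonormal signature -+++


def blade_mul(a, b, sig=SIG):
    """Geometric product of two basis blades given as bitmasks -> (sign, bitmask)."""
    sign = 1
    t = a >> 1
    while t:  # parity of the swaps needed to sort the basis vectors
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for i, s in enumerate(sig):  # repeated vectors contract with the metric
        if ((a & b) >> i) & 1:
            sign *= s
    return sign, a ^ b


def gp(x, y):
    """Geometric product of multivectors stored as {bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            s, m = blade_mul(a, b)
            out[m] = out.get(m, 0) + s * ca * cb
    return out


def combine(x, cx, y, cy):
    """Linear combination cx*x + cy*y, with vanishing coefficients dropped."""
    out = {m: cx * x.get(m, 0) + cy * y.get(m, 0) for m in set(x) | set(y)}
    return {m: c for m, c in out.items() if c != 0}


def half_commutator(x, y):
    return combine(gp(x, y), 0.5, gp(y, x), -0.5)


def gamma_lo(m):
    return {1 << m: 1}          # gamma_m


def gamma_up(k):
    return {1 << k: SIG[k]}     # gamma^k = eta^{kk} gamma_k in an orthonormal frame


def identity_holds(k, l, m):
    wedge = half_commutator(gamma_up(k), gamma_up(l))   # gamma^k ^ gamma^l
    lhs = half_commutator(wedge, gamma_lo(m))
    rhs = combine(gamma_up(k), int(l == m), gamma_up(l), -int(k == m))
    return lhs == rhs


print(all(identity_holds(k, l, m) for k, l, m in product(range(4), repeat=3)))
```

The wedge product of two vectors is computed here as half their commutator, which is legitimate for vectors because their geometric product contains only a scalar (symmetric) and a bivector (antisymmetric) part.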


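The same formula (15.5) applies, with the bivector $\Gamma_n$ given by the same equation (15.9), if the vector $a$ is
expressed as a sum $a = a_m \gamma^m$ over its covariant $a_m$ rather than contravariant $a^m$ components. In this case

$$ D_n a = \gamma^m \partial_n a_m + \tfrac{1}{2}[\Gamma_n, \gamma^m] a_m , \tag{15.11} $$

which reproduces the earlier equations (11.40) and (11.41),

$$ D_n a = \gamma^m \partial_n a_m - \Gamma^m{}_{kn} \gamma^k a_m , \tag{15.12} $$

since

$$ \tfrac{1}{2}[\Gamma_n, \gamma^m] = - \Gamma^m{}_{kn} \gamma^k . \tag{15.13} $$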


The same formula (15.5) with the same bivector (15.9) applies to any multivector, which follows because the
connection operator $\hat{\Gamma}_n$ is additive over any product of vectors or multivectors:

$$ \hat{\Gamma}_n ab = \tfrac{1}{2}[\Gamma_n, ab] = \tfrac{1}{2}[\Gamma_n, a] b + \tfrac{1}{2} a [\Gamma_n, b] = (\hat{\Gamma}_n a) b + a (\hat{\Gamma}_n b) . \tag{15.14} $$

To summarize, the covariant derivative of a multivector $a$ can be written

$$ D_n a = \partial_n a + \tfrac{1}{2}[\Gamma_n, a] \tag{15.15} $$

with the N-tuple of bivectors $\Gamma_n$ given by equation (15.9). In equation (15.15), as previously in
equation (15.6), for a multivector $a = \gamma_A a^A$, the directed derivative $\partial_n$ is to be interpreted as acting only on
the components $a^A$ of the multivector, $\partial_n a = \gamma_A \partial_n a^A$. Equation (15.15) is just another way to write the
covariant derivative of a multivector, yielding exactly the same result as the earlier method from §11.9.

The earlier (§11.9) and multivector approaches to covariant differentiation can be combined as needed.
For example, the covariant derivative of a covariant vector $a_m$ of multivectors is

$$ D_n a_m = \partial_n a_m - \Gamma^k{}_{mn} a_k + \tfrac{1}{2}[\Gamma_n, a_m] . \tag{15.16} $$

As always, covariant differentiation is defined so that it commutes with the tetrad basis elements; that is,
covariant derivatives of the tetrad basis elements vanish by construction,

$$ D_n \gamma_m = 0 . \tag{15.17} $$

For example, equation (15.17) is implied by the equality

$$ D_n (\gamma_m a^m) = \gamma_m D_n a^m , \tag{15.18} $$

which is true by construction.
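The covariant derivative of a multivector $a$ can also be expressed as a coordinate derivative

$$ D_\nu a = \frac{\partial a}{\partial x^\nu} + \tfrac{1}{2}[\Gamma_\nu, a] , \tag{15.19} $$

where the coordinate and directed derivatives are related as usual by $\partial/\partial x^\nu = e^n{}_\nu \partial_n$, and where the
connection vector $\Gamma_\nu$ is related to the tetrad connection $\Gamma_n$ defined by equation (15.9) by

$$ \Gamma_\nu = e^n{}_\nu \Gamma_n . \tag{15.20} $$

The components $\Gamma_{kl\nu} = e^n{}_\nu \Gamma_{kln}$ of $\Gamma_\nu = \tfrac{1}{2} \Gamma_{kl\nu} \gamma^k \wedge \gamma^l$ constitute a coordinate-frame vector, but not a
tetrad-frame tensor. The connection $\Gamma_\nu$ is given by equation (15.20), not by a direct relation to the coordinate-frame
connections $\Gamma_{\kappa\lambda\nu}$, that is, $\Gamma_\nu \neq \tfrac{1}{2} \Gamma_{\kappa\lambda\nu} e^\kappa \wedge e^\lambda$.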


15.2 Riemann tensor of bivectors 


As discussed in §2.19.2, the commutator of the covariant derivative defines two fundamental geometric
objects, the torsion tensor $S^m{}_{kl}$ and the Riemann curvature tensor $R_{klmn}$. The commutator can be written

$$ [D_k, D_l] = S^m{}_{kl} D_m + \hat{R}_{kl} , \tag{15.21} $$

where $S^m{}_{kl}$ is the torsion tensor, and the Riemann curvature operator $\hat{R}_{kl}$ is an operator whose action on any
tensor was given previously by equation (2.114). Define the Riemann antisymmetric tensor of bivectors $R_{kl}$
by

$$ R_{kl} \equiv \tfrac{1}{2} R_{klmn} \gamma^m \wedge \gamma^n \tag{15.22} $$

(again, the factor of $\tfrac{1}{2}$ would disappear if the implicit summation were over distinct antisymmetric pairs $mn$
of indices). Acting on any multivector $a$, the Riemann curvature operator yields

$$ \hat{R}_{kl} a = \tfrac{1}{2}[R_{kl}, a] , \tag{15.23} $$

which is an antisymmetric tensor of multivectors of the same grade as $a$. The Riemann tensor of bivectors
$R_{kl}$, equation (15.22), is related to the connection N-tuple of bivectors $\Gamma_n$, equation (15.9), by

$$ R_{kl} = \partial_k \Gamma_l - \partial_l \Gamma_k + \tfrac{1}{2}[\Gamma_k, \Gamma_l] + \left( \Gamma^m{}_{kl} - \Gamma^m{}_{lk} - S^m{}_{kl} \right) \Gamma_m , \tag{15.24} $$

where, in conformity with the convention of equation (15.15), directed derivatives $\partial_k \Gamma_l$ are to be interpreted
as acting only on the components $\Gamma_{mnl}$ of $\Gamma_l = \tfrac{1}{2} \Gamma_{mnl} \gamma^m \wedge \gamma^n$, not on the tetrad axes $\gamma_m$. Equation (15.24)
can be derived either from the tetrad-frame expression (11.60) for the Riemann tensor, or from the
expression (15.15) for the covariant derivative of a multivector.
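
Transforming equation (15.24) into a coordinate frame, $R_{\kappa\lambda} = e^k{}_\kappa e^l{}_\lambda R_{kl}$, and substituting equation (11.75)
gives, with or without torsion, the elegant expression

$$ R_{\kappa\lambda} = \frac{\partial \Gamma_\lambda}{\partial x^\kappa} - \frac{\partial \Gamma_\kappa}{\partial x^\lambda} + \tfrac{1}{2}[\Gamma_\kappa, \Gamma_\lambda] , \tag{15.25} $$

which can also be written as the commutator

$$ \tfrac{1}{2} R_{\kappa\lambda} = \left[ \frac{\partial}{\partial x^\kappa} + \tfrac{1}{2} \Gamma_\kappa , \ \frac{\partial}{\partial x^\lambda} + \tfrac{1}{2} \Gamma_\lambda \right] . \tag{15.26} $$

Equation (15.25) is Cartan’s second equation of structure, explored in depth in §16.14.2. The components
of $R_{\kappa\lambda} = \tfrac{1}{2} R_{\kappa\lambda mn} \gamma^m \wedge \gamma^n$ constitute the Riemann tensor $R_{\kappa\lambda mn}$ in the mixed coordinate-tetrad basis,
equation (11.76).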


15.3 Torsion tensor of vectors 


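Define the torsion antisymmetric tensor of vectors $S_{kl}$ by (the minus sign is chosen so that equation (15.29)
resembles equation (15.25))

$$ S_{kl} \equiv - S^m{}_{kl} \gamma_m . \tag{15.27} $$

In components, the torsion tensor of vectors $S_{\kappa\lambda}$ is, from equation (11.49),

$$ S_{\kappa\lambda} = \left( \frac{\partial e^m{}_\lambda}{\partial x^\kappa} - \frac{\partial e^m{}_\kappa}{\partial x^\lambda} + \Gamma^m{}_{\lambda\kappa} - \Gamma^m{}_{\kappa\lambda} \right) \gamma_m , \tag{15.28} $$

which can be written elegantly

$$ S_{\kappa\lambda} = \frac{\partial e_\lambda}{\partial x^\kappa} - \frac{\partial e_\kappa}{\partial x^\lambda} + \tfrac{1}{2}[\Gamma_\kappa, e_\lambda] - \tfrac{1}{2}[\Gamma_\lambda, e_\kappa] , \tag{15.29} $$

where $e_\kappa = e^k{}_\kappa \gamma_k$ are the usual tangent basis vectors, and again the coordinate derivative $\partial/\partial x^\kappa$ is to be
interpreted as acting only on the components $e^k{}_\lambda$ of $e_\lambda$, not on the tetrad axes $\gamma_k$. Equation (15.29) is
Cartan’s first equation of structure, §16.14.2. Equation (15.29) can also be written in terms of covariant
derivatives

$$ S_{\kappa\lambda} = D_\kappa e_\lambda - D_\lambda e_\kappa . \tag{15.30} $$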


15.4 Covariant spacetime derivative 


The covariant derivative $D_n$, equation (15.4), acts on multivectors, but it does not yield a multivector (it
yields a vector of multivectors). A covariant derivative that does map multivectors to multivectors is the
covariant spacetime derivative $D$ defined by

$$ D \equiv \gamma^n D_n . \tag{15.31} $$

The covariant spacetime derivative $D$ is a sum of a directed derivative $\partial$ and a connection operator $\hat{\Gamma}$,

$$ D = \partial + \hat{\Gamma} , \quad \partial \equiv \gamma^n \partial_n , \quad \hat{\Gamma} \equiv \gamma^n \hat{\Gamma}_n . \tag{15.32} $$

The action of the connection operator $\hat{\Gamma}$ on a multivector $a$ is

$$ \hat{\Gamma} a = \tfrac{1}{2} \gamma^n [\Gamma_n, a] \tag{15.33} $$

(not $\hat{\Gamma} a = \tfrac{1}{2}[\Gamma, a]$). The covariant spacetime derivative of a multivector $a$ is

$$ D a = \gamma^n D_n a = \gamma^n \left( \partial_n a + \tfrac{1}{2}[\Gamma_n, a] \right) . \tag{15.34} $$

The covariant spacetime derivative (15.31) can equally well be written in terms of the coordinate derivatives,

$$ D = e^\nu D_\nu . \tag{15.35} $$

The covariant spacetime derivative (15.34) of a multivector can then also be written

$$ D a = e^\nu D_\nu a = e^\nu \left( \frac{\partial a}{\partial x^\nu} + \tfrac{1}{2}[\Gamma_\nu, a] \right) . \tag{15.36} $$

Acting on a multivector $a$, the covariant spacetime derivative $D$ yields a sum of two multivectors, a
covariant divergence $D \cdot a$ with one grade lower than $a$, and a covariant curl $D \wedge a$ with one grade higher
than $a$,

$$ D a = D \cdot a + D \wedge a . \tag{15.37} $$

In the particular case that $a$ is a scalar (a multivector of grade 0), the covariant divergence (defined to be
one grade lower than $a$) is zero, $D \cdot a = 0$. If torsion vanishes, the curl $D \wedge a$ is essentially the same as the
exterior derivative in the mathematics of differential forms, §15.9.


The covariant spacetime divergence and curl of a grade-p multivector $a = (1/p!) \, \gamma^l \wedge \gamma^m \wedge ... \wedge \gamma^n a_{lm...n}$ are

$$ D \cdot a = \frac{1}{(p-1)!} \, \gamma^m \wedge ... \wedge \gamma^n (D \cdot a)_{m...n} , \quad (D \cdot a)_{m...n} = D^l a_{lm...n} , \tag{15.38a} $$

$$ D \wedge a = \frac{1}{(p+1)!} \, \gamma^k \wedge \gamma^l \wedge \gamma^m \wedge ... \wedge \gamma^n (D \wedge a)_{klm...n} , \quad (D \wedge a)_{klm...n} = (p+1) D_{[k} a_{lm...n]} . \tag{15.38b} $$

The factorial factors could be dropped if the implicit summations were taken over distinct antisymmetric
sequences of indices, but are retained here for explicitness. For example, the components of the covariant
divergences and curls of a scalar $\varphi$, a vector $A = \gamma^n A_n$, and a bivector $F = \tfrac{1}{2} \gamma^m \wedge \gamma^n F_{mn}$, are respectively

$$ D \cdot \varphi = 0 , \quad (D \wedge \varphi)_n = D_n \varphi = \partial_n \varphi , \tag{15.39a} $$

$$ D \cdot A = D^m A_m , \quad (D \wedge A)_{mn} = D_m A_n - D_n A_m , \tag{15.39b} $$

$$ (D \cdot F)_n = D^m F_{mn} , \quad (D \wedge F)_{lmn} = D_l F_{mn} + D_m F_{nl} + D_n F_{lm} . \tag{15.39c} $$

A divergence can be converted to a curl, and vice versa, by post-multiplying by the pseudoscalar $I_N$,

$$ D \wedge (a I_N) = (D \cdot a) I_N , \quad D \cdot (a I_N) = (D \wedge a) I_N , \tag{15.40} $$

which works because the pseudoscalar $I_N$ is covariantly constant, and multiplying by it flips the grade of a
multivector from $p$ to $N-p$.

The curl of the wedge product of a grade-p multivector $a$ with a multivector $b$ satisfies the Leibniz-like
rule

$$ D \wedge (a \wedge b) = (D \wedge a) \wedge b + (-)^p a \wedge (D \wedge b) . \tag{15.41} $$

The square of the covariant spacetime derivative is

$$ D D = D \cdot D + D \wedge D = D_n D^n + \tfrac{1}{2} \gamma^k \wedge \gamma^l [D_k, D_l] , \tag{15.42} $$

which is a sum of the scalar d’Alembertian wave operator $\Box \equiv D_n D^n$, and a bivector operator whose
components constitute the commutator of the covariant derivative, equation (15.21).
For vanishing torsion, the squared spacetime curl of a multivector $a$ vanishes. For example, for a grade 1
multivector $a = a^n \gamma_n$,

$$ D \wedge D \wedge a = \tfrac{1}{2} \gamma^k \wedge \gamma^l \wedge \gamma^m R_{klmn} a^n = 0 , \tag{15.43} $$

which vanishes thanks to the cyclic symmetry of the Riemann tensor, $R_{[klm]n} = 0$, valid for vanishing torsion.


Exercise 15.2. Leibniz rule for the covariant spacetime derivative.
1. What is the covariant derivative $D_n(ab)$ of a geometric product of multivectors $a$ and $b$ in terms of
covariant derivatives of each of $a$ and $b$?
2. What is the covariant spacetime derivative $D(ab)$ of a geometric product of multivectors $a$ and $b$ in
terms of covariant spacetime derivatives of each of $a$ and $b$?
Solution.
1. The covariant derivative $D_n(ab)$ satisfies the Leibniz rule

$$ D_n (ab) = (D_n a) b + a D_n b . \tag{15.44} $$

2. If $a$ is a multivector of grade $p$, then the covariant spacetime derivative $D(ab)$ satisfies the Leibniz-like
rule

$$ D(ab) = \gamma^m D_m (ab) = \gamma^m \left( (D_m a) b + a D_m b \right) = (D a) b + (-)^p \left( - (a \cdot D) b + (a \wedge D) b \right) . \tag{15.45} $$


15.5 Torsion-full and torsion-free covariant spacetime derivative 


The results of §15.1-§15.4 hold with or without torsion. 

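As in §2.12 and §11.15, when torsion is present and one wishes to make the torsion part explicit, it is
convenient to distinguish torsion-free quantities with a ° overscript. The torsion-full and torsion-free connection
N-tuples $\Gamma_n$ and $\mathring{\Gamma}_n$ are related by

$$ \Gamma_n = \mathring{\Gamma}_n + K_n , \tag{15.46} $$

where the contortion vector of bivectors $K_n$ is defined, analogously to equation (15.9), in terms of the
contortion tensor $K_{kln}$, equation (11.56), by

$$ K_n \equiv K_{kln} \gamma^k \wedge \gamma^l , \tag{15.47} $$

implicitly summed over distinct indices $k$ and $l$. Acting on a multivector $a$, the torsion-full and torsion-free
covariant spacetime derivatives $D$ and $\mathring{D}$ are related by

$$ D a = \mathring{D} a + \tfrac{1}{2} \gamma^n [K_n, a] . \tag{15.48} $$

From equation (15.25), the Riemann tensor of bivectors $R_{\kappa\lambda}$ splits into torsion-free and contortion parts,

$$ R_{\kappa\lambda} = \frac{\partial (\mathring{\Gamma}_\lambda + K_\lambda)}{\partial x^\kappa} - \frac{\partial (\mathring{\Gamma}_\kappa + K_\kappa)}{\partial x^\lambda} + \tfrac{1}{2}[\mathring{\Gamma}_\kappa + K_\kappa, \mathring{\Gamma}_\lambda + K_\lambda] $$
$$ \phantom{R_{\kappa\lambda}} = \mathring{R}_{\kappa\lambda} + \mathring{D}_\kappa K_\lambda - \mathring{D}_\lambda K_\kappa + \tfrac{1}{2}[K_\kappa, K_\lambda] . \tag{15.49} $$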


15.6 Differential forms 


Differential forms, or p-forms, are invariant measures of integration over a p-dimensional hypersurface in 
an N-dimensional manifold. In §13.1 it was seen that the wedge product of p vectors defines a directed 
p-dimensional volume, illustrated in Figure 13.1. A p-form is essentially the same thing, but with the vectors 
taken to be infinitesimals. The purpose of p-forms is to allow integration over p-dimensional hypersurfaces 
in a coordinate-independent fashion. By construction, a differential form is a coordinate (and tetrad) scalar, 
as is essential for integration to be coordinate-independent. 

In an N-dimensional manifold with coordinates $x^\mu$, a 1-form expressed in the coordinate frame is

$$ a \equiv a_\mu \, dx^\mu . \tag{15.50} $$

By definition, the differential $dx^\mu$ transforms under coordinate transformations like a contravariant coordinate
vector. Requiring that the 1-form $a$ defined by equation (15.50) be a coordinate scalar imposes that $a_\mu$
must be a covariant coordinate vector. When the 1-form $a$ is integrated over any line (= 1-dimensional
hypersurface) in the manifold, the result is independent of the choice of coordinates, as desired.

A 2-form expressed in a coordinate frame is

$$ a \equiv \tfrac{1}{2} a_{\mu\nu} \, dx^\mu \wedge dx^\nu , \tag{15.51} $$

implicitly summed over all antisymmetric pairs $\mu\nu$. The factor of $\tfrac{1}{2}$ cancels the double-counting of pairs,
ensuring that each distinct antisymmetric pair $\mu\nu$ counts once. The factor of $\tfrac{1}{2}$ could be omitted if the
implicit sum were taken over only distinct antisymmetric pairs $\mu\nu$. The wedge product $dx^\mu \wedge dx^\nu$ of
differentials defines a parallelogram, a directed infinitesimal element of area, whose 2-dimensional direction is the
($dx^\mu$-$dx^\nu$)-plane, and whose magnitude is the area of the parallelogram. The wedge product is antisymmetric,

$$ dx^\mu \wedge dx^\nu = - dx^\nu \wedge dx^\mu . \tag{15.52} $$

The wedge product $dx^\mu \wedge dx^\nu$ transforms as an antisymmetric contravariant rank-2 coordinate tensor.
Requiring that the 2-form $a$ defined by equation (15.51) be a coordinate scalar imposes that $a_{\mu\nu}$ must be
a covariant rank-2 coordinate tensor, which can be taken to be antisymmetric without loss of generality.
To see that the antisymmetric prescription recovers correctly the usual behaviour of areal elements of
integration, consider the particular case where the 2-dimensional surface of integration is spanned by just two
coordinates, $x$ and $y$, all other coordinates being constant on the surface. Under a coordinate transformation
$\{x, y\} \rightarrow \{x', y'\}$, the wedge product of differentials transforms as

$$ dx' \wedge dy' = \left( \frac{\partial x'}{\partial x} dx + \frac{\partial x'}{\partial y} dy \right) \wedge \left( \frac{\partial y'}{\partial x} dx + \frac{\partial y'}{\partial y} dy \right) = \left( \frac{\partial x'}{\partial x} \frac{\partial y'}{\partial y} - \frac{\partial x'}{\partial y} \frac{\partial y'}{\partial x} \right) dx \wedge dy . \tag{15.53} $$

The factor relating the two areal elements is the familiar Jacobian determinant $|\partial\{x', y'\}/\partial\{x, y\}|$. The
definition (15.51) of the 2-form $a$ is by construction coordinate-invariant, and is therefore valid when more
than 2 coordinates vary over the surface of integration. However, it is always possible to erect a local
coordinate system in which only 2 of the coordinates vary over the 2-dimensional surface of integration.
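
As a numerical illustration of equation (15.53), the sketch below wedges the differentials of two new coordinates and compares the resulting coefficient of $dx \wedge dy$ with the analytic Jacobian determinant. The transformation $x' = x^2 - y^2$, $y' = 2xy$, the sample point, and all function names are arbitrary choices made for illustration, not taken from the text.

```python
def partial(f, point, i, h=1e-5):
    """Central-difference estimate of the partial derivative of f along coordinate i."""
    lo, hi = list(point), list(point)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2 * h)


def differential(f, point):
    """Components (f_x, f_y) of the 1-form df = f_x dx + f_y dy at a point."""
    return [partial(f, point, i) for i in range(2)]


def wedge(a, b):
    """Coefficient of dx^dy in the wedge product of two 1-forms in 2 dimensions."""
    # dx^dx = dy^dy = 0 and dy^dx = -dx^dy leave only the antisymmetric combination.
    return a[0] * b[1] - a[1] * b[0]


def x_new(x, y):
    return x * x - y * y


def y_new(x, y):
    return 2 * x * y


point = (1.5, -0.5)
coefficient = wedge(differential(x_new, point), differential(y_new, point))
jacobian = 4 * (point[0] ** 2 + point[1] ** 2)  # analytic determinant for this map
print(coefficient, jacobian)
```

The two printed numbers agree to within the accuracy of the finite differences, which is the statement that antisymmetry of the wedge product alone reproduces the Jacobian factor.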
In general, a p-form expressed in a coordinate frame is

$$ a \equiv \frac{1}{p!} \, a_{\mu_1 ... \mu_p} \, dx^{\mu_1} \wedge ... \wedge dx^{\mu_p} . \tag{15.54} $$

The factor of $1/p!$ ensures that each distinct index sequence $\mu_1 ... \mu_p$ is counted only once. The $1/p!$ factor
could be dropped if the implicit sum were taken over distinct antisymmetric sequences of indices. Thus
equation (15.54) could also be written

$$ a = a_\Lambda \, d^p x^\Lambda , \tag{15.55} $$

where the sum is only over distinct antisymmetric sequences $\Lambda$ of $p$ indices. The wedge product $dx^{\mu_1} \wedge ... \wedge dx^{\mu_p}$
of differentials is totally antisymmetric. It transforms like an antisymmetric contravariant rank-p tensor.
Requiring that the p-form $a$ defined by equation (15.54) be coordinate-invariant imposes that $a_{\mu_1 ... \mu_p}$ be a
(without loss of generality antisymmetric) covariant rank-p coordinate tensor.

The definition (15.54) of a p-form extends to the case $p = 0$. A 0-form is simply a scalar $a$.


15.7 Differential forms in an arbitrary frame 


Differential forms are not restricted to coordinate frames. In any arbitrary tetrad frame, which may or may 
not be a coordinate frame, and which may or may not be orthonormal, the invariant expression (15.55) for 
a p-form may be written 


$$ a = a_K \, d^p x^K , \tag{15.56} $$


implicitly summed over distinct antisymmetric sequences $K$ of $p$ tetrad indices. Coordinate indices $\Lambda = \kappa...\lambda$
are converted in the usual way to tetrad indices $K = k...l$ using the vielbein $e^k{}_\kappa$ and its inverse $e_k{}^\kappa$,

$$ a_{k...l} = e_k{}^\kappa ... e_l{}^\lambda \, a_{\kappa...\lambda} , \quad d^p x^{k...l} = e^k{}_\kappa ... e^l{}_\lambda \, d^p x^{\kappa...\lambda} . \tag{15.57} $$

The entire apparatus of differential forms translates into any arbitrary frame.


15.8 Wedge product of differential forms


The wedge product of differential forms is defined consistent with the wedge product of multivectors,
equation (13.31). The wedge product of a 1-form $a$ with a 1-form $b$ defines a 2-form

$$ a \wedge b = a_\mu dx^\mu \wedge b_\nu dx^\nu = a_\mu b_\nu \, dx^\mu \wedge dx^\nu = \tfrac{1}{2} (a_\mu b_\nu - a_\nu b_\mu) \, dx^\mu \wedge dx^\nu , \tag{15.58} $$

implicitly summed over both indices $\mu$ and $\nu$. If instead the implicit sum were taken over distinct
antisymmetric pairs $\mu\nu$ of indices, then there would be an extra factor of 2 in the third expression, and the $\tfrac{1}{2}$ in
the last expression would disappear. In general, the wedge product of a p-form $a$ with a q-form $b$ defines a
(p+q)-form $a \wedge b$,

$$ a \wedge b = \left( \frac{1}{p!} a_{\mu_1 ... \mu_p} dx^{\mu_1} \wedge ... \wedge dx^{\mu_p} \right) \wedge \left( \frac{1}{q!} b_{\nu_1 ... \nu_q} dx^{\nu_1} \wedge ... \wedge dx^{\nu_q} \right) $$
$$ \phantom{a \wedge b} = \frac{1}{p! \, q!} \, a_{[\mu_1 ... \mu_p} b_{\nu_1 ... \nu_q]} \, dx^{\mu_1} \wedge ... \wedge dx^{\mu_p} \wedge dx^{\nu_1} \wedge ... \wedge dx^{\nu_q} . \tag{15.59} $$

If the forms are expressed as sums $a = a_\Lambda d^p x^\Lambda$ and $b = b_\Pi d^q x^\Pi$ over distinct antisymmetric sequences $\Lambda$
and $\Pi$ of respectively $p$ and $q$ indices, then their wedge product is

$$ a \wedge b = a_\Lambda b_\Pi \, d^p x^\Lambda \wedge d^q x^\Pi = \frac{(p+q)!}{p! \, q!} \, a_{[\Lambda} b_{\Pi]} \, d^{p+q} x^{\Lambda\Pi} , \tag{15.60} $$

where the second expression is implicitly summed over distinct antisymmetric sequences $\Lambda$ and $\Pi$ of $p$ and
$q$ indices, while the last expression is implicitly summed over distinct antisymmetric sequences $\Lambda\Pi$ of $p+q$
indices.

The wedge product is symmetric or antisymmetric as $pq$ is even or odd,

$$ a \wedge b = (-)^{pq} \, b \wedge a , \tag{15.61} $$

consistent with the wedge product (13.31) of two multivectors.
The wedge product of a 0-form (scalar) $a$ with a differential form $b$ is just their ordinary product,

$$ a \wedge b = a b \quad \mbox{if $a$ is a scalar} , \tag{15.62} $$

consistent with the result (13.34) for multivectors.
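
The symmetry rule (15.61) and the scalar rule (15.62) can be checked with a small component-level implementation of the wedge product. This is a sketch under stated assumptions: forms are stored as dictionaries mapping sorted index tuples (distinct antisymmetric sequences) to coefficients, the manifold dimension is taken to be 5 so that several pairs $p$, $q$ give non-zero products, and the names `sort_sign`, `wedge`, `generic_form` and `scaled` are hypothetical helpers.

```python
from itertools import combinations


def sort_sign(indices):
    """Sort a sequence of distinct indices; return (permutation sign, sorted tuple)."""
    idx = list(indices)
    sign = 1
    for i in range(len(idx)):            # bubble sort, flipping the sign at each swap
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def wedge(a, b):
    """Wedge product of forms stored as {sorted index tuple: coefficient}."""
    out = {}
    for A, ca in a.items():
        for B, cb in b.items():
            if set(A) & set(B):          # a repeated differential gives zero
                continue
            s, C = sort_sign(A + B)
            out[C] = out.get(C, 0) + s * ca * cb
    return {k: v for k, v in out.items() if v != 0}


def generic_form(p, offset, n=5):
    """A p-form in n dimensions with distinct integer coefficients."""
    return {idx: offset + k for k, idx in enumerate(combinations(range(n), p))}


def scaled(a, c):
    return {k: c * v for k, v in a.items()}


for p in range(4):
    for q in range(4):
        a, b = generic_form(p, 1), generic_form(q, 3)
        print(p, q, wedge(a, b) == scaled(wedge(b, a), (-1) ** (p * q)))
```

A 0-form is represented by the empty index tuple, so `wedge({(): 2}, b)` simply multiplies every coefficient of `b` by 2, in line with equation (15.62).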


15.9 Exterior derivative 


The exterior derivative of a differential form is constructed so that integration and exterior differentiation
are inverse to each other, §15.12. In the abstract language of differential forms, the exterior derivative is
denoted $d$, and the exterior derivative of a p-form $a$ is the (p+1)-form $da$ defined by

$$ da \equiv d \left( \frac{1}{p!} a_{\mu_1 ... \mu_p} dx^{\mu_1} \wedge ... \wedge dx^{\mu_p} \right) \equiv \frac{1}{p!} \frac{\partial a_{\mu_1 ... \mu_p}}{\partial x^\nu} \, dx^\nu \wedge dx^{\mu_1} \wedge ... \wedge dx^{\mu_p} , \tag{15.63} $$

in which the left and right hand sides are implicitly summed over all antisymmetric sets of indices $\mu_1 ... \mu_p$
and $\nu\mu_1 ... \mu_p$ respectively. Equation (15.63) makes explicit the meaning of the definition (15.1) of the exterior
derivative $d$. Thanks to the antisymmetry of the wedge product of differentials, the exterior derivative (15.63)
may be rewritten

$$ da = \frac{1}{p!} \, \partial_{[\nu} a_{\mu_1 ... \mu_p]} \, dx^\nu \wedge dx^{\mu_1} \wedge ... \wedge dx^{\mu_p} , \tag{15.64} $$

where $\partial_\nu \equiv \partial/\partial x^\nu$. If the p-form is expressed as a sum $a = a_\Lambda d^p x^\Lambda$ over distinct antisymmetric sequences
$\Lambda$ of $p$ indices, then its exterior derivative is the (p+1)-form

$$ da = \partial_\nu a_\Lambda \, d^{p+1} x^{\nu\Lambda} = (p+1) \, \partial_{[\nu} a_{\Lambda]} \, d^{p+1} x^{\nu\Lambda} , \tag{15.65} $$

where the second expression is implicitly summed over indices $\nu$ and over distinct antisymmetric sequences
$\Lambda$ of $p$ indices, while the last expression is implicitly summed over distinct antisymmetric sequences $\nu\Lambda$ of
$p+1$ indices.
The antisymmetrized coordinate derivative is just equal to the antisymmetrized torsion-free covariant
derivative (Exercise 2.6),

$$ \partial_{[\nu} a_{\mu_1 ... \mu_p]} = \mathring{D}_{[\nu} a_{\mu_1 ... \mu_p]} , \tag{15.66} $$

which is true even when torsion is present (that is, the antisymmetrized coordinate derivative equals the
antisymmetrized torsion-free covariant derivative, not the antisymmetrized torsion-full covariant derivative).
The antisymmetrized coordinate derivative is a covariant coordinate tensor despite the fact that the
derivatives are coordinate not covariant derivatives, and this is true whether or not torsion is present. Thus the
exterior derivative $da$ is coordinate-invariant, with or without torsion. In an arbitrary frame, not necessarily
a coordinate frame or an orthonormal frame, the exterior derivative of a p-form $a$ is its torsion-free covariant
curl,

$$ da = (p+1) \, \mathring{D}_{[n} a_{K]} \, d^{p+1} x^{nK} , \tag{15.67} $$

implicitly summed over distinct antisymmetric sequences $nK$ of $p+1$ tetrad indices.
The simplest case is the exterior derivative of a 0-form (scalar) $\varphi$, which according to the definition (15.63)
is the one-form

$$ d\varphi = \frac{\partial \varphi}{\partial x^\nu} \, dx^\nu . \tag{15.68} $$

The next simplest case is the exterior derivative of a one-form $a$, which according to the definition (15.63)
is the 2-form

$$ da = d(a_\nu dx^\nu) = \frac{\partial a_\nu}{\partial x^\mu} \, dx^\mu \wedge dx^\nu $$
$$ \phantom{da} = \frac{1}{2} \left( \frac{\partial a_\nu}{\partial x^\mu} - \frac{\partial a_\mu}{\partial x^\nu} \right) dx^\mu \wedge dx^\nu \tag{15.69} $$
$$ \phantom{da} = \tfrac{1}{2} \left( \mathring{D}_\mu a_\nu - \mathring{D}_\nu a_\mu \right) dx^\mu \wedge dx^\nu , \tag{15.70} $$

implicitly summed over both indices $\mu$ and $\nu$. The factors of $\tfrac{1}{2}$ would disappear if the sum were only over
distinct antisymmetric pairs $\mu\nu$.

The exterior derivative of the wedge product of a p-form $a$ with a q-form $b$ satisfies the same Leibniz-like
rule (15.41) as the spacetime curl,

$$ d(a \wedge b) = (da) \wedge b + (-)^p a \wedge (db) . \tag{15.71} $$


15.9.1 The square of the exterior derivative vanishes 


The exterior derivative has the notable property that its square vanishes,

$$ dda = \frac{1}{p!} \, \partial_{[\nu_1} \partial_{\nu_2} a_{\mu_1 ... \mu_p]} \, dx^{\nu_1} \wedge dx^{\nu_2} \wedge dx^{\mu_1} \wedge ... \wedge dx^{\mu_p} = 0 , \tag{15.72} $$

because coordinate derivatives commute. The analogous statement in the geometric algebra is that the
torsion-free covariant spacetime curl squared of a multivector $a$ vanishes, equation (15.43),

$$ \mathring{D} \wedge \mathring{D} \wedge a = 0 . \tag{15.73} $$
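
As a minimal worked instance of equation (15.72), consider a 0-form $\varphi$ on a 2-dimensional surface with coordinates $x$ and $y$ (a choice made here purely for illustration). Applying the definition (15.63) twice and using the antisymmetry (15.52) gives:

```latex
\begin{align*}
  d\varphi &= \frac{\partial \varphi}{\partial x} \, dx + \frac{\partial \varphi}{\partial y} \, dy , \\
  dd\varphi &= \frac{\partial^2 \varphi}{\partial y \, \partial x} \, dy \wedge dx
             + \frac{\partial^2 \varphi}{\partial x \, \partial y} \, dx \wedge dy
    \qquad (\text{the } dx \wedge dx \text{ and } dy \wedge dy \text{ terms vanish}) \\
            &= \left( \frac{\partial^2 \varphi}{\partial x \, \partial y}
                    - \frac{\partial^2 \varphi}{\partial y \, \partial x} \right) dx \wedge dy = 0 .
\end{align*}
```

The cancellation uses nothing beyond the commutation of partial derivatives and $dy \wedge dx = - dx \wedge dy$, which is the mechanism at work in the general case.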


15.10 Hodge dual form 


The Hodge dual ${}^*a$ of a differential form $a$ is most easily defined by taking advantage of the isomorphism
between the geometric algebra and differential forms. The Hodge dual of a multivector $a$ is defined to be the
multivector $I_N a$ obtained by premultiplying by the pseudoscalar $I_N$, equation (13.24).

The pseudoscalar $I_N$ can be expressed as

$$ I_N = \varepsilon^M \gamma_M , \tag{15.74} $$

where $M$ runs over the one distinct antisymmetric sequence $1...N$ of $N$ indices, and $\varepsilon^M$ is the totally
antisymmetric tensor normalized to $\varepsilon^{1...N} = 1$ in an orthonormal frame, as is the convention of this book. Thus the
dual $I_N a$ of a grade-p multivector $a = a^K \gamma_K$ may be written

$$ I_N a = I_N a^K \gamma_K = \varepsilon^{LK} \gamma_{LK} \, a^K \gamma_K = (-)^{[p/2]} \varepsilon^{LK} a_K \gamma_L = (-)^{[p/2]} \varepsilon_{LK} a^K \gamma^L , \tag{15.75} $$

implicitly summed over distinct antisymmetric sequences $K$ of $p$ indices, and the one distinct sequence $L$
of $q = N-p$ indices complementary to $K$. In the third expression of equations (15.75), the indices $LK$
of the pseudoscalar $\gamma_{LK}$ have been ordered without loss of generality to end with the sequence $K$. The
associativity of the multivector product means that $\gamma_{LK} \gamma_K = \gamma_L (\gamma_K \gamma_K)$; the $(-)^{[p/2]}$ factor comes from
the square $\gamma_K \gamma_K$ of a grade-p multivector, which in an orthonormal frame is

$$ \gamma_K \gamma_K = \gamma_{k_1 ... k_p} \gamma_{k_1 ... k_p} = (-)^{[p/2]} \eta_{k_1 k_1} ... \eta_{k_p k_p} , \tag{15.76} $$

with $\eta_{kl}$ the orthonormal tetrad metric (the Euclidean metric if all dimensions are spatial, or the Minkowski
metric if one dimension is a time dimension). Equation (15.75) can be cast as a sum over dual multivectors
$I_N \gamma_K$,

$$ I_N a = a^K (I_N \gamma_K) , \quad I_N \gamma_K = (-)^{[p/2]} \varepsilon_{LK} \gamma^L = (-)^{[p/2]+pq} \varepsilon_{KL} \gamma^L , \tag{15.77} $$

where $L$ runs over the one distinct antisymmetric sequence complementary to $K$. The $(-)^{pq}$ factor in the
rightmost expression of equations (15.77) comes from permuting the $p$ indices $K$ and $q$ indices $L$ through
each other, $\varepsilon_{LK} = (-)^{pq} \varepsilon_{KL}$. Alternatively, equation (15.75) can be cast as a sum over dual coefficients ${}^*a^L$,

$$ I_N a = {}^*a^L \gamma_L , \quad {}^*a^L \equiv (-)^{[p/2]} \varepsilon^{LK} a_K , \tag{15.78} $$

where $K$ runs over the one distinct antisymmetric sequence complementary to $L$.
The dual ${}^*a$ of a p-form $a = a_\Lambda d^p x^\Lambda$ is defined to be the q-form, analogously to the multivector dual (15.75),

$$ {}^*a \equiv (-)^{[p/2]} \varepsilon_{\Pi\Lambda} a^\Lambda \, d^q x^\Pi , \tag{15.79} $$

implicitly summed over distinct antisymmetric sequences $\Lambda$ of $p$ indices, and the one distinct sequence $\Pi$ of
$q = N-p$ indices complementary to $\Lambda$. As with the multivector expression (15.77), the form dual (15.79) can
be cast as a sum over dual volume elements ${}^*d^q x^\Lambda$,

$$ {}^*a = a_\Lambda \, {}^*d^q x^\Lambda , \quad {}^*d^q x^\Lambda \equiv (-)^{[p/2]} \varepsilon_\Pi{}^\Lambda \, d^q x^\Pi = (-)^{[p/2]+pq} \varepsilon^\Lambda{}_\Pi \, d^q x^\Pi . \tag{15.80} $$

The dual volume element ${}^*d^q x^\Lambda$ is an element of a q-dimensional space, but its index $\Lambda$ is a totally
antisymmetric sequence of $p = N-q$ indices. Alternatively, as with the multivector expression (15.78), the form
dual (15.79) can be cast as a sum over dual coefficients ${}^*a_\Pi$,

$$ {}^*a = {}^*a_\Pi \, d^q x^\Pi , \quad {}^*a_\Pi \equiv (-)^{[p/2]} \varepsilon_{\Pi\Lambda} a^\Lambda . \tag{15.81} $$

Taking the double dual of a multivector $a$ multiplies it by the pseudoscalar squared $I_N^2$,

$$ {}^{**}a = I_N^2 a = \pm (-)^{[N/2]} a , \tag{15.82} $$

where the $\pm$ sign is the determinant of the orthonormal tetrad metric ($+$ for the Euclidean metric, $-$
for the Minkowski metric). The same result (15.82) holds for the double dual ${}^{**}a$ of a differential form
$a$. The same factor $\pm(-)^{[N/2]}$ can be deduced in a lengthier fashion by taking the double dual along the
lines of equation (15.75). There is a factor of $(-)^{[p/2]}$ from taking the dual of the grade-p vector $a$, as in
equation (15.75); a further factor of $(-)^{[q/2]}$ comes from taking the dual of the grade-q dual vector ${}^*a$; a
factor of $(-)^{pq}$ comes from permuting indices of the pseudoscalar, $\varepsilon^{\Lambda\Pi} = (-)^{pq} \varepsilon^{\Pi\Lambda}$; and a final $\pm$ sign, the
determinant of the tetrad metric, comes from $\varepsilon_{\Lambda\Pi} \varepsilon^{\Lambda\Pi} = \pm$. The overall sign is, for any $p$ and $q = N-p$,

$$ \pm (-)^{[p/2]+[q/2]+pq} = \pm (-)^{[N/2]} . \tag{15.83} $$

The reader may check that equation (15.83) holds for all values of $p+q = N$, with each of $p$ and $q$ either
even or odd.
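
The check invited above is easy to automate. The sketch below, which is illustrative rather than part of the original text, verifies the parity identity (15.83) for every split $p+q = N$ up to $N = 8$, and separately computes the square of the pseudoscalar directly from the geometric product of basis blades for Euclidean and Minkowski signatures. The bitmask encoding and the function names are assumptions made for the example.

```python
def blade_mul(a, b, sig):
    """Geometric product of two basis blades given as bitmasks -> (sign, bitmask)."""
    sign = 1
    t = a >> 1
    while t:  # parity of the swaps needed to sort the basis vectors
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for i, s in enumerate(sig):  # repeated vectors contract with the metric
        if ((a & b) >> i) & 1:
            sign *= s
    return sign, a ^ b


def pseudoscalar_square(sig):
    """Sign of I_N^2 computed directly from the blade product."""
    full = (1 << len(sig)) - 1
    sign, mask = blade_mul(full, full, sig)
    assert mask == 0  # the square of the pseudoscalar is a scalar
    return sign


def predicted_square(sig):
    """Right hand side of (15.82): determinant of the metric times (-)^[N/2]."""
    det = 1
    for s in sig:
        det *= s
    return det * (-1) ** (len(sig) // 2)


def double_dual_sign(p, q):
    """Left hand side of (15.83) without the overall metric determinant."""
    return (-1) ** (p // 2 + q // 2 + p * q)


parity_ok = all(double_dual_sign(p, N - p) == (-1) ** (N // 2)
                for N in range(1, 9) for p in range(N + 1))
square_ok = all(pseudoscalar_square(sig) == predicted_square(sig)
                for N in range(1, 7)
                for sig in ((1,) * N, (-1,) + (1,) * (N - 1)))
print(parity_ok, square_ok)
```

Integer division `//` plays the role of the integer-part bracket $[p/2]$ in the text.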

Concept question 15.3. Calculating with the totally antisymmetric tensor. The most difficult
thing in all of mathematics is getting the sign right. This is certainly true with the totally antisymmetric
tensor. Is there a sure-fire way to get the sign right? Answer. The key point is that there is an isomorphism
between the totally antisymmetric tensor and the geometric algebra. In an orthonormal frame,


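$$ \varepsilon^{k...l} \leftrightarrow \gamma^k \wedge ... \wedge \gamma^l . \tag{15.84} $$


In a general coordinate frame,


$$ \varepsilon^{\kappa...\lambda} \leftrightarrow e^\kappa \wedge ... \wedge e^\lambda . \tag{15.85} $$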


Indices are raised and lowered, and transformed between tetrad and coordinate frames, in the usual way, 
using the tetrad and coordinate metrics, and the vielbein. 


15.11 Relation between coordinate- and tetrad-frame volume elements 


Consider a p-dimensional hypersurface embedded inside an N-dimensional manifold. Choose an orthonormal
tetrad such that the first $p$ basis elements $\gamma_1, ..., \gamma_p$ of the tetrad are tangent to the p-dimensional
hypersurface, while the last $N-p$ basis elements $\gamma_{p+1}, ..., \gamma_N$ are orthogonal to it. (Such a choice is not always
possible. An example is the case of an integral along a null geodesic. But even in that case an integral, the
affine distance, can be defined by a suitable limiting procedure. Whatever the case, if an integral can
be defined, some version of the equations below applies.) With respect to an orthonormal tetrad frame,
the components $d^p x^{1...p}$ of the p-volume element transform like the p-dimensional pseudoscalar $I_p$. Thus the
orthonormal tetrad-frame p-volume element is invariant, the proper p-volume element. The coordinate- and
tetrad-frame p-volume elements, which are tensors, are related by the vielbein in the usual way, leading to
the result that

$$ e^{1...p}{}_{\mu_1 ... \mu_p} \, d^p x^{\mu_1 ... \mu_p} = d^p x^{1...p} , \tag{15.86} $$

where $e^{1...p}{}_{\mu_1 ... \mu_p}$ is the determinant of the $p \times p$ vielbein matrix $e^m{}_\mu$ with $m$ running from 1 to $p$ and $\mu$ running
from $\mu_1$ to $\mu_p$,

$$ e^{1...p}{}_{\mu_1 ... \mu_p} \equiv \begin{vmatrix} e^1{}_{\mu_1} & \cdots & e^1{}_{\mu_p} \\ \vdots & \ddots & \vdots \\ e^p{}_{\mu_1} & \cdots & e^p{}_{\mu_p} \end{vmatrix} . \tag{15.87} $$

Equation (15.86) is summed over the $\binom{N}{p}$ distinct sets of $p$ coordinate indices $\mu_1 ... \mu_p$ drawn from the $N$
coordinate indices. Equation (15.86) implies that $e^{1...p}{}_{\mu_1 ... \mu_p} \, d^p x^{\mu_1 ... \mu_p}$ is the proper p-volume element.
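
A simple worked instance of equations (15.86) and (15.87) may help. Take, purely as an illustration, the flat plane ($p = N = 2$) in polar coordinates $\{r, \theta\}$, with an orthonormal tetrad aligned with the radial and azimuthal directions, so that the only non-vanishing vielbein components are $e^1{}_r = 1$ and $e^2{}_\theta = r$:

```latex
\begin{align*}
  e^{12}{}_{r\theta}
    &= \begin{vmatrix} e^1{}_r & e^1{}_\theta \\ e^2{}_r & e^2{}_\theta \end{vmatrix}
     = \begin{vmatrix} 1 & 0 \\ 0 & r \end{vmatrix} = r , \\
  d^2 x^{12} &= e^{12}{}_{r\theta} \, d^2 x^{r\theta} = r \, dr \wedge d\theta , \\
  \int_{r \leq R} d^2 x^{12} &= \int_0^{2\pi} \!\! \int_0^R r \, dr \, d\theta = \pi R^2 .
\end{align*}
```

There is only one distinct set of two coordinate indices here, so the sum in (15.86) has a single term, and the vielbein determinant supplies the familiar factor $r$ of the proper area element.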


Dual proper q-volume elements are related similarly,

$$ e^{1...p}{}_{\mu_1 ... \mu_p} \, {}^*d^q x^{\mu_1 ... \mu_p} = {}^*d^q x^{1...p} . \tag{15.88} $$


15.12 Generalized Stokes’ theorem


The most important result in the mathematics of differential forms is a generalization of the theorems of
Cauchy, Gauss, Green, and Stokes relating the integral of a derivative of a function to a surface integral of the
function. In the mathematicians’ compact notation, the generalized Stokes’ theorem says that the integral
of the exterior derivative $da$ of a p-form $a$ over a (p+1)-dimensional hypersurface $V$ equals the integral of
the p-form $a$ over the p-dimensional boundary $\partial V$ of the hypersurface:

$$ \int_V da = \oint_{\partial V} a . \tag{15.89} $$

More explicitly, if $a = a_{\mu_1 ... \mu_p} d^p x^{\mu_1 ... \mu_p}$ is a p-form, Stokes’ theorem states

$$ \int_V (p+1) \, \partial_{[\nu} a_{\mu_1 ... \mu_p]} \, d^{p+1} x^{\nu\mu_1 ... \mu_p} = \oint_{\partial V} a_{\mu_1 ... \mu_p} \, d^p x^{\mu_1 ... \mu_p} , \tag{15.90} $$

implicitly summed over distinct sequences $\nu\mu_1 ... \mu_p$ and $\mu_1 ... \mu_p$ of respectively $p+1$ and $p$ indices. In an
arbitrary frame, not necessarily either a coordinate frame or an orthonormal frame, Stokes’ theorem (15.90)
is

$$ \int_V (da)_L \, d^{p+1} x^L = \int_V (p+1) \, \mathring{D}_{[n} a_{K]} \, d^{p+1} x^{nK} = \oint_{\partial V} a_K \, d^p x^K , \tag{15.91} $$

implicitly summed over distinct sequences $L = nK$ and $K$ of respectively $p+1$ and $p$ indices. In a coordinate
frame, the torsion-free covariant curl reduces to an ordinary curl, $\mathring{D}_{[\nu} a_{\Lambda]} = \partial_{[\nu} a_{\Lambda]}$, Exercise 2.6.

In the case of a 0-form (scalar) $\varphi$, the exterior derivative $d\varphi$, equation (15.68), is the total derivative. The
integral of the 1-form $d\varphi$ along any line (1-dimensional hypersurface) $x(\lambda)$ parametrized by an arbitrary
differentiable parameter $\lambda$, from initial value $\lambda_0$ to final value $\lambda_1$, is

$$ \int_{\lambda_0}^{\lambda_1} d\varphi = \int_{\lambda_0}^{\lambda_1} \frac{\partial \varphi}{\partial x^\nu} \, dx^\nu = \int_{\lambda_0}^{\lambda_1} \frac{\partial \varphi}{\partial x^\nu} \frac{dx^\nu}{d\lambda} \, d\lambda = \int_{\lambda_0}^{\lambda_1} \frac{d\varphi}{d\lambda} \, d\lambda = \varphi(\lambda_1) - \varphi(\lambda_0) . \tag{15.92} $$

Equation (15.92) can be recognized as the fundamental theorem of calculus. Equation (15.92) is
equation (15.89) or (15.91) for the case where $a$ is the 0-form (scalar) $\varphi$. The hypersurface $V$ is the 1-dimensional
path of integration. The boundary $\partial V$ is the two endpoints of the path.
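
Equation (15.92) can be illustrated numerically by integrating the pulled-back 1-form along a curve and comparing with the difference of endpoint values. The scalar field, the path, the parameter range and the midpoint-rule integrator below are all arbitrary illustrative choices, not taken from the text.

```python
import math


def phi(x, y):
    """An arbitrary smooth scalar field (0-form)."""
    return x * x * y + math.sin(y)


def grad_phi(x, y):
    """Components (d phi/dx, d phi/dy) of the 1-form d phi."""
    return (2 * x * y, x * x + math.cos(y))


def path(lam):
    """An arbitrary differentiable path x(lambda)."""
    return (math.cos(lam), lam * lam)


def path_velocity(lam):
    """dx/d lambda along the path."""
    return (-math.sin(lam), 2 * lam)


def line_integral(lam0, lam1, steps=20000):
    """Midpoint-rule integral of (d phi/dx^nu)(dx^nu/d lambda) d lambda."""
    h = (lam1 - lam0) / steps
    total = 0.0
    for i in range(steps):
        lam = lam0 + (i + 0.5) * h
        g = grad_phi(*path(lam))
        v = path_velocity(lam)
        total += (g[0] * v[0] + g[1] * v[1]) * h
    return total


lam0, lam1 = 0.3, 2.0
integral = line_integral(lam0, lam1)
endpoint_difference = phi(*path(lam1)) - phi(*path(lam0))
print(integral, endpoint_difference)
```

The two numbers agree up to the discretization error of the midpoint rule, and changing the parametrization of the same curve leaves the result unchanged.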

Here is a sketch of a proof of the generalized Stokes’ theorem (15.89). The key ingredient is that da is 
coordinate-invariant, so one can use any convenient coordinate system to evaluate the integral, and the result 
will be independent of the choice of coordinates. 

Figure 15.1 Partition of a disk into five rectangular patches. The arrowed circles show the direction of circulation of
the integral over the boundary of each patch.

First, partition the hypersurface $V$ into rectangular patches. Rectangular means that a system of
coordinates can be chosen such that the patch extends over a fixed finite interval $x^\mu_0 \leq x^\mu \leq x^\mu_1$ of each coordinate.
Figure 15.1 illustrates a partition of a 2-dimensional disk into five rectangular patches. Thanks to the
arbitrariness of the choice of coordinates, although each patch appears to be non-rectangular, coordinates
can always be chosen so that the patch is rectangular with respect to those coordinates. Notice that the
(p+1)-dimensional hypersurface could be embedded in a higher dimensional manifold, so there could
potentially be more coordinates available than the dimension of the hypersurface; but again the arbitrariness of
coordinates means that coordinates can always be chosen such that only $p+1$ of them vary over the
(p+1)-dimensional hypersurface, the remaining coordinates being constant. With such convenient coordinates, the
integral over a patch is a straightforward integration in Euclidean space. For simplicity, consider the integral
in 2 dimensions. The integral over a single rectangular patch $x_0 \leq x \leq x_1$ and $y_0 \leq y \leq y_1$ is


$$ \int_{\rm patch} da = \int_{y_0}^{y_1} \!\! \int_{x_0}^{x_1} \left( \frac{\partial a_y}{\partial x} - \frac{\partial a_x}{\partial y} \right) dx \wedge dy $$
$$ \phantom{\int_{\rm patch} da} = \int_{y_0}^{y_1} \left( \int_{x_0}^{x_1} \frac{\partial a_y}{\partial x} \, dx \right) \wedge dy - \int_{x_0}^{x_1} \left( \int_{y_0}^{y_1} \frac{\partial a_x}{\partial y} \, dy \right) \wedge dx $$
$$ \phantom{\int_{\rm patch} da} = \int_{y_0}^{y_1} \left( \int_{x_0}^{x_1} \frac{\partial a_y}{\partial x} \, dx \right) dy - \int_{x_0}^{x_1} \left( \int_{y_0}^{y_1} \frac{\partial a_x}{\partial y} \, dy \right) dx $$
$$ \phantom{\int_{\rm patch} da} = \int_{y_0}^{y_1} \left[ a_y(x_1) - a_y(x_0) \right] dy - \int_{x_0}^{x_1} \left[ a_x(y_1) - a_x(y_0) \right] dx $$
$$ \phantom{\int_{\rm patch} da} = \oint_{\partial {\rm patch}} a_\mu \, dx^\mu = \oint_{\partial {\rm patch}} a . \tag{15.93} $$

The first line of equations (15.93) is the standard expression (15.70) for the exterior derivative of a 1-form
$a$; the double count over pairs of indices eliminates the factor of $\tfrac{1}{2}$. The second line of equations (15.93)
rearranges the first. The third line of equations (15.93) differs from the second by the loss of the $\wedge$ signs;
the equality holds because $\int (\partial a_x / \partial y) \, dy$ is a scalar for any interval of integration, and the wedge product of
a scalar with a differential form is just the ordinary product of the scalar with the form, equation (15.62).
The fourth line of equations (15.93) follows from the fundamental theorem of calculus, equation (15.92). The
integral contains 4 contributions, corresponding to the 4 edges of the rectangular patch. The signs of the
4 contributions are such that they circulate anti-clockwise about the patch, as illustrated in Figure 15.1.
The last line of equations (15.93) expresses the fourth in more compact notation, with $\partial$patch denoting the
boundary, the 4 edges, of the patch. Equations (15.93) prove Stokes’ theorem for a patch.

The final step of the proof is to add together the contributions from all the patches of the partition. 
Where two patches abut, the contributions from the common edge cancel, because consistent circulation 
about the boundaries causes the integral along the common edge to be in opposite directions, as illustrated 
in Figure 15.1. Once again, the coordinate-invariant character of the differential form a ensures that the 
integral along a prescribed path is independent of the choice of coordinates, so the contributions from 
abutting edges of patches do indeed cancel. 
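The patch argument can be checked numerically. The following is a minimal sketch, not part of the original derivation: the components `a_x` and `a_y` are arbitrary smooth functions chosen purely for illustration, and the helper names are hypothetical. It compares the interior integral of $\partial a_y/\partial x - \partial a_x/\partial y$ with the anti-clockwise circulation around the four edges, and confirms that the circulations about two abutting patches add up to the circulation about their union, because the contributions from the common edge cancel.

```python
import math

# Hypothetical smooth components of a 1-form a = a_x dx + a_y dy (illustration only).
def a_x(x, y):
    return math.sin(x) - y * y

def a_y(x, y):
    return x * y + math.cos(y)

def midpoint(f, lo, hi, n=200):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def interior(x0, x1, y0, y1, eps=1e-5):
    """Integral of (d a_y/dx - d a_x/dy) over the patch, via central differences."""
    def curl(x, y):
        day_dx = (a_y(x + eps, y) - a_y(x - eps, y)) / (2 * eps)
        dax_dy = (a_x(x, y + eps) - a_x(x, y - eps)) / (2 * eps)
        return day_dx - dax_dy
    return midpoint(lambda y: midpoint(lambda x: curl(x, y), x0, x1), y0, y1)

def boundary(x0, x1, y0, y1):
    """Anti-clockwise circulation of a around the four edges of the patch."""
    vertical = midpoint(lambda y: a_y(x1, y) - a_y(x0, y), y0, y1)
    horizontal = midpoint(lambda x: a_x(x, y1) - a_x(x, y0), x0, x1)
    return vertical - horizontal

whole = boundary(0.0, 1.0, 0.0, 2.0)
left = boundary(0.0, 0.4, 0.0, 2.0)    # patch to the left of the cut x = 0.4
right = boundary(0.4, 1.0, 0.0, 2.0)   # patch to the right of the cut
print(interior(0.0, 1.0, 0.0, 2.0), whole, left + right)
```

The three printed numbers should agree to within the accuracy of the quadrature.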


15.13 Exact and closed forms 


Consider the 1-form $d\phi$ defined by the exterior derivative of the azimuthal angle $\phi$ around a circle. The
integral of the angle around the circle is

\[
\int_0^{2\pi} d\phi = 2\pi . \tag{15.94}
\]

But since the circle has no boundary, should not Stokes’ theorem imply that the integral vanishes? The
resolution of the problem is that $\phi$ is not a single-valued scalar. The 1-form $d\phi$ constructed from $\phi$ is well-defined,
being single-valued and continuous everywhere on the circle, but $\phi$ itself is not. The circle can be
cut at any point, and a single-valued scalar $\phi$ defined on the cut circle. But since the scalar is discontinuous
at the cut point, the contributions on the boundary do not cancel, but rather produce a finite contribution,
namely $2\pi$.

A differential form F is said to be exact if it can be expressed as the exterior derivative of a differential 
form A, 


F=dA. (15.95) 


Stokes’ theorem implies that an integral of an exact form over a surface with no boundary must vanish. The
condition of exactness is a global condition. The above example 1-form $d\phi$ in equation (15.94) is not exact,
because $\phi$ is not a single-valued 0-form (scalar).

A differential form F is said to be closed if its exterior derivative vanishes, 


dF =0. (15.96) 


The rule dd = 0 implies that every exact form is closed. The converse, that every closed form is
exact, is true locally, but not globally. Poincaré’s lemma states that a form that is closed over a volume
V that is continuously contractible to a point is exact over that volume. The condition of being closed can be
thought of as a local test of exactness. The example form $d\phi$ is closed, but not exact. In the Cartesian x-y
plane, the 1-form $d\phi$ is

\[
d\phi = d \, {\rm atan}(y/x) = \frac{x \, dy - y \, dx}{x^2 + y^2} , \tag{15.97}
\]

which is singular at the origin x = y = 0. Consistent with Poincaré’s lemma, the domain on which the 1-form $d\phi$
is defined, the plane with the origin removed, is not continuously contractible to a point.

The above example illustrates that topological properties of differentiable manifolds, such as winding 
number, can be inferred from the behaviour of integrals. 
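As an illustration of how an integral detects winding number, the sketch below (an added example; the function names are hypothetical) integrates the 1-form $d\phi = (x \, dy - y \, dx)/(x^2 + y^2)$ of equation (15.97) around two circles, one enclosing the origin and one not. It also checks by finite differences that the form is closed away from the origin.

```python
import math

def dphi_components(x, y):
    """Components (a_x, a_y) of dphi = (x dy - y dx) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return -y / r2, x / r2

def loop_integral(cx, cy, radius, n=2000):
    """Integral of dphi anti-clockwise around a circle centred at (cx, cy)."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t) * dt
        dy = radius * math.cos(t) * dt
        ax, ay = dphi_components(x, y)
        total += ax * dx + ay * dy
    return total

def curl(x, y, eps=1e-5):
    """d a_y/dx - d a_x/dy by central differences; vanishes if the form is closed."""
    day_dx = (dphi_components(x + eps, y)[1] - dphi_components(x - eps, y)[1]) / (2 * eps)
    dax_dy = (dphi_components(x, y + eps)[0] - dphi_components(x, y - eps)[0]) / (2 * eps)
    return day_dx - dax_dy

around_origin = loop_integral(0.0, 0.0, 1.0)   # circle encloses the singular point
away_from_origin = loop_integral(3.0, 0.0, 1.0)  # circle does not enclose it
print(round(around_origin / (2.0 * math.pi)), round(away_from_origin / (2.0 * math.pi)))
print(curl(1.0, 2.0))
```

The first loop cannot be contracted to a point without crossing the origin, and its integral is nonzero; the second loop lies in a contractible region, where the closed form is exact and the integral vanishes.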


15.14 Generalized Gauss’ theorem 


In physics, Stokes’ theorem (15.91) is most commonly encountered in the form of Gauss’ theorem, which
relates the volume integral of the divergence of a vector to the integral of the flux of the vector through the
surface of the volume. The relation (15.40) shows that a covariant curl as required by Stokes’ theorem (15.91)
can be converted to a covariant divergence by post-multiplying by the pseudoscalar $I_N$,

\[
\mathring{D} \wedge (a I_N) = (\mathring{D} \cdot a) I_N . \tag{15.98}
\]

If $a = a_K \, d^p x^K$ is a p-form, substituting equation (15.98) into Stokes’ theorem (15.91) gives the generalized
Gauss’ theorem in an arbitrary (not necessarily coordinate or orthonormal) frame, with q = N−p,

\[
\int_V (\mathring{D} \cdot a)_L \, {}^*d^{q+1}x^L = \int_V \mathring{D}^n a_{nL} \, {}^*d^{q+1}x^L = \oint_{\partial V} a_K \, {}^*d^q x^K , \tag{15.99}
\]

where $(\mathring{D} \cdot a)_L$ denotes the components of the torsion-free covariant divergence, equation (15.38a), ${}^*d^q x^K$
denotes the dual q-volume element, equation (15.80), and K and L are implicitly summed over distinct
antisymmetric sequences of p and p−1 indices respectively.

In the mathematicians’ notation, Gauss’ theorem (15.99) is 


(jeri) a *(da) = (r g ‘a, (15.100) 


ov 


the signs coming from commuting the pseudoscalar $I_N$ through da on the left hand side and through a on
the right hand side. Equivalently,


(= f (aa) = f aas fa. (15.101) 


the (—)‘—! sign coming from commuting the pseudoscalar through the 1-form exterior derivative d. 

In the remainder of this book, the dual q-volume element ${}^*d^q x^{k...l}$ is often abbreviated to $d^q x^{k...l}$ without
the Hodge star symbol, since the dual nature is usually evident from the number of indices k...l, which is q
for the standard q-volume, or p = N−q for the dual q-volume. The only ambiguity occurs when q = p = N/2.
For example, the dual N-volume element, which is a scalar, will be abbreviated to $d^N x$, whereas the standard
N-volume element, which is a pseudoscalar, is written $d^N x^{1...N}$.

Beware that physics texts commonly use $d^N x$ to denote the pseudoscalar N-volume, and $e \, d^N x$ or equivalently
$\sqrt{-g} \, d^N x$ to denote the dual scalar N-volume. The common physics convention seems designed to
confuse the smart student who expects a notation that manifests, not obscures, the transformational properties
of a volume element.

The simplest and most common application of Gauss’ theorem is where $a = a_m \, dx^m$ is a 1-form, in which
case

\[
\int_V \mathring{D}^m a_m \, d^N x = \oint_{\partial V} a_m \, d^{N-1}x^m , \tag{15.102}
\]

where, as just remarked, $d^N x$ and $d^{N-1}x^m$ denote respectively the dual scalar N-volume and the dual vector
(N−1)-volume.
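A small numerical check of Gauss’ theorem for a vector may help fix the role of the vielbein determinant e in the scalar volume element. The sketch below is an added illustration, not the book's own example: it works in the flat plane in polar coordinates $(r, \phi)$, where e = r, uses hypothetical coordinate-frame components `a_r` and `a_phi`, and assumes the standard coordinate-frame identity that the torsion-free divergence equals $e^{-1} \partial_\mu (e \, a^\mu)$.

```python
import math

def e(r, phi):
    """Vielbein determinant (square root of the metric determinant) in polar coordinates."""
    return r

# Hypothetical coordinate-frame vector components (a^r, a^phi), for illustration only.
def a_r(r, phi):
    return r * r * math.cos(phi)

def a_phi(r, phi):
    return math.sin(phi)

def midpoint(f, lo, hi, n=200):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def divergence(r, phi, eps=1e-5):
    """Covariant divergence e^{-1} d_mu (e a^mu), by central differences."""
    d_r = (e(r + eps, phi) * a_r(r + eps, phi) - e(r - eps, phi) * a_r(r - eps, phi)) / (2 * eps)
    d_phi = (e(r, phi + eps) * a_phi(r, phi + eps) - e(r, phi - eps) * a_phi(r, phi - eps)) / (2 * eps)
    return (d_r + d_phi) / e(r, phi)

def volume_integral(r0, r1, p0, p1):
    """Integral of the divergence against the scalar volume element e dr dphi."""
    return midpoint(lambda phi: midpoint(lambda r: divergence(r, phi) * e(r, phi), r0, r1), p0, p1)

def surface_integral(r0, r1, p0, p1):
    """Outward flux of e a^mu through the four coordinate edges of the region."""
    radial = midpoint(lambda phi: e(r1, phi) * a_r(r1, phi) - e(r0, phi) * a_r(r0, phi), p0, p1)
    angular = midpoint(lambda r: e(r, p1) * a_phi(r, p1) - e(r, p0) * a_phi(r, p0), r0, r1)
    return radial + angular

region = (1.0, 2.0, 0.0, math.pi / 2.0)
vol = volume_integral(*region)
surf = surface_integral(*region)
print(vol, surf)
```

For this choice of components both integrals can be done by hand and equal 8.5, so the two printed numbers should agree with each other and with that value to quadrature accuracy.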


15.15 Dirac delta-function 


A Dirac delta-function can be thought of as a special function that is infinity at the origin, zero everywhere 
else, and has unit volume in the sense that it yields one when integrated over any region containing the 
origin. In curved spacetime, in order that the integral be a scalar, the p-dimensional Dirac delta-function 
must transform oppositely to the p-dimensional volume element. 

The p-dimensional Dirac delta-function $\delta^p(x)$ is defined such that for any scalar function f(x), the integral
over any p-volume element containing the origin x = 0,

\[
\int f(x) \, \delta^p_K(x) \, d^p x^K = f(0) , \tag{15.103}
\]

yields the value f(0) of the function at the origin. The p-dimensional Dirac delta-function is an antisymmetric
tensor of rank p, with components $\delta^p_K(x)$, where K runs over distinct antisymmetric sequences of p indices.

The dual q-dimensional Dirac delta-function ${}^*\delta^q(x)$ with q = N−p, is defined to behave similarly when
integrated over the dual q-volume element ${}^*d^q x^K$ defined by equation (15.80),

\[
\int f(x) \, {}^*\delta^q_K(x) \, {}^*d^q x^K = f(0) . \tag{15.104}
\]

The dual q-dimensional Dirac delta-function ${}^*\delta^q(x)$ is an antisymmetric tensor of rank p, with components
${}^*\delta^q_K(x)$ where K runs over distinct antisymmetric sequences of p indices.

As with the dual q-volume, the dual Dirac delta-function ${}^*\delta^q_{k...l}(x)$ will often be abbreviated in this book
to $\delta^q_{k...l}(x)$ without the Hodge star symbol, since the dual nature can usually be inferred from the number p
of indices k...l.

The most common use of the Dirac delta-function is in integration over N-dimensional space,

\[
\int f(x) \, \delta^N(x) \, d^N x = f(0) , \tag{15.105}
\]

where $\delta^N(x)$ and $d^N x$ denote respectively the dual scalar N-dimensional Dirac delta-function, and the dual
scalar N-volume. The lack of indices on $\delta^N(x)$ and $d^N x$ signals that they are scalars.
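The statement that the delta-function must transform oppositely to the volume element can be illustrated in one dimension with a narrow Gaussian standing in for the delta-function. This is a sketch under stated assumptions, added for illustration: the constant vielbein `E` (so that proper length is x = E y in terms of a coordinate y), the test function `f`, and the Gaussian width are all hypothetical choices. The scalar delta-function is paired with the scalar volume element e dy, while the equivalent coordinate density absorbs the factor e and is paired with the bare coordinate interval dy.

```python
import math

def nascent_delta(x, width):
    """Unit-area Gaussian of the given width, approximating a Dirac delta-function."""
    return math.exp(-0.5 * (x / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def f(x):
    """Arbitrary smooth test function of proper position x, with f(0) = 1."""
    return math.cos(x) + x

E = 3.0        # hypothetical constant vielbein: proper length x = E * y
WIDTH = 0.01   # width of the Gaussian in proper length

# Scalar delta-function of proper position, integrated against the scalar volume e dy.
scalar_form = midpoint(lambda y: f(E * y) * nascent_delta(E * y, WIDTH) * E, -1.0 / E, 1.0 / E)

# Coordinate density e * delta, integrated against the bare coordinate interval dy.
density_form = midpoint(lambda y: f(E * y) * nascent_delta(y, WIDTH / E), -1.0 / E, 1.0 / E)

print(scalar_form, density_form, f(0.0))
```

Both integrals approach f(0) as the width shrinks; the factor e can be carried either by the volume element or by the delta-function, but not by both.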
15.16 Integration of multivector-valued forms


In Chapter 16, §16.14, it will be found that the Hilbert action of general relativity takes its most insightful 
form when expressed in the language of multivector-valued forms. These are forms whose coefficients are 
themselves multivectors, 


\[
a = a_\Lambda \, d^p x^\Lambda = a_{K\Lambda} \, \gamma^K \, d^p x^\Lambda , \tag{15.106}
\]

implicitly summed over distinct sequences K of multivector indices and distinct sequences $\Lambda$ of p coordinate
indices. The advantage of the multivector-valued forms notation is that it makes manifest the two distinct
symmetries of general relativity: Lorentz transformations, encoded in the transformation of the multivector,
and translations (coordinate transformations), encoded in the transformation of the form.

Stokes’ theorem for multivector-valued forms is an immediate generalization of Stokes’ theorem (15.89)
for forms: the integral of the exterior derivative da of a p-form multivector a, equation (15.106), over a
(p+1)-dimensional hypersurface V equals the integral of the p-form multivector a over the p-dimensional
boundary $\partial V$ of the hypersurface:

\[
\int_V da = \oint_{\partial V} a . \tag{15.107}
\]

In other words, the fact that the coordinate components $a_\Lambda$ of the form a are themselves multivectors leaves
Stokes’ theorem intact and unchanged.


Exercise 15.4. Action principle for strings and branes in arbitrary dimensions. The action for a 
point particle is, up to a factor, the integral of the proper time along the worldline of the particle, equa- 
tion (4.7). Similarly, a consistent action for a (p—1)-dimensional object in N-dimensional spacetime is, up 
to a factor, the integral of the proper area of the p-dimensional worldtube of the object. String theorists call 
such an object a (p−1)-brane, with p = 1 for a point particle and p = 2 for a string. Let $\lambda^\alpha$, $\alpha = 1, ..., p$, be
p coordinates on the p-dimensional worldtube of the brane. The action of the (p−1)-brane is

\[
S_p = -\mu \int_{\lambda_{\rm i}}^{\lambda_{\rm f}} d^p\lambda = -\mu \int_{\lambda_{\rm i}}^{\lambda_{\rm f}} e \, d^p\lambda^{1...p} , \tag{15.108}
\]

where $d^p\lambda$ is the dual scalar p-volume element, $d^p\lambda^{1...p}$ is the pseudoscalar p-volume element, and $e \equiv |e^a{}_\alpha|$
is the vielbein determinant, equation (15.87). The action has units of mass × length (angular momentum),
so the constant $\mu$, the tension of the brane, has units of mass/length$^{p-1}$. For example, for a string, p = 2,
the tension $\mu$ has dimensions of mass per unit length. Notice that it is built into the action (15.108) that the
tension $\mu$ of the brane, its mass per unit proper length$^{p-1}$, is constant. Thus the brane behaves like a thin
shell with a vacuum internal equation of state. The minus sign in the brane action (15.108) arises for the
same reason as the minus sign in the action (4.7) for a particle: when one dimension is timelike, the principle
of least spatial area is replaced by the principle of most spacetime area. A positive $\mu$ implies a positive proper
mass/length$^{p-1}$ of the brane. For strings, p = 2, the action (15.108) is known as the Nambu-Goto action.
1. Derive the equations of motion that follow from the action (15.108). 
2. As previously for a point particle, §4.3, the standard version of the brane Lagrangian, equation (15.113),
involves a square root, and is not in (super-)Hamiltonian form. Recast the action (15.108) into (super-)
Hamiltonian form.
3. Derive the energy-momentum tensor of the brane.
Solution.
1. Standard Lagrangian. The Lagrangian of the (p−1)-brane with action (15.108) is $L_p = -\mu e$. The
Lagrangian approach requires that the Lagrangian be expressed in terms of coordinates and velocities.
Let $x^\mu$ be N coordinates of the N-dimensional spacetime in which the brane propagates, with line-element
$ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu$. The coordinates $x^\mu$ on the worldtube are functions $x^\mu(\lambda^\alpha)$ of the worldtube
coordinates $\lambda^\alpha$. The induced metric $h_{\alpha\beta}$ on the p-dimensional worldtube of the brane, satisfying
$ds^2 = h_{\alpha\beta} \, d\lambda^\alpha d\lambda^\beta$, is related to the metric $g_{\mu\nu}$ of the full spacetime by

\[
h_{\alpha\beta} = g_{\mu\nu} \frac{\partial x^\mu}{\partial\lambda^\alpha} \frac{\partial x^\nu}{\partial\lambda^\beta} = g_{\mu\nu} u^\mu{}_\alpha u^\nu{}_\beta , \tag{15.109}
\]

where the velocities $u^\mu{}_\alpha$ are defined by
ox” 
ufa = ae , Pasa, eH aap (15.110) 


In terms of an orthonormal tetrad whose first p vectors $\gamma_a$ are tangent to the worldtube, the metric $h_{\alpha\beta}$
of the worldtube is

\[
h_{\alpha\beta} = \eta_{ab} \, e^a{}_\alpha e^b{}_\beta . \tag{15.111}
\]

Coordinate indices $\mu, \nu, ...$ are raised and lowered with the spacetime metric $g_{\mu\nu}$, while worldtube indices
$\alpha, \beta, ...$ are raised and lowered with the worldtube metric $h_{\alpha\beta}$. The determinant h of the metric on the
worldtube is

\[
h \equiv |h_{\alpha\beta}| = |\eta_{ab}| \, |e^a{}_\alpha| \, |e^b{}_\beta| = -e^2 , \tag{15.112}
\]

where e is the vielbein determinant, the same determinant as that in the action (15.108). The minus
sign in equation (15.112) assumes that the worldtube progresses in time, so that one of the dimensions
of the worldtube is timelike, hence $|\eta_{ab}| = -1$. The Lagrangian of the (p−1)-brane is then

\[
L_p = -\mu e = -\mu \sqrt{-h} = -\mu \sqrt{-|h_{\alpha\beta}|} = -\mu \sqrt{-|g_{\mu\nu} u^\mu{}_\alpha u^\nu{}_\beta|} . \tag{15.113}
\]


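The variation of the Lagrangian satisfies, from equation (2.77) for the variation of a determinant,

\[
e^{-1} \delta L_p = -\mu \, \delta \ln e = -\tfrac{1}{2} \mu \, \delta \ln h = -\tfrac{1}{2} \mu h^{\alpha\beta} \delta h_{\alpha\beta} = -\tfrac{1}{2} \mu h^{\alpha\beta} \, \delta ( g_{\mu\nu} u^\mu{}_\alpha u^\nu{}_\beta ) . \tag{15.114}
\]

The variations of the Lagrangian with respect to the velocities $u^\kappa{}_\alpha$ and coordinates $x^\kappa$ are therefore

\[
p_\kappa{}^\alpha \equiv \frac{\delta L_p}{\delta u^\kappa{}_\alpha} = -\mu e h^{\alpha\beta} g_{\kappa\nu} \frac{\partial x^\nu}{\partial\lambda^\beta} = -\mu e \, u_\kappa{}^\alpha , \tag{15.115a}
\]

\[
\frac{\delta L_p}{\delta x^\kappa} = -\tfrac{1}{2} \mu e h^{\alpha\beta} \frac{\partial g_{\mu\nu}}{\partial x^\kappa} \frac{\partial x^\mu}{\partial\lambda^\alpha} \frac{\partial x^\nu}{\partial\lambda^\beta} = -\mu e \, \mathring{\Gamma}_{\mu\nu\kappa} u^{\mu\alpha} u^\nu{}_\alpha , \tag{15.115b}
\]

the variation (15.115a) with respect to velocities $u^\kappa{}_\alpha$ defining the generalized momenta $p_\kappa{}^\alpha$. In
equation (15.115b) the coordinate derivatives of the metric have been replaced by torsion-free coordinate
connections (Christoffel symbols), equation (2.59); the connections are torsion-free because the
coordinate connection symmetrized over its first two indices is the torsion-free coordinate connection,
$\Gamma_{(\mu\nu)\kappa} = \mathring{\Gamma}_{(\mu\nu)\kappa}$, equation (2.64). Linearity of the derivative, equation (4.3), implies that the variation of
the velocity equals the velocity of the variation, $\delta(\partial x^\kappa / \partial\lambda^\alpha) = \partial \delta x^\kappa / \partial\lambda^\alpha$. The variation of the brane
action with respect to coordinates and velocities is then

\[
\delta S_p = \int \left( p_\kappa{}^\alpha \frac{\partial \delta x^\kappa}{\partial\lambda^\alpha} + \frac{\delta L_p}{\delta x^\kappa} \delta x^\kappa \right) d^p\lambda^{1...p} . \tag{15.116}
\]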


The first term on the right hand side of equation (15.116) can be integrated by parts. To do so, recall 
that Gauss’ theorem (15.99) involves the integral of a torsion-free divergence, which in the present 
application takes the form 


\[
\int \mathring{D}_\alpha a^\alpha \, d^p\lambda = \oint a^\alpha \, d^{p-1}\lambda_\alpha , \tag{15.117}
\]


with dPA = edP\!-P = ey? dPA P and dPTIAY = ey ht dP-1\!-%~P the dual proper p and p—1 


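volume elements. The integration by parts is accomplished through

\[
p_\kappa{}^\alpha \frac{\partial \delta x^\kappa}{\partial\lambda^\alpha}
= \frac{\partial ( p_\kappa{}^\alpha \delta x^\kappa )}{\partial\lambda^\alpha} - \frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} \delta x^\kappa
= e \mathring{D}_\alpha ( e^{-1} p_\kappa{}^\alpha \delta x^\kappa ) - \frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} \delta x^\kappa . \tag{15.118}
\]

The variation of the action (15.116) becomes, after integration by parts,

\[
\delta S_p = \oint e^{-1} p_\kappa{}^\alpha \delta x^\kappa \, d^{p-1}\lambda_\alpha + \int \left( - \frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} + \frac{\delta L_p}{\delta x^\kappa} \right) \delta x^\kappa \, d^p\lambda^{1...p} . \tag{15.119}
\]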


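As usual, application of least action requires the coordinates to be held fixed on the boundary, so $\delta x^\kappa$
vanishes on the boundary, and the surface term in equation (15.119) vanishes. Requiring the variation
of the action to vanish for all possible variations $\delta x^\kappa$ on the worldtube then implies the equation of
motion

\[
\frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} = \frac{\delta L_p}{\delta x^\kappa} = -\mu e \, \mathring{\Gamma}_{\mu\nu\kappa} u^{\mu\alpha} u^\nu{}_\alpha . \tag{15.120}
\]

The equation of motion (15.120) may also be written as the vanishing of the torsion-free covariant
divergence of the velocity $u_\kappa{}^\alpha = -p_\kappa{}^\alpha / (\mu e)$,

\[
\mathring{D}_\alpha u_\kappa{}^\alpha = \frac{\partial u_\kappa{}^\alpha}{\partial\lambda^\alpha} + \mathring{\Gamma}^\beta_{\beta\alpha} u_\kappa{}^\alpha - \mathring{\Gamma}^\mu_{\kappa\nu} u_\mu{}^\alpha u^\nu{}_\alpha = 0 , \tag{15.121}
\]

in which the connection term $\mathring{\Gamma}^\beta_{\beta\alpha} = \partial \ln e / \partial\lambda^\alpha$ (all worldtube indices), equation (2.79), enforces covariance
with respect to the worldtube coordinate index $\alpha$ of the velocity, while the connection term $\mathring{\Gamma}^\mu_{\kappa\nu}$
(all external indices) enforces covariance with respect to the external coordinate index $\kappa$ of the velocity.

As a check, for a point particle, p = 1, of mass $\mu = m$, the Lagrangian (15.113) is $L_m = -m e$ with
$e = d\tau / d\lambda$, equation (4.8), the induced metric (15.136) is $h_{00} = -e^2$ with inverse $h^{00} = -e^{-2}$, the
momentum (15.115a) is $p_\kappa = m e^{-1} g_{\kappa\nu} \, dx^\nu / d\lambda = m g_{\kappa\nu} \, dx^\nu / d\tau$, and the equation of motion (15.120)
reduces to

\[
\frac{dp_\kappa}{d\lambda} = m \, \mathring{\Gamma}_{\mu\nu\kappa} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} \frac{d\lambda}{d\tau} , \tag{15.122}
\]

in agreement with equation (4.12).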


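2. (Super-)Hamiltonian. The standard brane Lagrangian (15.113) is not in (super-)Hamiltonian form.
To take Hamiltonian form, the brane action must take the form, analogous to the action (4.69) for a
point particle,

\[
S_p = \int \left( p_\kappa{}^\alpha \frac{\partial x^\kappa}{\partial\lambda^\alpha} - H_p \right) d^p\lambda^{1...p} , \tag{15.123}
\]

with brane Lagrangian

\[
L_p = p_\kappa{}^\alpha \frac{\partial x^\kappa}{\partial\lambda^\alpha} - H_p . \tag{15.124}
\]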

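The brane Hamiltonian is to be considered as a function $H_p(x^\kappa, p_\kappa{}^\alpha)$ of independent coordinates $x^\kappa$ and
momenta $p_\kappa{}^\alpha$. However, as in the case of a point particle, §4.6, the equations of motion must be independent
of the arbitrary coordinates $\lambda^\alpha$ that label the worldtube: the Hamiltonian must be reparametrization
independent. To achieve independence with respect to the choice of worldtube coordinates, it is
necessary to treat the brane Hamiltonian as a function $H_p(x^\kappa, p_\kappa{}^\alpha, h_{\alpha\beta})$ not only of coordinates and
momenta, but also of an independent worldtube metric $h_{\alpha\beta}$. Invariance of the Hamiltonian with respect
to variations of the worldtube metric $h_{\alpha\beta}$ emerges as an equation of motion (15.128c).
The variation of the first term in the integrand of the action (15.123) is

\[
\begin{aligned}
\delta \left( p_\kappa{}^\alpha \frac{\partial x^\kappa}{\partial\lambda^\alpha} \right)
&= \delta p_\kappa{}^\alpha \frac{\partial x^\kappa}{\partial\lambda^\alpha} + p_\kappa{}^\alpha \frac{\partial \delta x^\kappa}{\partial\lambda^\alpha}
= \delta p_\kappa{}^\alpha \frac{\partial x^\kappa}{\partial\lambda^\alpha} + \frac{\partial ( p_\kappa{}^\alpha \delta x^\kappa )}{\partial\lambda^\alpha} - \frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} \delta x^\kappa \\
&= \delta p_\kappa{}^\alpha \frac{\partial x^\kappa}{\partial\lambda^\alpha} + e \mathring{D}_\alpha ( e^{-1} p_\kappa{}^\alpha \delta x^\kappa ) - \frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} \delta x^\kappa .
\end{aligned}
\tag{15.125}
\]

The term involving the torsion-free divergence integrates by parts to a surface term. The variation of
the action (15.123) with Hamiltonian $H_p(x^\kappa, p_\kappa{}^\alpha, h_{\alpha\beta})$ is

\[
\begin{aligned}
\delta S_p = {} & \oint e^{-1} p_\kappa{}^\alpha \delta x^\kappa \, d^{p-1}\lambda_\alpha \\
& + \int \left[ \delta p_\kappa{}^\alpha \left( \frac{\partial x^\kappa}{\partial\lambda^\alpha} - \frac{\delta H_p}{\delta p_\kappa{}^\alpha} \right) - \left( \frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} + \frac{\delta H_p}{\delta x^\kappa} \right) \delta x^\kappa - \frac{\delta H_p}{\delta h_{\alpha\beta}} \delta h_{\alpha\beta} \right] d^p\lambda^{1...p} .
\end{aligned}
\tag{15.126}
\]

The surface term vanishes provided that the coordinates are held fixed, $\delta x^\kappa = 0$, on the boundary.
The (super-)Hamiltonian that correctly recovers the relation (15.115a) between brane velocities and
momenta, and the brane equations of motion (15.120), is

\[
H_p = - \frac{1}{2 \mu e} h_{\alpha\beta} g^{\mu\nu} p_\mu{}^\alpha p_\nu{}^\beta - \frac{(p-2) \mu e}{2} , \tag{15.127}
\]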
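in which e is to be considered a function $e \equiv \sqrt{-h} = \sqrt{-|h_{\alpha\beta}|}$ of the worldtube metric $h_{\alpha\beta}$. Coordinate
indices $\mu, \nu, ...$ are raised and lowered with the spacetime metric $g_{\mu\nu}$, while worldtube indices $\alpha, \beta, ...$
are raised and lowered with the independent worldtube metric $h_{\alpha\beta}$.

The equations of motion that follow from the vanishing of the variation (15.126) of the brane action
with the brane Hamiltonian (15.127) are

\[
\frac{\partial x^\kappa}{\partial\lambda^\alpha} = \frac{\delta H_p}{\delta p_\kappa{}^\alpha} = - \frac{1}{\mu e} h_{\alpha\beta} g^{\kappa\nu} p_\nu{}^\beta , \tag{15.128a}
\]

\[
\frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} = - \frac{\delta H_p}{\delta x^\kappa} = \frac{1}{2 \mu e} h_{\alpha\beta} \frac{\partial g^{\mu\nu}}{\partial x^\kappa} p_\mu{}^\alpha p_\nu{}^\beta , \tag{15.128b}
\]

\[
0 = \frac{\delta H_p}{\delta h_{\alpha\beta}} = - \frac{1}{2 \mu e} \left( g^{\mu\nu} p_\mu{}^\alpha p_\nu{}^\beta - \tfrac{1}{2} h^{\alpha\beta} \left( p_\mu{}^\gamma p^\mu{}_\gamma - (p-2) (\mu e)^2 \right) \right) . \tag{15.128c}
\]

Equation (15.128c) imposes that $h^{\alpha\beta}$ be proportional to $g^{\mu\nu} p_\mu{}^\alpha p_\nu{}^\beta$. Taking the trace of equation (15.128c),
and bearing in mind that $h_{\alpha\beta} h^{\alpha\beta} = \delta^\alpha_\alpha = p$ (the brane dimension), implies the normalization condition

\[
0 = \frac{p-2}{4 \mu e} \left( p_\mu{}^\alpha p^\mu{}_\alpha - p (\mu e)^2 \right) . \tag{15.129}
\]

For all branes except strings, equation (15.129) implies the normalization

\[
g^{\mu\nu} p_\mu{}^\alpha p_\nu{}^\beta = (\mu e)^2 h^{\alpha\beta} \quad \mbox{for } p \neq 2 . \tag{15.130}
\]

The reason there is no normalization condition for strings is that, under a conformal rescaling of the
worldtube metric $h_{\alpha\beta}$ by a scale factor a,

\[
h_{\alpha\beta} \rightarrow a^2 h_{\alpha\beta} , \quad h \equiv |h_{\alpha\beta}| \rightarrow a^{2p} h , \quad e \equiv \sqrt{-h} \rightarrow a^p e , \tag{15.131}
\]

the brane Hamiltonian (15.127) transforms as

\[
H_p \rightarrow - \frac{a^{2-p}}{2 \mu e} h_{\alpha\beta} g^{\mu\nu} p_\mu{}^\alpha p_\nu{}^\beta - \frac{(p-2) \mu e \, a^p}{2} , \tag{15.132}
\]

which implies that the Hamiltonian (15.127) is conformally invariant for a string,

\[
H_p \rightarrow H_p \quad \mbox{for } p = 2 . \tag{15.133}
\]

The conformal invariance (15.133), commonly called Weyl invariance, of the string Hamiltonian implies
that equation (15.129) is satisfied automatically without any normalization condition on $h_{\alpha\beta}$. The
conformal invariance of the string Hamiltonian is at the heart of some of the magic of string theory.

For non-strings, p ≠ 2, the normalization (15.130) along with equations (15.128a) and (15.128c)
recover the relation (15.136) for the worldtube metric $h_{\alpha\beta}$. After the normalization (15.130) is imposed,
the value of the brane Hamiltonian (15.127) is

\[
H_p = \mu e (1 - p) \quad \mbox{for } p \neq 2 . \tag{15.134}
\]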
The value of the brane Lagrangian (15.124) is

\[
L_p = p_\kappa{}^\alpha u^\kappa{}_\alpha - H_p = -\mu e \quad \mbox{for } p \neq 2 , \tag{15.135}
\]

in agreement with the standard brane Lagrangian (15.113).
For strings, p = 2, let the true worldtube metric be denoted with a happy sign, $\breve{h}_{\alpha\beta}$,

\[
\breve{h}_{\alpha\beta} \equiv g_{\mu\nu} u^\mu{}_\alpha u^\nu{}_\beta . \tag{15.136}
\]

For strings, equation (15.128c) implies that $h_{\alpha\beta}$ is proportional to the true worldtube metric $\breve{h}_{\alpha\beta}$, but
leaves the normalization arbitrary. For strings, the Hamiltonian (15.127), with momenta $p_\kappa{}^\alpha$ eliminated
in favour of velocities $u^\kappa{}_\alpha$ using equation (15.128a), is

\[
H_p = - \tfrac{1}{2} \mu e h^{\alpha\beta} \breve{h}_{\alpha\beta} \quad \mbox{for } p = 2 . \tag{15.137}
\]

But for strings, the product $e h^{\alpha\beta}$ with $e \equiv \sqrt{-h}$ is unchanged by the normalization of $h_{\alpha\beta}$, so can be
replaced by $\breve{e} \breve{h}^{\alpha\beta}$ with $\breve{e} \equiv \sqrt{-\breve{h}}$. Thus the string Hamiltonian (15.137) is, regardless of the normalization
of $h_{\alpha\beta}$,

\[
H_p = -\mu \breve{e} \quad \mbox{for } p = 2 , \tag{15.138}
\]

which agrees with the non-string Hamiltonian (15.134) if e is interpreted as the true vielbein determinant
$\breve{e}$. The value of the string Lagrangian (15.124) is

\[
L_p = -\mu \breve{e} \quad \mbox{for } p = 2 , \tag{15.139}
\]

again in agreement with the standard brane Lagrangian (15.113).
Equation (15.128b) with (15.128a) implies

\[
\frac{\partial p_\kappa{}^\alpha}{\partial\lambda^\alpha} = -\mu e h^{\alpha\beta} \mathring{\Gamma}_{\mu\nu\kappa} u^\mu{}_\alpha u^\nu{}_\beta , \tag{15.140}
\]

which recovers the brane equation of motion (15.120), for branes of arbitrary dimension p. For strings,
p = 2, the factor $e h^{\alpha\beta}$ in the equation of motion (15.140) can be replaced by $\breve{e} \breve{h}^{\alpha\beta}$, affirming that
equation (15.140) is correctly normalized also for p = 2.

For a point particle, p = 1, of mass u = m, the brane Hamiltonian (15.127) reduces to the nice 
Hamiltonian (4.96) (absent electromagnetism) if the scale factor a in the latter is identified with 


T (15.141) 
m 


where e = \/—|hool- 


For a string, p = 2, the brane Lagrangian (15.124) with Hamiltonian (15.127) is essentially the 
Polyakov (1981) Lagrangian. 


3. The energy-momentum tensor $T_{a\alpha}$ of a brane is obtained by varying the brane action $S_p$ with respect to
the worldtube vielbein $e^{a\alpha}$, equation (16.117). The energy-momentum tensor of the brane is proportional
to the worldtube vielbein, so the brane has a vacuum equation of state,


Lp de 


Taig = = = aai 15.142 
cae ae ( ) 

In the coordinate frame, the brane energy-momentum is

\[
T_{\alpha\beta} = -\mu h_{\alpha\beta} . \tag{15.143}
\]

The brane Lagrangian is independent of the Lorentz connections $\Gamma_{ab\alpha}$, so the brane carries no torsion,
equation (16.121).
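The conformal (Weyl) invariance of the string Hamiltonian in the solution above lends itself to a quick numerical check. The sketch below is an added illustration under stated assumptions: it takes the brane Hamiltonian in the form $H_p = -h_{\alpha\beta} P^{\alpha\beta} / (2 \mu e) - (p-2) \mu e / 2$ as read from equations (15.127) and (15.132), where $P^{\alpha\beta} \equiv g^{\mu\nu} p_\mu{}^\alpha p_\nu{}^\beta$ is a shorthand introduced here; the matrices `h2`, `h3`, `P2`, `P3` and the numbers `mu`, `a` are arbitrary hypothetical values. It verifies that rescaling $h_{\alpha\beta} \rightarrow a^2 h_{\alpha\beta}$ leaves $H_p$ unchanged for p = 2 but not for p = 3, and that imposing the normalization (15.130) gives $H_p = \mu e (1 - p)$.

```python
import math

def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small p)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse(m):
    """Inverse via the adjugate: inverse[i][j] = cofactor(j, i) / det."""
    d, n = det(m), len(m)
    if n == 1:
        return [[1.0 / d]]
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)] for i in range(n)]

def hamiltonian(h, P, mu):
    """Assumed form H_p = -(h_ab P^ab)/(2 mu e) - (p-2) mu e / 2, with e = sqrt(-det h)."""
    p = len(h)
    e = math.sqrt(-det(h))
    hP = sum(h[a][b] * P[a][b] for a in range(p) for b in range(p))
    return -hP / (2.0 * mu * e) - (p - 2) * mu * e / 2.0

def rescale(h, a):
    """Conformal rescaling h_ab -> a^2 h_ab."""
    return [[a * a * x for x in row] for row in h]

mu, a = 1.3, 1.7
h2 = [[-1.0, 0.3], [0.3, 2.0]]                               # p = 2, one timelike direction
P2 = [[0.7, 0.1], [0.1, 1.5]]
h3 = [[-1.0, 0.2, 0.0], [0.2, 1.5, 0.1], [0.0, 0.1, 2.0]]    # p = 3
P3 = [[0.5, 0.1, 0.0], [0.1, 1.2, 0.3], [0.0, 0.3, 0.9]]

string_change = hamiltonian(rescale(h2, a), P2, mu) - hamiltonian(h2, P2, mu)
membrane_change = hamiltonian(rescale(h3, a), P3, mu) - hamiltonian(h3, P3, mu)

# Impose the normalization P^ab = (mu e)^2 h^ab and compare with mu e (1 - p).
e3 = math.sqrt(-det(h3))
P_normalized = [[(mu * e3) ** 2 * x for x in row] for row in inverse(h3)]
on_shell = hamiltonian(h3, P_normalized, mu)
print(string_change, membrane_change, on_shell, mu * e3 * (1 - 3))
```

The first printed number should vanish to rounding error, the second should not, and the last two should agree.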
Action principle for electromagnetism and
gravity


One of the profound realisations of physics in the second half of the twentieth century was that the four forces 
of the Standard Model of physics — the electromagnetic, weak, strong (or colour), and gravitational forces 
— all emerge from an action that is invariant with respect to local symmetries called gauge transformations. 
Gauge transformations rotate internal degrees of freedom of fields at each point of spacetime. 

The simplest of the forces is the electromagnetic force, which is based on the 1-dimensional unitary group 
U(1) of rotations about a circle. Since the mid 1970s, the electromagnetic group has been understood to
be the unbroken remnant of a larger electroweak group $U_Y(1) \times SU(2)$, which through interactions with a
scalar field called the Higgs field breaks down to the electromagnetic group U(1) at collision energies less
than the electroweak scale of about 1 TeV (the $U_Y(1)$ electroweak hypercharge group is not the same as the
U(1) electromagnetic group). The group SU(N) is the special unitary group in N dimensions, the group of
N-dimensional unitary matrices of unit determinant. The colour group is SU(3).

The gravitational force is likewise a gauge force. The gravitational group is the group of spacetime trans- 
formations, also known as the Poincaré group, which is the product of the 6-dimensional Lorentz group of 
rotations and the 4-dimensional group of translations.¹

It is quite remarkable that so much of physics is captured by so simple a mathematical structure as a 
group of symmetries. During the 1980s there was hope that perhaps all of physics might be described by 
some theory-of-everything group, and all that was left to do was to discover that group and figure out its 
consequences. That hope was not realised. 

Gravity has been at the heart of the problem. Whereas the three other forces are successfully described by
renormalizable quantum field theories, albeit equipped with a large number of seemingly arbitrary parameters,
gravity has resisted quantization. Currently the most successful (some would dispute that adjective)
theory of quantum gravity is string theory, or more specifically superstring theory (which includes spin-½
particles), or more specifically some enveloping theory that contains not only strings, 1-dimensional objects
sweeping out 2-dimensional worldsheets, but also fundamental objects, branes, of other dimensions. The
topic of string theory is beyond the scope of this book. Suffice to say that string theory is apparently a much
larger and richer theory than a putative theory of just our Universe. The good thing is that string theory
(probably) contains the laws of physics of our Universe. The embarrassing thing (embarras de richesses,
advocates would say) is that string theory (probably) contains many other possible laws of physics. This has
led to the conjecture that our Universe is just one of a multiverse of universes with different sets of laws
of physics. Such ideas are fascinating, but at present distanced from experimental or observational reality.
String theory remains work in progress.

¹ Technically the Poincaré group refers to the global symmetries of Minkowski space, where rotations do not commute with
translations (rotation followed by translation yields a different result from translation followed by rotation). The Poincaré
group is said to be a semi-direct product of rotations and translations. In general relativity, the translation group of
Minkowski space is replaced by general coordinate transformations, which commute with local Lorentz transformations. In
general relativity, coordinate transformations should be thought of as simply relabelling coordinates while leaving the
underlying physical spacetime unchanged; and similarly local Lorentz transformations should be thought of as changing the
tetrad axes with respect to which the locally inertial frame is measured, again while leaving the underlying spacetime
unchanged.


This Chapter starts by applying the action principle to the simple example of an unspecified template
field $\varphi$, deriving the Euler-Lagrange equations in §16.1, and Hamilton’s equations in §16.2.


The Chapter goes on to apply the action principle to the simplest example of a gauge field, the electromag- 
netic field, first in index notation, §16.5, and then in the more difficult but powerful language of differential 
forms, §16.6. The electromagnetic example brings out features that appear in more complicated form in 
the gravitational field. Notably, the covariant equations of motion for the electromagnetic field resolve into 
genuine equations of motion for physical degrees of freedom, constraint equations whose ongoing satisfaction 
is guaranteed by conservation laws arising from gauge symmetries, and identities that define auxiliary fields 
that arise in a covariant treatment. 


The Chapter then proceeds to apply the action principle to the Hilbert (1915) Lagrangian to derive the 
equations of motion of gravity, namely the Einstein equations, along with equations for the connection 
coefficients. The tetrad-frame approach followed in this Chapter makes manifest the dependence of Hilbert’s 
Lagrangian on the two distinct symmetries of general relativity, namely symmetry with respect to local 
Lorentz transformations, and symmetry with respect to general coordinate transformations. 


The Chapter treats the gravitational action using three different mathematical languages, progressing 
from the more explicit to the more abstract. The first approach, starting at §16.7, lays out all indices 
explicitly. The second approach, §16.13, uses multivectors. The final approach, §16.14, uses multivector- 
valued differential forms. The multivector forms notation provides an elegant formulation of the definitions 
of curvature and torsion, equations (16.208) and (16.212), first formulated by Cartan (1904), and elegant 
versions of the equations of motion (16.250) that govern them. The dense, abstract notation can be hard to 
unravel (which is why more explicit approaches are helpful), but offers the clearest picture of the structure 
of the gravitational equations. A clear picture is essential both from the practical perspective of numerical 
relativity, and from the esoteric perspective of aspiring to a deeper understanding of the unsolved mysteries 
of (quantum) gravity. 


As expounded in Chapter 15, $d^4x$ denotes the invariant scalar 4-volume element, equation (15.80), while
$d^4x^{0123}$ denotes the pseudoscalar coordinate 4-volume element, the indices 0123 serving as a reminder that
the coordinate 4-volume element is a totally antisymmetric coordinate tensor of rank 4. The two are related
by a factor of the determinant e of the vierbein, $d^4x = e \, d^4x^{0123}$, equation (15.88).

16.1 Euler-Lagrange equations for a generic field


Let $\varphi(x^\mu)$ denote some unspecified classical continuous field defined throughout spacetime. The least action
principle asserts that the equations of motion governing the field can be obtained by minimizing an action
S, which is asserted to be an integral over spacetime of a certain scalar Lagrangian. The scalar Lagrangian is
asserted to be a function $L(\varphi, \mathring{D}_\kappa \varphi)$ of “coordinates” which are the values of the field $\varphi(x^\mu)$ at each point of
spacetime, and of “velocities” which are the torsion-free covariant derivatives $\mathring{D}_\kappa \varphi$ of the field. The torsion-free
covariant derivatives are prescribed because application of the least action principle involves integration
by parts, and, as established in Chapter 15, equations (15.91) or (15.99), it is precisely the torsion-free
covariant derivative that can be integrated to yield surface terms.

It should be commented that in the case of spinors $\psi$, the Lagrangian can be considered to be a function
$L(\psi, D_\kappa \psi)$ of the spinor field $\psi$ and its torsion-full covariant derivative $D_\kappa \psi$, since Gauss’ theorem occurs in
a form (40.21) where the contortion contribution vanishes on integration by parts. The action principle for
spinor fields is deferred to Chapter 41.

The Lagrangian $L(\varphi, \mathring{D}_\kappa \varphi)$ is actually a function of functions. Mathematicians refer to such a thing as
a functional. Derivatives of a functional with respect to the functions it depends on are called functional
derivatives, or variational derivatives, and are denoted with a $\delta$ symbol. For example, the derivative of the
functional L with respect to the function $\varphi$ is denoted $\delta L / \delta \varphi$.

Least action postulates that the evolution of the field is such that the action 


ve . 
S -| L(y, Duy) d'z (16.1) 
ri 


takes a minimum value with respect to arbitrary variations of the field, subject to the constraint that the field 
is fixed on its boundary, the initial and final surfaces. The integral in equation (16.1) is over 4-dimensional 
spacetime between a fixed initial 3-dimensional hypersurface and a fixed final 3-dimensional hypersurface, 
labelled respectively A; and A¢. The variation ôS of the action with respect to the field and its derivatives is 


rey? ôL 2 
ri op 6(Duy) 
Linearity of the covariant derivative, 
Duly + öp) = Duy + Dlg) , (16.3) 


implies that the variation of the derivative equals the derivative of the variation, 5(Duw) = D, (59). Define 
the canonical momentum x conjugate to the field y to be 
ôL 
Ò(Dup) 


The second term in the integrand of equation (16.2) can be written

$$ \pi^\mu \, \delta(\mathring{D}_\mu\varphi) = \mathring{D}_\mu(\pi^\mu \delta\varphi) - (\mathring{D}_\mu\pi^\mu) \, \delta\varphi . \tag{16.5} $$

The first term on the right hand side of equation (16.5) is a torsion-free covariant divergence, which integrates
to a surface term. With the second term in the integrand of equation (16.2) thus integrated by parts, the
variation of the action is


$$ \delta S = \left[ \oint \pi^\mu \, \delta\varphi \, d^3x_\mu \right]_{\lambda_i}^{\lambda_f} + \int_{\lambda_i}^{\lambda_f} \left( \frac{\delta L}{\delta\varphi} - \mathring{D}_\mu\pi^\mu \right) \delta\varphi \, d^4x = 0 . \tag{16.6} $$

The surface term in equation (16.6), which is an integral over each of the three-dimensional initial and
final hypersurfaces, vanishes since by hypothesis the fields are fixed on the initial and final hypersurfaces,
$\delta\varphi_i = \delta\varphi_f = 0$. Consequently the integral term must also vanish. Least action demands that the integral
vanish for all possible variations $\delta\varphi$ of the field. The only way this can happen is that the integrand must
be identically zero. The result is the Euler-Lagrange equations of motion for the field,

$$ \frac{\delta L}{\delta\varphi} - \mathring{D}_\mu\pi^\mu = 0 . \tag{16.7} $$

All of the above derivations carry through with the field $\varphi$ replaced by a set of fields $\varphi_i$, with conjugate
momenta $\pi^{i\mu} \equiv \delta L/\delta(\mathring{D}_\mu\varphi_i)$. The index $i$ could simply enumerate a list of fields, or it could signify the
components of a set of fields that transform into each other under some group of symmetries.
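
As a worked illustration of the Euler-Lagrange equations (16.7), consider a free real scalar field of mass $m$. The Lagrangian below is an assumption made purely for the sake of the example (it is not taken from this section), written for the $-+++$ signature, with $\mathring{\Box} \equiv \mathring{D}_\mu \mathring{D}^\mu$ denoting the torsion-free d’Alembertian:

```latex
\begin{align*}
L &= -\tfrac{1}{2}\, \mathring{D}^\mu\varphi \, \mathring{D}_\mu\varphi - \tfrac{1}{2}\, m^2 \varphi^2
  && \text{(assumed example Lagrangian)} \\
\pi^\mu &= \frac{\delta L}{\delta(\mathring{D}_\mu\varphi)} = -\mathring{D}^\mu\varphi ,
\qquad
\frac{\delta L}{\delta\varphi} = -m^2 \varphi \\
0 &= \frac{\delta L}{\delta\varphi} - \mathring{D}_\mu \pi^\mu
   = -m^2\varphi + \mathring{\Box}\varphi
\quad\Longrightarrow\quad
(\mathring{\Box} - m^2)\,\varphi = 0 .
\end{align*}
```

Under these assumptions the Euler-Lagrange equation reduces to the Klein-Gordon wave equation, which is the expected equation of motion for a free massive scalar field.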


16.2 Super-Hamiltonian formalism 


The Lagrangians $L$ of the fields that occur in Nature turn out to be writable in super-Hamiltonian form

$$ L = \pi^\mu \mathring{D}_\mu\varphi - H , \tag{16.8} $$

in which the super-Hamiltonian $H(\varphi, \pi^\mu)$ is a scalar function of the field $\varphi$ and its conjugate momenta $\pi^\mu$,
defined in terms of the Lagrangian by equation (16.4).

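Varying the action with Lagrangian (16.8) with respect to the field $\varphi$ and its conjugate momenta $\pi^\mu$ gives

$$ \delta S = \int_{\lambda_i}^{\lambda_f} \left( \pi^\mu \mathring{D}_\mu \delta\varphi + \delta\pi^\mu \, \mathring{D}_\mu\varphi - \frac{\delta H}{\delta\varphi} \, \delta\varphi - \frac{\delta H}{\delta\pi^\mu} \, \delta\pi^\mu \right) d^4x . \tag{16.9} $$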


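Integrating the first term in the integrand by parts brings the variation of the action to

$$ \delta S = \left[ \oint \pi^\mu \, \delta\varphi \, d^3x_\mu \right]_{\lambda_i}^{\lambda_f} + \int_{\lambda_i}^{\lambda_f} \left[ - \left( \mathring{D}_\mu\pi^\mu + \frac{\delta H}{\delta\varphi} \right) \delta\varphi + \delta\pi^\mu \left( \mathring{D}_\mu\varphi - \frac{\delta H}{\delta\pi^\mu} \right) \right] d^4x . \tag{16.10} $$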


The principle of least action requires that the variation vanish with respect to arbitrary variations $\delta\varphi$ and
$\delta\pi^\mu$ of the field and its conjugate momenta, subject to the condition that the field is held fixed on the initial
and final hypersurfaces. The result is Hamilton’s equations of motion,

$$ \mathring{D}_\mu\pi^\mu = - \frac{\delta H}{\delta\varphi} , \qquad \mathring{D}_\mu\varphi = \frac{\delta H}{\delta\pi^\mu} . \tag{16.11} $$

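As a check on Hamilton’s equations (16.11), the following sketch works them out for a free real scalar field of mass $m$. The super-Hamiltonian below is an assumption introduced only for this example (it does not appear in this section), written for the $-+++$ signature:

```latex
\begin{align*}
H &= -\tfrac{1}{2}\, \pi^\mu \pi_\mu + \tfrac{1}{2}\, m^2 \varphi^2
  && \text{(assumed example super-Hamiltonian)} \\
\mathring{D}_\mu\varphi &= \frac{\delta H}{\delta\pi^\mu} = -\pi_\mu ,
\qquad
\mathring{D}_\mu\pi^\mu = -\frac{\delta H}{\delta\varphi} = -m^2\varphi \\
L &= \pi^\mu \mathring{D}_\mu\varphi - H
   = -\tfrac{1}{2}\, \mathring{D}^\mu\varphi \, \mathring{D}_\mu\varphi - \tfrac{1}{2}\, m^2\varphi^2
  && \text{(on substituting } \pi_\mu = -\mathring{D}_\mu\varphi \text{)}
\end{align*}
```

Eliminating $\pi^\mu$ between the two first order equations gives $(\mathring{D}_\mu\mathring{D}^\mu - m^2)\varphi = 0$, the same second order equation that the Euler-Lagrange approach would give for the Lagrangian on the last line.
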
Hamilton’s equations (16.11) for the field $\varphi$ can be compared to Hamilton’s equations (4.72) for particles.


16.3 Conventional Hamiltonian formalism


The conventional Hamiltonian is not the same as the super-Hamiltonian. In the conventional Hamiltonian
formalism, the coordinates $x^\mu$ are split into a time coordinate $t$ and spatial coordinates $x^\alpha$. The momentum
$\pi$ conjugate to the field $\varphi$ is defined to be

$$ \pi \equiv \frac{\delta L}{\delta(\mathring{D}_t\varphi)} . \tag{16.12} $$

The conventional Hamiltonian $H$ is defined in terms of the Lagrangian $L$ by

$$ H \equiv \pi \mathring{D}_t\varphi - L . \tag{16.13} $$


In the context of general relativity, the covariant super-Hamiltonian approach to fields is, as in the case of 
point particles, §4.10, simpler and more natural than the non-covariant conventional Hamiltonian approach. 
Indeed, the most straightforward way to implement the conventional Hamiltonian approach is to use the 
super-Hamiltonian approach, and then carry out a 3+1 split into space and time coordinates at the end, 
rather than doing a 3+1 split at the outset. 


16.4 Symmetries and conservation laws 


Associated with every symmetry is a conserved quantity. The relation between symmetries and conserved 
quantities is called Noether’s theorem (Noether, 1918), equations (16.17) and (16.18). Examples of 
Noether’s theorem include local electromagnetic gauge symmetry implying conservation of electric charge 
(§16.5.6), local Lorentz symmetry implying conservation of angular-momentum (§16.11.1), and general co- 
ordinate transformations implying conservation of energy-momentum (§16.11.2). 

All four of the known forces of Nature, including gravity, arise from local symmetries, in which the La- 
grangian is invariant under symmetry transformations that are allowed to vary arbitrarily over spacetime. 
Commonly, such transformations change not just one field, but multiple fields at the same time. However, 
the Lagrangian of an individual field may by itself be symmetric, to the extent that the field does not inter- 
act with other fields. For example, the local gauge symmetry of electromagnetism changes simultaneously 
the electromagnetic field and all charged fields, and that symmetry implies the law of conservation of total 
electric charge. However, an individual field, such as an electron field or a proton field, may individually 
conserve charge, to the extent that the field does not interact with other fields. 

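Consider varying the template field $\varphi(x)$ by a transformation with a prescribed shape $\delta\varphi(x)$ as a function
of spacetime,

$$ \varphi(x) \rightarrow \varphi(x) + \epsilon \, \delta\varphi(x) , \tag{16.14} $$

where $\epsilon$ is an infinitesimal constant parameter. The torsion-free covariant derivatives $\mathring{D}_m\varphi$ of the field
transform correspondingly as

$$ \mathring{D}_m\varphi \rightarrow \mathring{D}_m\varphi + \epsilon \, \mathring{D}_m(\delta\varphi) . \tag{16.15} $$

The torsion-free covariant derivative is prescribed for the reason explained at the beginning of §16.1. The
variation (16.14) is a symmetry of the field if the Lagrangian $L(\varphi, \mathring{D}_m\varphi)$ is unchanged by it. The vanishing
of the variation of the Lagrangian implies

$$ \begin{aligned}
0 = \delta L &= \frac{\delta L}{\delta\varphi} \, \delta\varphi + \frac{\delta L}{\delta(\mathring{D}_m\varphi)} \, \mathring{D}_m(\delta\varphi) \\
&= \mathring{D}_m(\pi^m \delta\varphi) + \left( \frac{\delta L}{\delta\varphi} - \mathring{D}_m\pi^m \right) \delta\varphi ,
\end{aligned} \tag{16.16} $$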


with $\pi^m$ the momentum conjugate to the field, equation (16.4). The Euler-Lagrange equation of motion (16.7)
for the field implies that the second term on the last line of (16.16) vanishes. Consequently the current $j^m$
defined by

$$ j^m \equiv \pi^m \, \delta\varphi \tag{16.17} $$

is covariantly conserved,

$$ \mathring{D}_m j^m = 0 . \tag{16.18} $$

The result (16.18) is Noether’s theorem.
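
A short worked example may help make the Noether current (16.17) concrete. The two-field Lagrangian below is an assumption chosen only for illustration (it is not taken from this section): two real scalar fields $\varphi_1, \varphi_2$ of equal mass $m$, whose Lagrangian is unchanged to first order by an infinitesimal rotation of the fields into each other.

```latex
\begin{align*}
L &= -\tfrac{1}{2} \sum_{i=1}^{2} \left( \mathring{D}^m\varphi_i \, \mathring{D}_m\varphi_i + m^2 \varphi_i^2 \right)
  && \text{(assumed example Lagrangian)} \\
\delta\varphi_1 &= -\varphi_2 , \qquad \delta\varphi_2 = \varphi_1
  && \text{(prescribed shape of the symmetry)} \\
\pi_i^m &= -\mathring{D}^m\varphi_i
  && \text{(conjugate momenta, as in (16.4))} \\
j^m &= \pi_1^m \, \delta\varphi_1 + \pi_2^m \, \delta\varphi_2
     = \varphi_2 \, \mathring{D}^m\varphi_1 - \varphi_1 \, \mathring{D}^m\varphi_2 \\
\mathring{D}_m j^m &= \varphi_2 \, \mathring{\Box}\varphi_1 - \varphi_1 \, \mathring{\Box}\varphi_2
     = m^2 \varphi_2\varphi_1 - m^2 \varphi_1\varphi_2 = 0 .
\end{align*}
```

The gradient cross terms cancel by symmetry, and the last step uses the equation of motion $\mathring{\Box}\varphi_i = m^2\varphi_i$ of each field, which illustrates that the current is conserved only for fields that satisfy their equations of motion.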


16.5 Electromagnetic action 


Electromagnetism is a gauge field based on the simplest of all continuous groups, the 1-dimensional unitary 
group U(1) of rotations about a circle. 


16.5.1 Electromagnetic gauge transformations 


Under an electromagnetic gauge transformation, a field $\varphi$ of charge $e$ transforms as

$$ \varphi \rightarrow e^{-ie\theta} \varphi , \tag{16.19} $$

where the phase $\theta(x)$ is some arbitrary function of spacetime. The charge $e$ is dimensionless (in units
$c = \hbar = 1$). The Lagrangian of the charged field $\varphi$ involves the torsion-free derivative $\mathring{D}_\mu\varphi$ of the field. The
torsion-free covariant derivative is prescribed for the reason explained at the beginning of §16.1. To ensure that the
Lagrangian remains invariant also under an electromagnetic gauge transformation (16.19), the derivative $\mathring{D}_\mu$
must be augmented by an electromagnetic connection $A_\mu$, which equals the thing historically known as the
electromagnetic potential. The result is an electromagnetic gauge-covariant derivative $\mathring{D}_\mu + ieA_\mu$ with
the defining property that, when acting on the charged field $\varphi$, it transforms under the electromagnetic gauge
transformation (16.19) as

$$ (\mathring{D}_\mu + ieA_\mu) \varphi \rightarrow e^{-ie\theta} (\mathring{D}_\mu + ieA_\mu) \varphi . \tag{16.20} $$

In other words, the gauge-covariant derivative of the field $\varphi$ is required to transform under electromagnetic
gauge transformations in the same way as the field $\varphi$. The gauge-covariant derivative $\mathring{D}_\mu + ieA_\mu$ transforms
correctly provided that the gauge field $A_\mu$ transforms under the electromagnetic gauge transformation (16.19)
as

$$ A_\mu \rightarrow A_\mu + \mathring{D}_\mu\theta . \tag{16.21} $$

Since $\theta$ is a scalar phase, its covariant derivative reduces to its partial derivative, $\mathring{D}_\mu\theta = \partial\theta/\partial x^\mu$.
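
The claim that the transformation (16.21) of the potential makes the gauge-covariant derivative transform like the field can be checked in one line. The sketch below assumes the sign conventions $\varphi \rightarrow e^{-ie\theta}\varphi$ for (16.19) and $A_\mu \rightarrow A_\mu + \mathring{D}_\mu\theta$ for (16.21); with the opposite sign in either one the cancellation would fail.

```latex
\begin{align*}
\bigl( \mathring{D}_\mu + ieA_\mu + ie\,\mathring{D}_\mu\theta \bigr) \bigl( e^{-ie\theta}\varphi \bigr)
 &= e^{-ie\theta} \Bigl( -ie\,(\mathring{D}_\mu\theta)\,\varphi + \mathring{D}_\mu\varphi
      + ieA_\mu\varphi + ie\,(\mathring{D}_\mu\theta)\,\varphi \Bigr) \\
 &= e^{-ie\theta} \bigl( \mathring{D}_\mu + ieA_\mu \bigr) \varphi .
\end{align*}
```

The term generated by differentiating the phase factor is cancelled exactly by the shift in the potential, which is the defining property (16.20).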


16.5.2 Electromagnetic field tensor 


The commutator of the gauge-covariant derivative $\mathring{D}_\mu + ieA_\mu$ defines the electromagnetic field tensor $F_{\mu\nu}$,

$$ [\mathring{D}_\mu + ieA_\mu , \mathring{D}_\nu + ieA_\nu] \equiv ie F_{\mu\nu} . \tag{16.22} $$

The electromagnetic field $F_{\mu\nu}$ has the key property that it is invariant under an electromagnetic gauge
transformation (16.19), in contrast to the electromagnetic potential $A_\mu$ itself. Explicitly, the electromagnetic
field $F_{\mu\nu}$ is, from equation (16.22),

$$ \begin{aligned}
F_{\mu\nu} &\equiv \mathring{D}_\mu A_\nu - \mathring{D}_\nu A_\mu \\
&= \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} ,
\end{aligned} \tag{16.23} $$

the second line of which follows because the coordinate connections cancel in a torsion-free covariant
coordinate curl, equation (2.72). The expression on the second line of equations (16.23) is invariant under
an electromagnetic gauge transformation (16.21) thanks to the commutation of coordinate derivatives,
$\partial^2\theta/\partial x^\mu \partial x^\nu - \partial^2\theta/\partial x^\nu \partial x^\mu = 0$, so the electromagnetic field $F_{\mu\nu}$ is electromagnetic gauge-invariant as
claimed. If the torsion-free derivative $\mathring{D}_\mu$ in equation (16.23) were replaced by the torsion-full derivative $D_\mu$,
then the electromagnetic field $F_{\mu\nu}$ would not be electromagnetic gauge-invariant.
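
The gauge invariance of the field tensor can also be checked numerically. The following is a minimal sketch, assuming flat spacetime with one time and one space coordinate, so that the only independent component is $F_{tx} = \partial_t A_x - \partial_x A_t$. The potential, the gauge function, the step size and all function names are hypothetical choices made for illustration only; derivatives are approximated by central differences.

```python
import math

H = 1e-3  # finite-difference step (illustrative choice)

def potential(t, x):
    """Hypothetical potential (A_t, A_x); any smooth functions would do."""
    return (math.sin(x) * math.cos(t), x * t * t)

def theta(t, x):
    """Hypothetical gauge function theta(t, x)."""
    return math.exp(-x * x) * math.sin(3.0 * t)

def d_t(f, t, x):
    """Central-difference approximation to the partial derivative df/dt."""
    return (f(t + H, x) - f(t - H, x)) / (2.0 * H)

def d_x(f, t, x):
    """Central-difference approximation to the partial derivative df/dx."""
    return (f(t, x + H) - f(t, x - H)) / (2.0 * H)

def gauged_potential(t, x):
    """Gauge-transformed potential A_mu + d_mu theta."""
    a_t, a_x = potential(t, x)
    return (a_t + d_t(theta, t, x), a_x + d_x(theta, t, x))

def field_tx(pot, t, x):
    """F_tx = d_t A_x - d_x A_t for the potential function pot."""
    return (d_t(lambda s, y: pot(s, y)[1], t, x)
            - d_x(lambda s, y: pot(s, y)[0], t, x))

f_before = field_tx(potential, 0.3, 0.7)
f_after = field_tx(gauged_potential, 0.3, 0.7)
print(f_before, f_after, abs(f_after - f_before))
```

The two values agree up to floating-point roundoff, because the central-difference operators in $t$ and $x$ commute with each other just as the partial derivatives do. That commutation is the same property the text invokes for the continuum derivatives.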


16.5.3 Source-free Maxwell’s equations 


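For brevity, denote the electromagnetic gauge-covariant derivative by $\mathcal{D}_\mu \equiv \mathring{D}_\mu + ieA_\mu$. The gauge-covariant
derivative satisfies the Jacobi identity

$$ [\mathcal{D}_\kappa , [\mathcal{D}_\mu , \mathcal{D}_\nu]] + [\mathcal{D}_\mu , [\mathcal{D}_\nu , \mathcal{D}_\kappa]] + [\mathcal{D}_\nu , [\mathcal{D}_\kappa , \mathcal{D}_\mu]] = 0 . \tag{16.24} $$

The electromagnetic Jacobi identity (16.24) implies that

$$ \mathring{D}_\kappa F_{\mu\nu} + \mathring{D}_\mu F_{\nu\kappa} + \mathring{D}_\nu F_{\kappa\mu} = 0 . \tag{16.25} $$

Since the torsion-free coordinate connections cancel in such an antisymmetrized expression, equation (16.25)
can also be written

$$ \frac{\partial F_{\mu\nu}}{\partial x^\kappa} + \frac{\partial F_{\nu\kappa}}{\partial x^\mu} + \frac{\partial F_{\kappa\mu}}{\partial x^\nu} = 0 . \tag{16.26} $$

Equations (16.26) constitute a set of 4 equations comprising the source-free Maxwell’s equations.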


16.5.4 Electromagnetic Lagrangian 


The electromagnetic action $S_e$ is

$$ S_e = \int_{\lambda_i}^{\lambda_f} L_e \, d^4x , \tag{16.27} $$

with electromagnetic Lagrangian

$$ L_e \equiv - \frac{1}{16\pi} F^{\mu\nu} F_{\mu\nu} , \tag{16.28} $$

where $F_{\mu\nu}$ is the electromagnetic field tensor defined by equation (16.23). The electromagnetic Lagrangian
$L_e$, equation (16.28), is, as required, a scalar with respect to electromagnetic gauge transformations (16.21),
as well as with respect to coordinate and tetrad transformations. The justification for the choice (16.28) is
that it reproduces Maxwell’s equations, which have ample experimental verification. The Lagrangian (16.28)
is normalized to Gaussian units. High-energy physicists commonly use Heaviside units (SI units with $\varepsilon_0 =
\mu_0 = 1$), for which the normalization factor is $1/4$ instead of $1/(16\pi)$.

The momenta conjugate to the electromagnetic coordinates $A_\nu$ are, modulo a factor, the electromagnetic
field components $F^{\mu\nu}$,

$$ \frac{\delta L_e}{\delta(\mathring{D}_\mu A_\nu)} = - \frac{1}{4\pi} F^{\mu\nu} . \tag{16.29} $$

In Heaviside instead of Gaussian units, the factor $4\pi$ is replaced by 1, which explains why high-energy theorists
prefer Heaviside units.


In the presence of electrically charged matter, the matter action generically contains an interaction term
$S_q$,

$$ S_q = \int_{\lambda_i}^{\lambda_f} L_q \, d^4x , \tag{16.30} $$

with interaction Lagrangian $L_q$ taking the form

$$ L_q \equiv A_\nu j^\nu , \tag{16.31} $$

where $j^\nu$ is the electric current vector.
The combined electromagnetic and charged matter action $S = S_e + S_q$ is, with the Lagrangian expressed
as required in terms of the electromagnetic coordinates $A_\nu$ and their velocities $\mathring{D}_\mu A_\nu$,

$$ S = \int_{\lambda_i}^{\lambda_f} \left[ - \frac{1}{16\pi} (\mathring{D}^\mu A^\nu - \mathring{D}^\nu A^\mu)(\mathring{D}_\mu A_\nu - \mathring{D}_\nu A_\mu) + j^\nu A_\nu \right] d^4x . \tag{16.32} $$


Varying the action (16.32) with respect to the electromagnetic coordinates $A_\nu$ and their velocities $\mathring{D}_\mu A_\nu$,
along the same lines as equations (16.2) to (16.6) for the template field $\varphi$, yields

$$ \delta S = - \frac{1}{4\pi} \left[ \oint F^{\mu\nu} \, \delta A_\nu \, d^3x_\mu \right]_{\lambda_i}^{\lambda_f} + \frac{1}{4\pi} \int_{\lambda_i}^{\lambda_f} \left( \mathring{D}_\mu F^{\mu\nu} + 4\pi j^\nu \right) \delta A_\nu \, d^4x . \tag{16.33} $$

Least action requires that the variation of the action with respect to arbitrary variations $\delta A_\nu$ be zero, subject
to the constraint that the field is fixed on the boundary of integration, $\delta A_\nu = 0$. The resulting Euler-Lagrange
equations (16.7) are

$$ \mathring{D}_\mu F^{\nu\mu} = 4\pi j^\nu . \tag{16.34} $$

The factor $4\pi$ disappears if Heaviside units are used in place of Gaussian units. The Euler-Lagrange
equations (16.34) constitute 4 equations comprising the source-full Maxwell’s equations.


16.5.5 Electromagnetic super-Hamiltonian 


The electromagnetic Lagrangian (16.28), coupled with the charged matter interaction Lagrangian (16.31), is
in super-Hamiltonian form $L_e + L_q = p^\mu \partial_\mu q - H$ with coordinates $q = A_\nu$ and momenta $p = -F^{\mu\nu}/4\pi$,

$$ L_e + L_q = - \frac{1}{4\pi} F^{\mu\nu} \mathring{D}_\mu A_\nu - H , \tag{16.35} $$

and super-Hamiltonian $H$

$$ H = - \frac{1}{16\pi} F^{\mu\nu} F_{\mu\nu} - A_\nu j^\nu . \tag{16.36} $$

The Hamiltonian (16.36) looks like the Lagrangian but with a flip of the sign of the interaction term $A_\nu j^\nu$.
The electromagnetic Hamiltonian (16.36) is expressed as required in terms of the coordinates $A_\nu$ and the
momenta $F^{\mu\nu}$.

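Varying the action with Lagrangian (16.35) with respect to the coordinates $A_\nu$ and momenta $F^{\mu\nu}$ gives

$$ \delta S = \frac{1}{4\pi} \int \left( - F^{\mu\nu} \mathring{D}_\mu \delta A_\nu - \delta F^{\mu\nu} \mathring{D}_\mu A_\nu + \tfrac{1}{2} \delta F^{\mu\nu} F_{\mu\nu} + 4\pi j^\nu \delta A_\nu \right) d^4x . \tag{16.37} $$

Integrating the first term in the integrand of equation (16.37) by parts yields

$$ \delta S = - \frac{1}{4\pi} \oint F^{\mu\nu} \, \delta A_\nu \, d^3x_\mu + \frac{1}{4\pi} \int \left[ \left( \mathring{D}_\mu F^{\mu\nu} + 4\pi j^\nu \right) \delta A_\nu - \tfrac{1}{2} \left( \mathring{D}_\mu A_\nu - \mathring{D}_\nu A_\mu - F_{\mu\nu} \right) \delta F^{\mu\nu} \right] d^4x . \tag{16.38} $$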


The surface term vanishes provided that the electromagnetic coordinates $A_\nu$ are held fixed on the boundary.
Requiring that the variation of the action vanish with respect to arbitrary variations $\delta A_\nu$ and $\delta F^{\mu\nu}$ of the
coordinates and momenta then yields Hamilton’s equations,

$$ \mathring{D}_\mu F^{\nu\mu} = 4\pi j^\nu , \tag{16.39a} $$

$$ \mathring{D}_\mu A_\nu - \mathring{D}_\nu A_\mu = F_{\mu\nu} . \tag{16.39b} $$

The first Hamilton equation (16.39a) reproduces the Euler-Lagrange equation (16.34) obtained in the
Lagrangian approach. The second Hamilton equation (16.39b) implies, as an equation of motion, the
relation (16.23) between the field $F_{\mu\nu}$ and the derivatives of $A_\nu$ that was simply assumed in the Lagrangian
approach.


16.5.6 Electric charge conservation 


Maxwell’s source-full equations (16.34) enforce covariant conservation of electric charge $j^\nu$,

$$ \mathring{D}_\nu j^\nu = 0 . \tag{16.40} $$


At a more profound level, the conservation of electric charge is a consequence of symmetry with respect to
electromagnetic gauge transformations. Under an electromagnetic gauge transformation, the field $A_\nu$ varies
as, equation (16.21),

$$ \delta A_\nu = \mathring{D}_\nu\theta . \tag{16.41} $$

There are many distinct electrically charged fields in nature (for example, electrons and protons), and the
action for each distinct charged field is electromagnetic gauge-invariant (absent interactions that create or
destroy charged fields). The variation of the action $S_q$ of a charged matter field under an electromagnetic gauge
transformation (16.19) is

$$ \delta S_q = \int j^\nu \mathring{D}_\nu\theta \, d^4x . \tag{16.42} $$

Integrating equation (16.42) by parts gives

$$ \delta S_q = \oint j^\nu \theta \, d^3x_\nu - \int (\mathring{D}_\nu j^\nu) \, \theta \, d^4x . \tag{16.43} $$

Electromagnetic gauge-invariance requires that the variation vanish with respect to arbitrary choices of the
gauge parameter $\theta$, subject to the condition that $\theta$ is fixed on the boundary. Covariant conservation of electric
charge follows,

$$ \mathring{D}_\nu j^\nu = 0 . \tag{16.44} $$


The charge conservation law (16.44) is an example of Noether’s theorem (Noether, 1918), which relates
symmetries and conservation laws.


16.5.7 Electromagnetic wave equation


Eliminating $F_{\mu\nu}$ from Hamilton’s equations (16.39) yields a second order differential equation for the
electromagnetic potential $A_\mu$,

$$ - \mathring{\Box} A_\mu + \mathring{R}_{\mu\lambda} A^\lambda + \mathring{D}_\mu \mathring{D}_\nu A^\nu = 4\pi j_\mu , \tag{16.45} $$

where $\mathring{\Box} \equiv \mathring{D}_\nu \mathring{D}^\nu$ is the torsion-free d’Alembertian. The last term $\mathring{D}_\mu \mathring{D}_\nu A^\nu$ on the left hand side of
equation (16.45) may be eliminated by imposing the Lorenz (not Lorentz!) gauge condition $\mathring{D}_\nu A^\nu = 0$.
Equation (16.45) is a wave equation with the torsion-free Ricci tensor $\mathring{R}_{\mu\lambda}$ acting as an effective potential, and
the electromagnetic current $j_\mu$ acting as a source.


16.5.8 Space+time (3+1) split of the electromagnetic equations 


In Chapter 4 it was found that, applied to point particles, the action principle yielded equal numbers of co- 
ordinates and momenta, and Hamilton’s equations supplied first order differential equations determining the 
evolution of each and every one of the coordinates and momenta. This was true in both the super-Hamiltonian 
and conventional Hamiltonian approaches, where Hamilton’s equations were respectively equations (4.72) 
and (4.75). 

Applied to fields, the super-Hamiltonian approach does not yield equal numbers of coordinates and
momenta, and Hamilton’s equations cannot be interpreted straightforwardly as equations of motion for each
and every one of the coordinates and momenta. For example, in the electromagnetic case, the first set of
Hamilton’s equations (16.39a) apparently constitute 4 equations for 6 momenta $F^{\mu\nu}$, while the second set
of Hamilton’s equations (16.39b) apparently constitute 6 equations for 4 coordinates $A_\nu$. The mismatch
of numbers of equations is not a practical barrier to solving Hamilton’s equations of motion. Hamilton’s
equations (16.39) comprise 10 equations for 10 unknowns. If, for example, the 6 equations (16.39b) are
interpreted not as first order differential equations of motion for the coordinates $A_\nu$, but rather as defining the
6 momenta $F^{\mu\nu}$, then eliminating the momenta yields a set of 4 second order differential wave equations for
the 4 coordinates $A_\nu$, equation (16.45) (see §27.6 for further exposition). Treating the 6 equations (16.39b)
as identities is the same as reverting to the Lagrangian, or second order, approach.

It is nevertheless desirable to attain a better understanding of the first order Hamiltonian formalism 
for fields, partly so as to understand how to integrate the field equations numerically, and partly because 
quantization of fields, as usually implemented, requires identifying the physical degrees of freedom in a 
matching number of fields and their conjugate momenta. 

The problem of mismatching numbers of coordinates and momenta in the super-Hamiltonian formalism 
arises because symmetry under general coordinate transformations means that different configurations of 
fields are symmetrically equivalent. The covariant super-Hamiltonian description contains more fields than 
there are physical degrees of freedom. 

Dirac’s (1964) solution to the mismatch of numbers of equations is to break general covariance by splitting 
spacetime into space and time coordinates, and to interpret only the equations involving time derivatives 
of the fields as genuine equations of motion, while the remainder of the equations, those not involving 
time derivatives, are “constraints,” relations between the fields that serve to remove the redundant degrees
of freedom. In the relativist community, the term “constraint” is commonly used to describe an equation
which must be arranged to be satisfied in the initial conditions, but which is guaranteed thereafter by some 
conservation law. Some of Dirac’s constraint equations, which Dirac calls “first-class constraints,” are of 
this character, but others, which Dirac calls “second-class constraints,” are identities that effectively define 
some fields in terms of others. This book follows the relativists’ convention that a constraint is an equation 
whose ongoing satisfaction is guaranteed by a conservation law, a first-class constraint. Dirac’s second-class 
constraint equations will be called identities. 

Suppose then that the coordinates are split into time and space components, $x^\mu = \{t, x^\alpha\}$. In
electromagnetism, the Hamilton’s equations (16.39) involving time derivatives of the coordinates and momenta are

$$ \text{3 equations of motion:} \quad \mathring{D}_t F^{\alpha t} + \mathring{D}_\beta F^{\alpha\beta} = 4\pi j^\alpha , \tag{16.46a} $$
$$ \text{3 equations of motion:} \quad \mathring{D}_t A_\alpha - \mathring{D}_\alpha A_t = F_{t\alpha} . \tag{16.46b} $$

Equation (16.46a) comprises 3 equations of motion for the 3 momenta $F^{\alpha t}$, while equation (16.46b) comprises
3 equations of motion for the 3 coordinates $A_\alpha$. The physical degrees of freedom are thus identified as the
3 spatial coordinates $A_\alpha$ and their 3 conjugate momenta $F^{\alpha t}$, which comprise, up to a sign, the 3 components
$E^\alpha = F^{t\alpha}$ of the electric field. The remaining electromagnetic Hamilton’s equations (16.39), those not
involving time derivatives of the coordinates and momenta, are

$$ \text{1 constraint:} \quad \mathring{D}_\alpha F^{t\alpha} = 4\pi j^t , \tag{16.47a} $$
$$ \text{3 identities:} \quad \mathring{D}_\alpha A_\beta - \mathring{D}_\beta A_\alpha = F_{\alpha\beta} . \tag{16.47b} $$


The first equation (16.47a) has the property that, as long as the equation is satisfied on the initial spatial 
hypersurface, then conservation of electric charge ensures that the equation continues to be satisfied there- 
after. Of course, in numerical computations charge is conserved only so long as the equations of motion of 
charged matter are chosen such as to conserve electric charge, as they should be. If the matter equations 
conserve charge, then the constraint equation (16.47a) is redundant, but provides a numerical check that 
electric charge is being conserved. 
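
The role of the constraint as a numerical check can be illustrated with a toy computation. The following is a minimal sketch, assuming flat spacetime with a single spatial dimension, Gaussian units, and a periodic staggered grid, so that the equation of motion reduces to $\partial E/\partial t = -4\pi j$ and the constraint to $\partial E/\partial x = 4\pi\rho$, with $E = F^{tx}$. The grid size, time step, current profile and all names are hypothetical choices, not anything prescribed by the text.

```python
import math

N = 64                 # cells on a periodic 1D grid (illustrative size)
DX = 1.0 / N
DT = 0.01
FOUR_PI = 4.0 * math.pi

def current(step, i):
    """Hypothetical current j at cell face i+1/2 at time step `step`."""
    x = (i + 0.5) * DX
    return math.sin(2.0 * math.pi * x) * math.cos(0.1 * step)

def gauss_residual(E, rho):
    """Constraint residual dE/dx - 4 pi rho in each cell (periodic)."""
    return [(E[i] - E[i - 1]) / DX - FOUR_PI * rho[i] for i in range(N)]

def evolve(conserve_charge, steps=200):
    """Evolve E (and rho) and return the largest constraint residual."""
    # Initial data: zero net charge, with E integrated from Gauss's law
    rho = [math.cos(2.0 * math.pi * i * DX) for i in range(N)]
    E, running = [], 0.0
    for i in range(N):
        running += FOUR_PI * rho[i] * DX
        E.append(running)          # E[i] sits on face i+1/2
    for step in range(steps):
        j = [current(step, i) for i in range(N)]
        # Equation of motion for the momentum: dE/dt = -4 pi j
        E = [E[i] - FOUR_PI * j[i] * DT for i in range(N)]
        if conserve_charge:
            # Charge conservation: d(rho)/dt = -dj/dx
            rho = [rho[i] - DT * (j[i] - j[i - 1]) / DX for i in range(N)]
    return max(abs(r) for r in gauss_residual(E, rho))

residual_conserving = evolve(conserve_charge=True)
residual_violating = evolve(conserve_charge=False)
print(residual_conserving, residual_violating)
```

When the charge density is updated consistently with the current, the constraint residual stays at the level of floating-point roundoff. When the charge update is switched off, so that the matter equations fail to conserve charge, the residual grows to order unity, which is the kind of diagnostic the constraint equation provides.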

The second set of equations (16.47b) are identities relating the 3 purely spatial components $F_{\alpha\beta}$, which
comprise the 3 components $B^\alpha = \varepsilon^{t\alpha\beta\gamma} F_{\beta\gamma}$ of the magnetic field, to the spatial curl of the spatial coordinates
$A_\alpha$. Since the equations of motion (16.46b) already determine completely the spatial coordinates $A_\alpha$, the
identities (16.47b) cannot be independent equations, but must be interpreted as defining the magnetic field as
an auxiliary field that does not represent additional physical degrees of freedom. The magnetic field is needed
as part of the equations of motion, the second term on the left hand side of the equation of motion (16.46a).
The magnetic field could be discarded after having been replaced by the curl of $A_\alpha$ in accordance with
the identity (16.47b); but the magnetic field is part of the covariant 4-dimensional electromagnetic field
tensor $F_{\mu\nu}$, and discarding the magnetic field would obscure the covariant structure of the electromagnetic
equations.


16.6 Electromagnetic action in forms notation


Especially in the mathematical literature, actions are often written in the compact notation of differential
forms, §15.6. The advantage of forms notation is not that it makes calculations any easier, but rather that
it reveals the structure of the action unburdened by indices. Once one gets over the language barrier, forms
notation can be a powerful clarifier.

In this section 16.6, implicit sums are over distinct antisymmetric sequences of indices, since this removes
the ubiquitous factorial factors that would otherwise appear.


16.6.1 Electromagnetic potential and field forms


The electromagnetic potential 1-form $A$ and field 2-form $F$ are defined by

$$ A \equiv A_\nu \, dx^\nu , \tag{16.48a} $$
$$ F \equiv F_{\mu\nu} \, d^2x^{\mu\nu} , \tag{16.48b} $$

where in the case of $F$ the implicit summation is over distinct antisymmetric pairs $\mu\nu$ of indices. With the
electromagnetic gauge-covariant derivative 1-form denoted $\mathcal{D} \equiv (\mathring{D}_\mu + ieA_\mu) \, dx^\mu$ for brevity, the field 2-form
$F$ is defined by the commutator of the gauge-covariant derivative,

$$ [\mathcal{D}, \mathcal{D}] \equiv ieF . \tag{16.49} $$

Equation (16.49) implies that the field 2-form $F$ is the exterior derivative of the potential 1-form $A$,

$$ F = dA = \left( \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} \right) d^2x^{\mu\nu} , \tag{16.50} $$

implicitly summed over distinct antisymmetric pairs $\mu\nu$ of indices.


16.6.2 Electromagnetic potential and field multivectors 


When working with forms, it is often easier to do calculations in multivector language. In multivector
language, the electromagnetic potential is a vector $\mathbf{A}$, while the electromagnetic field is a bivector $\mathbf{F}$,

$$ \mathbf{A} \equiv A_n \boldsymbol{\gamma}^n , \tag{16.51a} $$
$$ \mathbf{F} \equiv F_{mn} \, \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n , \tag{16.51b} $$

with in the case of $\mathbf{F}$ implicit summation over distinct antisymmetric pairs $mn$ of indices. The field $\mathbf{F}$,
equation (16.50), is in multivector language the torsion-free covariant curl of the potential $\mathbf{A}$,

$$ \mathbf{F} = \mathring{\mathbf{D}} \wedge \mathbf{A} . \tag{16.52} $$

In multivector language, the combined electromagnetic (16.28) and charged interaction (16.31) Lagrangian
is the scalar

$$ L_e + L_q = \frac{1}{8\pi} \mathbf{F} \cdot \mathbf{F} + \mathbf{A} \cdot \mathbf{j} , \tag{16.53} $$

where $\mathbf{j} \equiv j_n \boldsymbol{\gamma}^n$ is the electric current vector. The action is

$$ S = \int (L_e + L_q) \, d^4x . \tag{16.54} $$


Recall that the scalar volume element $d^4x$ that goes into the action (16.54) is really the dual scalar
4-volume ${}^*d^4x$, equation (15.80). To convert to forms language, the Hodge dual must be transferred from the
volume element to the integrand. In multivector language, the required result is

$$ (\mathbf{a} \cdot \mathbf{b}) \, {}^*d^4x = (\mathbf{a} \cdot \mathbf{b}) \cdot (I \, d^4x) = ((\mathbf{a} \cdot \mathbf{b}) I) \cdot d^4x = (I (\mathbf{a} \cdot \mathbf{b})) \cdot d^4x = ((I \mathbf{a}) \wedge \mathbf{b}) \cdot d^4x , \tag{16.55} $$

where the second expression is the definition (15.80) of the dual volume element, the third expression is an
application of the multivector triple-product relation (13.39), the fourth holds because $\mathbf{a} \cdot \mathbf{b}$ is a scalar and
therefore commutes with the pseudoscalar $I$, and the last expression is another application of the
triple-product relation (13.39). The action (16.54) is thus, in multivector language,

$$ S = \int \left( \frac{1}{8\pi} (I \mathbf{F}) \wedge \mathbf{F} + (I \mathbf{j}) \wedge \mathbf{A} \right) \cdot d^4x . \tag{16.56} $$


16.6.3 Electromagnetic Lagrangian 4-form 


In forms notation, the action (16.56) is

$$ S = \int L_e + L_q , \tag{16.57} $$

with Lagrangian 4-form

$$ L_e + L_q = \frac{1}{8\pi} \, {}^*F \wedge F + {}^*j \wedge A . \tag{16.58} $$

Here $A$ and $F$ are the potential 1-form and field 2-form defined by equations (16.48). The symbol ${}^*$ denotes
the form dual, equation (15.79). The dual ${}^*F$ is a 2-form, while the dual ${}^*j$ is the 3-form dual of the 1-form
electric current $j = j_\nu \, dx^\nu$.


16.6.4 Electromagnetic super-Hamiltonian 4-form 


The Lagrangian 4-form (16.58) is in super-Hamiltonian form $p \wedge dq - H$ with coordinates $q = A$ and momenta
$p = {}^*F/4\pi$,

$$ L_e + L_q = \frac{1}{4\pi} \, {}^*F \wedge dA - H , \tag{16.59} $$

and super-Hamiltonian 4-form

$$ H = \frac{1}{8\pi} \, {}^*F \wedge F - {}^*j \wedge A . \tag{16.60} $$


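The variation of the action with Lagrangian (16.59) with respect to the coordinates $A$ and momenta ${}^*F$ is

$$ \delta S = \frac{1}{4\pi} \int {}^*F \wedge d\delta A + \delta {}^*F \wedge dA - \delta {}^*F \wedge F + 4\pi \, {}^*j \wedge \delta A . \tag{16.61} $$

Integrating the ${}^*F \wedge d\delta A$ term in equation (16.61) by parts brings the variation of the action to

$$ \delta S = \frac{1}{4\pi} \oint {}^*F \wedge \delta A + \frac{1}{4\pi} \int - (d{}^*F - 4\pi \, {}^*j) \wedge \delta A + \delta {}^*F \wedge (dA - F) . \tag{16.62} $$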
T ri 


Requiring that the variation of the action vanish with respect to arbitrary variations $\delta A$ and $\delta {}^*F$ of the
electromagnetic coordinates and momenta, subject to the condition that $A$ is fixed on the boundary, yields
Hamilton’s equations,

$$ d{}^*F = 4\pi \, {}^*j , \tag{16.63a} $$

$$ dA = F . \tag{16.63b} $$

The first Hamilton equation (16.63a) is a 3-form with 4 components comprising Maxwell’s source-full
equations. The second Hamilton equation (16.63b) is a 2-form with 6 components that enforce the relation (16.50)
between the electromagnetic field $F$ and the electromagnetic potential $A$ that is assumed in the Lagrangian
formalism.

Taking the exterior derivative of the first Hamilton equation (16.63a) yields, since $d^2 = 0$, the electric
current conservation law

$$ d{}^*j = 0 . \tag{16.64} $$

Taking the exterior derivative of the second Hamilton equation (16.63b) yields

$$ dF = 0 , \tag{16.65} $$

which comprises Maxwell’s source-free equations.


16.6.5 Electromagnetic wave equation in forms notation 


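As is common, it is easier to manipulate form equations by translating them into multivector language. In
multivector language, the electromagnetic Hamilton’s equations (16.63) are

$$ \mathring{\mathbf{D}} \cdot \mathbf{F} = - 4\pi \mathbf{j} , \tag{16.66a} $$
$$ \mathring{\mathbf{D}} \wedge \mathbf{A} = \mathbf{F} . \tag{16.66b} $$

Applying the multivector triple-product relation (13.40) gives the multivector identities (the torsion-free curl
of the curl of $\mathbf{A}$ vanishes, equation (15.43), so $\mathring{\mathbf{D}} \mathring{\mathbf{D}} \mathbf{A}$ has only a vector part, no trivector part)

$$ \begin{aligned}
\mathring{\mathbf{D}} \mathring{\mathbf{D}} \mathbf{A} &= \mathring{\mathbf{D}} (\mathring{\mathbf{D}} \mathbf{A}) = \mathring{\mathbf{D}} \cdot (\mathring{\mathbf{D}} \wedge \mathbf{A}) + \mathring{\mathbf{D}} (\mathring{\mathbf{D}} \cdot \mathbf{A}) \\
&= (\mathring{\mathbf{D}} \mathring{\mathbf{D}}) \mathbf{A} = (\mathring{\mathbf{D}} \cdot \mathring{\mathbf{D}}) \mathbf{A} + (\mathring{\mathbf{D}} \wedge \mathring{\mathbf{D}}) \cdot \mathbf{A} .
\end{aligned} \tag{16.67} $$

Eliminating $\mathbf{F}$ from Hamilton’s equations (16.66) then yields a second order differential equation for the
electromagnetic potential $\mathbf{A}$,

$$ - \mathring{\Box} \mathbf{A} - (\mathring{\mathbf{D}} \wedge \mathring{\mathbf{D}}) \cdot \mathbf{A} + \mathring{\mathbf{D}} (\mathring{\mathbf{D}} \cdot \mathbf{A}) = 4\pi \mathbf{j} , \tag{16.68} $$

where $\mathring{\Box} \equiv \mathring{\mathbf{D}} \cdot \mathring{\mathbf{D}}$ is the torsion-free d’Alembertian operator. Equation (16.68) is equation (16.45) expressed
in multivector language. The last term on the left hand side of equation (16.68) can be made to vanish by
imposing the Lorenz gauge condition $\mathring{\mathbf{D}} \cdot \mathbf{A} = 0$, in which case equation (16.68) reduces to

$$ - \mathring{\Box} \mathbf{A} - (\mathring{\mathbf{D}} \wedge \mathring{\mathbf{D}}) \cdot \mathbf{A} = 4\pi \mathbf{j} , \tag{16.69} $$

or more simply

$$ - (\mathring{\mathbf{D}} \mathring{\mathbf{D}}) \mathbf{A} = 4\pi \mathbf{j} . \tag{16.70} $$

Equation (16.70) is a wave equation for the electromagnetic potential $\mathbf{A}$, with source the electric current $\mathbf{j}$.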


16.6.6 Space+time (3+1) split of the electromagnetic equations in forms notation 


As discussed in §16.5.8, the super-Hamiltonian approach yields different numbers of coordinates and mo- 
menta, and the resulting Hamilton’s equations are unbalanced. Hamilton’s equations (16.63) have the ap- 
pearance of first order differential equations of motion for the momenta and coordinates, but the first equa- 
tion (16.63a) is 4 equations for the 6 components of the momenta *F, while the second equation (16.63b) is 
6 equations for the 4 components of the coordinates A. 

The solution to the problem is, as in §16.5.8, to break general covariance by splitting spacetime into
time and space coordinates, $x^\mu = \{t, x^\alpha\}$, and to interpret only those Hamilton’s equations involving time $t$
derivatives as genuine equations of motion, while the remaining equations are either constraint equations or
identities.

In splitting a form $a$ into time and space components, it is convenient to adopt a notation in which the
form $a_{\bar t}$ (subscripted $\bar t$) represents all the temporal parts of the form, while the form $a_{\bar\alpha}$ (subscripted $\bar\alpha$)
represents the remaining all-spatial components. The bars on the time and spatial indices $\bar t$ and $\bar\alpha$ serve
to distinguish the forms $a_{\bar t} = a_{tA} \, d^px^{tA}$ and $a_{\bar\alpha} = a_{\alpha A} \, d^px^{\alpha A}$ from their components $a_{tA}$ and $a_{\alpha A}$. Thus a
1-form $a = a_\nu \, dx^\nu$ splits into

$$ a = a_{\bar t} + a_{\bar\alpha} = a_t \, dt + a_\alpha \, dx^\alpha , \tag{16.71} $$

while a 2-form $a = a_{\mu\nu} \, d^2x^{\mu\nu}$ splits into

$$ a = a_{\bar t} + a_{\bar\alpha} = a_{t\alpha} \, d^2x^{t\alpha} + a_{\alpha\beta} \, d^2x^{\alpha\beta} , \tag{16.72} $$

implicitly summed over distinct sequences of indices. The time component of the exterior product of two
forms $a$ and $b$ is

$$ (a \wedge b)_{\bar t} = a_{\bar t} \wedge b_{\bar\alpha} + a_{\bar\alpha} \wedge b_{\bar t} \tag{16.73} $$

with no minus signs, the minus signs from the antisymmetry of indices cancelling the minus signs from
commuting $dt$ through a spatial form.
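The electromagnetic field 2-form $F$ splits as

$$ F = F_{\bar t} + F_{\bar\alpha} = F_{t\alpha} \, d^2x^{t\alpha} + F_{\beta\gamma} \, d^2x^{\beta\gamma} , \tag{16.74} $$

whose time and space parts encode the electric and magnetic fields. The dual electromagnetic field 2-form
${}^*F$ splits as

$$ {}^*F = {}^*F_{\bar t} + {}^*F_{\bar\alpha} = \varepsilon_{t\alpha\beta\gamma} F^{\beta\gamma} \, d^2x^{t\alpha} + \varepsilon_{t\alpha\beta\gamma} F^{t\alpha} \, d^2x^{\beta\gamma} , \tag{16.75} $$

whose time and space parts conversely encode the magnetic and electric fields. With the definitions $E^\alpha \equiv F^{t\alpha}$
and $B^\alpha \equiv \varepsilon^{t\alpha\beta\gamma} F_{\beta\gamma}$ of electric and magnetic field components, the form expression (16.75) agrees with the
equivalent multivector expression (14.63).

The time components of Hamilton’s equations (16.63) comprise 3 equations of motion for the 3 spatial
components ${}^*F_{\bar\alpha}$ of the momenta, which is the electric field, and 3 equations of motion for the 3 spatial
components $A_{\bar\alpha}$ of the coordinates,

$$ \text{3 equations of motion:} \quad (d{}^*F)_{\bar t} = d_{\bar t} \, {}^*F_{\bar\alpha} + d_{\bar\alpha} \, {}^*F_{\bar t} = 4\pi \, {}^*j_{\bar t} , \tag{16.76a} $$
$$ \text{3 equations of motion:} \quad (dA)_{\bar t} = d_{\bar t} A_{\bar\alpha} + d_{\bar\alpha} A_{\bar t} = F_{\bar t} . \tag{16.76b} $$

The exterior time and space derivatives here are the 1-forms $d_{\bar t} \equiv dt \, \partial/\partial t$ and $d_{\bar\alpha} \equiv dx^\alpha \, \partial/\partial x^\alpha$.
Equations (16.76) are the same as equations (16.46), but in forms notation in place of index notation. In translating
the forms equations (16.76) into indexed equations (16.46), note minus signs that come from commuting $dt$
through a spatial form, for example $d_{\bar\alpha} A_{\bar t} = dx^\alpha \, \partial/\partial x^\alpha \wedge dt \, A_t = - (\partial A_t/\partial x^\alpha) \, d^2x^{t\alpha}$. The remaining Hamilton’s
equations (16.63), those not involving any time derivatives, are

$$ \text{1 constraint:} \quad d_{\bar\alpha} \, {}^*F_{\bar\alpha} = 4\pi \, {}^*j_{\bar\alpha} , \tag{16.77a} $$
$$ \text{3 identities:} \quad d_{\bar\alpha} A_{\bar\alpha} = F_{\bar\alpha} . \tag{16.77b} $$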


In accordance with the relativists’ convention, an equation is a constraint if it must be arranged to be
satisfied on the initial hypersurface $t_i$ of constant time, but is guaranteed thereafter by some conservation
law. Equation (16.77a) is an example of such a constraint equation, in this case guaranteed by conservation
of electric charge. The 4-dimensional equation representing conservation of charge,

$$ d(d{}^*F - 4\pi \, {}^*j) = - 4\pi \, d{}^*j = 0 , \tag{16.78} $$

becomes in a 3+1 split

$$ d_{\bar t} (d{}^*F - 4\pi \, {}^*j)_{\bar\alpha} + d_{\bar\alpha} (d{}^*F - 4\pi \, {}^*j)_{\bar t} = 0 . \tag{16.79} $$

The second term on the left hand side of equation (16.79) vanishes on the equation of motion (16.76a), so
equation (16.79) reduces to

$$ d_{\bar t} (d{}^*F - 4\pi \, {}^*j)_{\bar\alpha} = 0 . \tag{16.80} $$

If the spatial components $(d{}^*F - 4\pi \, {}^*j)_{\bar\alpha}$ are arranged to vanish on the initial spatial hypersurface of constant
time, then the equation of motion (16.80) guarantees that those spatial components vanish thereafter, provided,
of course, that the equations governing the charged matter are arranged to satisfy charge conservation,
as they should.

Equation (16.77b) on the other hand, which expresses the magnetic field $F_{\bar\alpha}$ as the spatial curl of the
spatial potential $A_{\bar\alpha}$, is a constraint in Dirac’s (1964) sense, but not in the relativists’ sense, since it is
not guaranteed by any conservation law. As in §16.5.8, this book follows the relativists’ convention that
a constraint is an equation whose ongoing satisfaction is guaranteed by a conservation law, a first-class
constraint. Dirac’s second-class constraint equations are called identities.


16.6.7 3+1 split of the variation of the electromagnetic action 


The equations of motion (16.76) and constraint and identities (16.77) follow directly from splitting Hamilton’s 
equations (16.46) into time and space parts; but they can also be derived more fundamentally from splitting 
the variation (16.62) of the action into time and space parts, 


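$\delta S = -\frac{1}{4\pi} \left[ \oint {}^*\!F \wedge \delta A \right]_{t_i}^{t_f}$ (16.81)
$\quad - \frac{1}{4\pi} \int_{t_i}^{t_f} \left[ -(d\,{}^*\!F - 4\pi\,{}^*\!j)_{\bar t} \wedge \delta A_{\bar\alpha} - (d\,{}^*\!F - 4\pi\,{}^*\!j)_{\bar\alpha} \wedge \delta A_{\bar t} + \delta\,{}^*\!F_{\bar\alpha} \wedge (dA - F)_{\bar t} + \delta\,{}^*\!F_{\bar t} \wedge (dA - F)_{\bar\alpha} \right]$ .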


From this variation it can be seen that the equations of motion (16.76) arise from minimizing the action
with respect to the 3 spatial coordinates $A_{\bar\alpha}$ and 3 spatial momenta ${}^*\!F_{\bar\alpha}$. The 1 constraint (16.77a)
arises from minimizing the action with respect to the 1 time component $A_{\bar t}$ of the coordinates, and the 3
identities (16.77b) from minimizing with respect to the 3 time components ${}^*\!F_{\bar t}$ of the momenta. Now $A_{\bar t}$ is a
gauge variable: it can be adjusted arbitrarily by an electromagnetic gauge transformation,


$A_{\bar t} \rightarrow A_{\bar t} + d_{\bar t}\theta$ . (16.82)


Minimizing the action with respect to the gauge variable $A_{\bar t}$ yields the constraint equation (16.77a) that
effectively expresses current conservation.

The mere fact that $A_{\bar t}$ can be treated as a gauge variable does not mean that it must be treated as a
gauge variable. Other gauge-fixing choices can be made; see §27.6 for further discussion of this issue.

The time components ${}^*\!F_{\bar t}$ of the momenta constitute the magnetic field. The dual of ${}^*\!F_{\bar t}$ constitutes the
spatial components $F_{\bar\alpha}$ of $F$. The magnetic field ${}^*\!F_{\bar t}$, or equivalently its dual $F_{\bar\alpha}$, is not a gauge
field (that is, it cannot be adjusted by a gauge transformation), but rather an auxiliary field that arises when the
electromagnetic field is treated as a generally covariant 4-dimensional object. Minimizing the action (16.81)
with respect to the magnetic field ${}^*\!F_{\bar t}$ determines its own components, the identities (16.77b).


16.6.8 Conventional electromagnetic Hamiltonian 


The conventional Hamiltonian H is defined by 


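$H = -\frac{1}{4\pi}\,{}^*\!F_{\bar\alpha} \wedge d_{\bar t} A_{\bar\alpha} - L$ . (16.83)

The combined electromagnetic and charged interaction Lagrangian (16.59) can be written

$L = -\frac{1}{4\pi} \Big[ d_{\bar\alpha} ( {}^*\!F_{\bar\alpha} \wedge A_{\bar t} ) + {}^*\!F_{\bar\alpha} \wedge d_{\bar t} A_{\bar\alpha} - (d\,{}^*\!F - 4\pi\,{}^*\!j)_{\bar\alpha} \wedge A_{\bar t} + {}^*\!F_{\bar t} \wedge (dA - F)_{\bar\alpha} - \tfrac12\,{}^*\!F_{\bar\alpha} \wedge F_{\bar t} + \tfrac12\,{}^*\!F_{\bar t} \wedge F_{\bar\alpha} + 4\pi\,{}^*\!j_{\bar t} \wedge A_{\bar\alpha} \Big]$ . (16.84)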


Dropping the total derivative term $d_{\bar\alpha} ( {}^*\!F_{\bar\alpha} \wedge A_{\bar t} )$ from the Lagrangian (16.84), and
inserting the rest into the defining equation (16.83) yields the conventional Hamiltonian

$H = -\frac{1}{4\pi} \Big[ (d\,{}^*\!F - 4\pi\,{}^*\!j)_{\bar\alpha} \wedge A_{\bar t} - {}^*\!F_{\bar t} \wedge (dA - F)_{\bar\alpha} + \tfrac12\,{}^*\!F_{\bar\alpha} \wedge F_{\bar t} - \tfrac12\,{}^*\!F_{\bar t} \wedge F_{\bar\alpha} - 4\pi\,{}^*\!j_{\bar t} \wedge A_{\bar\alpha} \Big]$ . (16.85)

The first term in the Hamiltonian (16.85) is the constraint (16.77a) wedged with the gauge variable $A_{\bar t}$,
while the second term is the identity (16.77b) wedged with the auxiliary field ${}^*\!F_{\bar t}$, the magnetic field. Both
terms vanish on the equations of motion. The third and fourth terms
$-({}^*\!F_{\bar\alpha} \wedge F_{\bar t} - {}^*\!F_{\bar t} \wedge F_{\bar\alpha})/(8\pi)$ go over to $(E^2 + B^2)/(8\pi)\,d^4x$ in flat space,
and comprise the energy density of the electromagnetic field. The final term $-\boldsymbol{j}\cdot\boldsymbol{A}\,d^4x$ is an
interaction term.

The conventional Hamiltonian (16.85) is a function of spatial coordinates $A_{\bar\alpha}$ and their conjugate spatial
momenta ${}^*\!F_{\bar\alpha}$, and also a function of the time components $A_{\bar t}$ and ${}^*\!F_{\bar t}$ of the coordinates and momenta.
The spatial derivatives $d_{\bar\alpha} A_{\bar\alpha}$ and $d_{\bar\alpha}\,{}^*\!F_{\bar\alpha}$ in the conventional Hamiltonian are to be
interpreted as functions of the coordinates and momenta, not as separate degrees of freedom. One should think of
$A_{\bar\alpha}(x^\alpha)$ and ${}^*\!F_{\bar\alpha}(x^\alpha)$ as being infinite collections of fields indexed by the spatial position
$x^\alpha$; the spatial derivatives of the fields are then effectively linear combinations of those fields.

Varying the conventional Hamiltonian (16.85) with respect to $A_{\bar\alpha}$, $A_{\bar t}$, ${}^*\!F_{\bar\alpha}$, and ${}^*\!F_{\bar t}$
recovers Hamilton’s equations (16.76) and (16.77). In executing the variation, the terms involving the varied
derivatives $\delta(d_{\bar\alpha} A_{\bar\alpha}) = d_{\bar\alpha}\,\delta A_{\bar\alpha}$ and
$\delta(d_{\bar\alpha}\,{}^*\!F_{\bar\alpha}) = d_{\bar\alpha}\,\delta\,{}^*\!F_{\bar\alpha}$ can be integrated by parts.
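The remark above, that spatial derivatives of a field are effectively linear combinations of the field values at different positions, can be made concrete on a grid. The following is only an illustrative sketch, not part of the formalism: it assumes a one-dimensional periodic grid and a central-difference stencil, and the names `periodic_derivative_matrix`, `apply_matrix` and `field_A` are hypothetical.

```python
import math

def periodic_derivative_matrix(n, h):
    """Central-difference matrix D on a periodic grid of n points with spacing h."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        D[i][(i + 1) % n] = 1.0 / (2.0 * h)   # weight of the right-hand neighbour
        D[i][(i - 1) % n] = -1.0 / (2.0 * h)  # weight of the left-hand neighbour
    return D

def apply_matrix(D, field):
    """Each output value is a fixed linear combination of the sampled field values."""
    return [sum(w * f for w, f in zip(row, field)) for row in D]

# Sample a stand-in field A(x) = sin(x) at n positions: a finite collection of
# "fields indexed by position" in place of the infinite collection in the text.
n = 64
h = 2.0 * math.pi / n
x = [i * h for i in range(n)]
field_A = [math.sin(xi) for xi in x]

dA = apply_matrix(periodic_derivative_matrix(n, h), field_A)
max_error = max(abs(d - math.cos(xi)) for d, xi in zip(dA, x))
print(f"max |D A - cos x| = {max_error:.2e}")
```

Because the derivative is just a matrix acting on the list of field values, it introduces no degrees of freedom beyond the field values themselves, which is the point made in the text.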


16.7 Gravitational action 


As shown by Hilbert (1915) contemporaneously with Einstein’s discovery of the final, successful version of 
general relativity, Einstein’s equations can be derived by the principle of least action applied to the action 


$S_g = \int L_g\,d^4x$ , (16.86)
with scalar Hilbert Lagrangian
$L_g = \frac{1}{16\pi G}\,R$ , (16.87)

where $R$ is the Ricci scalar, and $G$ is Newton’s gravitational constant. The motivation for the Hilbert
action (16.86) is that the Ricci scalar $R$ is the only non-vanishing scalar that can be constructed linearly
from the Riemann curvature tensor $R_{klmn}$.

Least action requires the Lagrangian to be written as a function of the “coordinates” and “velocities” 
of the gravitational field. The traditional approach, following Hilbert, is to take the coordinates to be the
10 components $g_{\mu\nu}$ of the metric tensor. The gravitational Lagrangian $L_g$ is then a function not only of
the coordinates $g_{\mu\nu}$ and their velocities $\partial g_{\mu\nu}/\partial x^\kappa$, but also of their second derivatives
$\partial^2 g_{\mu\nu}/\partial x^\kappa \partial x^\lambda$. The presence of the second derivatives (“accelerations”) might seem
problematic, but they can be removed into a surface term by integration by parts, leaving a Lagrangian that contains
only first derivatives.

A modified approach, with a different choice of “coordinates” for the gravitational field, brings out the 
Hamiltonian structure of the Hilbert Lagrangian, and makes transparent the dependence of the Hilbert 
Lagrangian on the two distinct symmetries underlying general relativity, namely general coordinate transfor- 
mations, and local Lorentz transformations. In terms of the Riemann tensor (11.76) (valid with or without 
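torsion) written in a mixed coordinate-tetrad basis, the Hilbert Lagrangian (16.87) is (units $c = G = 1$)

$L_g = \frac{1}{16\pi}\,e^{m\kappa} e^{n\lambda} R_{\kappa\lambda mn} = \frac{1}{16\pi}\,e^{m\kappa} e^{n\lambda} \left( \frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma_{mn\kappa}}{\partial x^\lambda} + \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} - \Gamma^p{}_{m\kappa} \Gamma_{pn\lambda} \right)$ . (16.88)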


As usual in this book, greek (brown) indices are coordinate indices, while latin (black) indices are tetrad
indices in a tetrad with prescribed constant metric $\gamma_{mn}$. If the tetrad is orthonormal, then the tetrad metric
is Minkowski, $\gamma_{mn} = \eta_{mn}$, but any tetrad with constant metric $\gamma_{mn}$, such as Newman-Penrose, will do. The
Lagrangian (16.88) manifests the dependence of the gravitational Lagrangian on coordinate transformations,
encoded in the 16 components of the inverse vierbein $e^{m\kappa}$, and on Lorentz transformations, encoded in the
24 connections $\Gamma_{mn\kappa}$. The connections $\Gamma_{mn\kappa}$ form a coordinate vector (index $\kappa$) of generators of Lorentz
transformations (antisymmetric indices $mn$), and they constitute the connection associated with a local gauge
group of Lorentz transformations. The Lorentz connections $\Gamma_{mn\kappa}$ are sometimes called “spin connections” in
the literature. In a local gauge theory such as electromagnetism or Yang-Mills, the connections $\Gamma_{mn\kappa}$ would
be interpreted as the “coordinates” of the field.

The mixed coordinate-tetrad expression for the Riemann tensor $R_{\kappa\lambda mn}$ on the right hand side of
equation (16.88) is not the same as the coordinate expression (2.112), despite the resemblance of the two
expressions. There are 24 Lorentz connections $\Gamma_{mn\kappa}$, but 40 (without torsion, or 64 with torsion) coordinate
connections $\Gamma_{\kappa\mu\nu}$. It is possible (indeed, this is the traditional Hilbert approach) to work entirely with
coordinate-frame expressions, the coordinate metric and the coordinate connections, without introducing
tetrads. The advantage of the mixed coordinate-tetrad approach is that it makes manifest the fact that the
Hilbert Lagrangian is invariant with respect to two distinct symmetries, coordinate transformations encoded
in the tetrad, and local Lorentz transformations encoded in the Lorentz connections. Extremization of the
Hilbert action with respect to the tetrad yields Einstein’s equations, with source the energy-momentum of
matter. Extremization of the Hilbert action with respect to the Lorentz connections yields expressions for
those connections in terms of the tetrad and its derivatives, with source the spin angular-momentum of
matter.

Whereas a purely coordinate approach to extremizing the Hilbert action is possible, a purely tetrad
approach is not. In general relativity, tetrad axes $\boldsymbol{\gamma}_m(x^\mu)$ are defined at each point $x^\mu$ of spacetime. The
coordinates $x^\mu$ of the spacetime manifold provide the canvas upon which tetrads can be erected, and through
which tetrads can be transported. It is possible to do without tetrads by working with coordinate tangent
axes $\boldsymbol{e}_\mu$ and the associated coordinate connections, but it is not possible to do without coordinates.
If the Lorentz connections $\Gamma_{mn\lambda}$ are taken to be the coordinates of the gravitational field, then the
corresponding canonical momenta are (a factor of $8\pi$ is inserted for convenience; or one could use units
where $8\pi G = 1$ in place of the units $G = 1$ adopted here)

$e^{mn\kappa\lambda} \equiv \frac{8\pi\,\partial L_g}{\partial ( \partial \Gamma_{mn\lambda} / \partial x^\kappa )} = e^{m[\kappa} e^{n\lambda]} = \tfrac12 \left( e^{m\kappa} e^{n\lambda} - e^{m\lambda} e^{n\kappa} \right)$ . (16.89)

The momentum tensor $e^{mn\kappa\lambda}$ is antisymmetric in $mn$ and in $\kappa\lambda$, and as such apparently has $6 \times 6 = 36$
components, but the requirement that it be expressible in terms of the vierbein in accordance with the right
hand side of equation (16.89) means that the momentum tensor has only 16 independent degrees of freedom.
The approach followed below, §16.8, is to treat the 16 components of the vierbein $e^{m\kappa}$ as the independent
degrees of freedom. (A possible approach, not followed here, is to work with the 36-component momentum
tensor $e^{mn\kappa\lambda}$ instead of the 16-component vierbein, subjecting the momentum to the identities (constraints,
in Dirac’s terminology)


Sage = eklmn f (16.90) 


which is a symmetric $6 \times 6$ matrix of conditions, or 21 conditions, except that the normalization of
$\varepsilon_{\kappa\lambda\mu\nu} = -e\,[\kappa\lambda\mu\nu]$, where $e$ is the vierbein determinant, is arbitrary, so equations (16.90)
constitute a set of 20 distinct identities.)
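The component counting in this paragraph can be checked numerically. The sketch below assumes that the momentum tensor of equation (16.89) is the antisymmetrized product $e^{mn\kappa\lambda} = \tfrac12(e^{m\kappa}e^{n\lambda} - e^{m\lambda}e^{n\kappa})$; the function name `momentum_tensor` and the random vierbein are purely illustrative.

```python
import random
from itertools import combinations

R4 = range(4)

def momentum_tensor(e):
    """Assumed form of (16.89): E[m][n][k][l] = (e[m][k] e[n][l] - e[m][l] e[n][k]) / 2."""
    return [[[[0.5 * (e[m][k] * e[n][l] - e[m][l] * e[n][k])
               for l in R4] for k in R4] for n in R4] for m in R4]

rng = random.Random(1)
vierbein = [[rng.uniform(-1.0, 1.0) for _ in R4] for _ in R4]  # 16 free numbers e^{m kappa}
E = momentum_tensor(vierbein)

# Antisymmetry in the tetrad pair mn and in the coordinate pair kappa lambda.
antisym_mn = all(abs(E[m][n][k][l] + E[n][m][k][l]) < 1e-15
                 for m in R4 for n in R4 for k in R4 for l in R4)
antisym_kl = all(abs(E[m][n][k][l] + E[m][n][l][k]) < 1e-15
                 for m in R4 for n in R4 for k in R4 for l in R4)

pairs = len(list(combinations(R4, 2)))   # 6 antisymmetric index pairs in 4D
n_components = pairs * pairs             # 6 x 6 = 36 apparent components
n_conditions = pairs * (pairs + 1) // 2  # symmetric 6 x 6 matrix: 21 conditions
n_identities = n_conditions - 1          # one is absorbed by the normalization: 20
print(n_components, n_identities, n_components - n_identities)  # 36 20 16
```

The difference, 36 components minus 20 identities, reproduces the 16 degrees of freedom of the vierbein.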

The gravitational Lagrangian (16.88) can be written 


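$L_g = \frac{1}{8\pi}\,e^{mn\kappa\lambda} \left( \frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} + \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} \right)$ . (16.91)
The Lagrangian (16.91) is in (super-)Hamiltonian form $L_g = p^\kappa \partial_\kappa q - H_g$ with coordinates $q = \Gamma_{mn\lambda}$ and
momenta $p^\kappa = e^{mn\kappa\lambda}/8\pi$,

$L_g = \frac{1}{8\pi}\,e^{mn\kappa\lambda}\,\frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} - H_g$ , (16.92)
and (super-)Hamiltonian $H_g(\Gamma_{mn\lambda}, e^{mn\kappa\lambda})$
$H_g = -\frac{1}{8\pi}\,e^{mn\kappa\lambda}\,\Gamma^p{}_{m\lambda} \Gamma_{pn\kappa}$ . (16.93)

Since a coordinate curl is a torsion-free covariant curl, equation (2.72), the coordinate partial derivatives
$\partial/\partial x^\kappa$ in the Lagrangian (16.91) or in the definition (16.89) of momenta could be replaced by torsion-free
covariant derivatives $\mathring{D}_\kappa$, as was done earlier in the case of the electromagnetic field, equation (16.35).
The development below works with coordinate derivatives, but one could equally well choose to work with
torsion-free covariant derivatives.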


16.7.1 The Lorentz connection is not a tetrad tensor, but any variation of it is 


The Lorentz connection $\Gamma_{mn\lambda} = e^l{}_\lambda \Gamma_{mnl}$ is a coordinate vector but not a tetrad tensor. Although the Lorentz
connection is not a tetrad tensor, any variation of it with respect to an infinitesimal local Lorentz
transformation of the tetrad is a tetrad tensor. Generators of Lorentz transformations are antisymmetric tetrad
tensors, Exercise 11.2. Under a local Lorentz transformation generated by the infinitesimal antisymmetric
tensor $\epsilon_{nm}$, a tetrad vector $a_n$ varies as

$a_n \rightarrow a'_n = a_n + \delta a_n = a_n + \epsilon_n{}^m a_m$ . (16.94)
The variation $\delta a_n$ of the tetrad vector,
$\delta a_n = \epsilon_n{}^m a_m$ , (16.95)


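is thus also a tetrad vector. The Lorentz connection is defined by
$\Gamma_{mn\lambda} \equiv \boldsymbol{\gamma}_m \cdot \partial \boldsymbol{\gamma}_n / \partial x^\lambda$, equation (11.37).
Its variation under an infinitesimal Lorentz transformation generated by the antisymmetric tensor $\epsilon_{nm}$ is

$\delta \Gamma_{mn\lambda} = \delta \left( \boldsymbol{\gamma}_m \cdot \frac{\partial \boldsymbol{\gamma}_n}{\partial x^\lambda} \right) = \boldsymbol{\gamma}_m \cdot \frac{\partial ( \epsilon_n{}^p \boldsymbol{\gamma}_p )}{\partial x^\lambda} + \epsilon_m{}^p \boldsymbol{\gamma}_p \cdot \frac{\partial \boldsymbol{\gamma}_n}{\partial x^\lambda} = \frac{\partial \epsilon_{nm}}{\partial x^\lambda} + \epsilon_n{}^p \Gamma_{mp\lambda} + \epsilon_m{}^p \Gamma_{pn\lambda}$
$\quad = D_\lambda \epsilon_{nm}$ a coordinate and tetrad tensor . (16.96)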


Equation (16.96) shows that the variation $\delta\Gamma_{mn\lambda}$ is a covariant derivative of a tetrad tensor, therefore a
coordinate and tetrad tensor. The variation of the Lorentz connection by a derivative under an infinitesimal
Lorentz transformation is analogous to the variation $\delta A_\lambda = \partial_\lambda \theta$ of the electromagnetic potential $A_\lambda$ by the
gradient of a scalar $\theta$ under a gauge transformation of an electromagnetic field.

As a corollary, it follows that although the Hamiltonian $H_g$, equation (16.93), is not a tetrad scalar, any
variation of it with respect to an infinitesimal local Lorentz transformation is a scalar.


16.8 Variation of the gravitational action 


The gravitational action $S_g$ with the Lagrangian (16.91) is

$S_g = \frac{1}{8\pi} \int e^{mn\kappa\lambda} \left( \frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} + \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} \right) d^4x$ . (16.97)

Equations of motion governing the 16 vierbein $e^{m\kappa}$ and the 24 Lorentz connections $\Gamma_{mn\lambda}$ are obtained
by varying the action (16.97) with respect to these fields. As shown below, variation with respect to the
vierbein $e^{m\kappa}$ yields Einstein’s equations in vacuo, equation (16.105), while variation with respect to the
Lorentz connections $\Gamma_{mn\lambda}$ recovers the torsion-free expression (11.54) for the tetrad-frame connections $\Gamma_{mnl}$,
equation (16.110).

The approach of treating the vierbein and connections as independent fields to be varied is the Hamiltonian
(as opposed to Lagrangian) approach. In the context of the Hilbert action, the Hamiltonian approach is
commonly called the Palatini approach, after Palatini (1919), who first treated the 10 components of the
coordinate metric $g_{\mu\nu}$ and the 40 coordinate connections $\Gamma_{\kappa\mu\nu}$ as independent fields.

Before the gravitational action is varied, the spacetime is a manifold equipped with coordinates $x^\mu$, but
there is no prior coordinate metric $g_{\mu\nu}$, since the metric is determined by the vierbein, which remain
unspecified until determined by the variation itself. Therefore, in varying the action, it is necessary to take the
coordinate volume element $d^4x^{0123}$, which is a pseudoscalar, as the primitive measure of volume. The scalar
volume element $d^4x$ is related to the pseudoscalar coordinate volume element by a factor of the determinant
$e$ of the vierbein, $d^4x = e\,d^4x^{0123}$, equation (15.88), and this determinant $e$ must be varied when the vierbein
are varied.

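Varying the action (16.97) with respect to the vierbein $e^{m\kappa}$ and the Lorentz connections $\Gamma_{mn\lambda}$ yields

$\delta S_g = \frac{1}{8\pi} \int \left[ e^{mn\kappa\lambda}\,\frac{\partial\,\delta\Gamma_{mn\lambda}}{\partial x^\kappa} + e^{mn\kappa\lambda}\,\delta ( \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} ) + \left( \frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} + \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} \right) e^{-1} \delta ( e\,e^{mn\kappa\lambda} ) \right] e\,d^4x^{0123}$ . (16.98)
To arrive at Hamilton’s equations, the first term of the integrand on the right hand side of equation (16.98)
(the $p^\kappa \partial_\kappa (\delta q)$ term) must be integrated by parts, which is accomplished by

$e^{mn\kappa\lambda}\,\frac{\partial\,\delta\Gamma_{mn\lambda}}{\partial x^\kappa} = e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda}\,\delta\Gamma_{mn\lambda} )}{\partial x^\kappa} - e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa}\,\delta\Gamma_{mn\lambda}$ . (16.99)
Since $e^{mn\kappa\lambda}\,\delta\Gamma_{mn\lambda}$ is a coordinate tensor (and also a tetrad tensor, equation (16.96)), the first term on the
right hand side of equation (16.99) is a torsion-free covariant divergence in accordance with equation (2.74)
(the ° atop $\mathring{D}_\kappa$ is a reminder that it is torsion-free),

$e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda}\,\delta\Gamma_{mn\lambda} )}{\partial x^\kappa} = \mathring{D}_\kappa ( e^{mn\kappa\lambda}\,\delta\Gamma_{mn\lambda} )$ , (16.100)
and therefore integrates to a surface term in accordance with Gauss’ theorem (15.102). The remaining terms
in the integrand of equation (16.98) must be expressed in terms of the variations $\delta\Gamma_{mn\lambda}$ and $\delta e^{m\kappa}$ of the
connections and vierbein. The second term in the integrand on the right hand side of equation (16.98) is

$e^{mn\kappa\lambda}\,\delta ( \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} ) = 2\,\Gamma^{[m}{}_{p\kappa}\,e^{n]p\kappa\lambda}\,\delta\Gamma_{mn\lambda}$ . (16.101)

The variation $\delta \ln e$ of the vierbein determinant $e$ may be written in terms of the variation $\delta e^{m\kappa}$ of the
vierbein, equation (2.77),

$\delta \ln e = -e_{m\kappa}\,\delta e^{m\kappa}$ . (16.102)

The last term in the integrand on the right hand side of equation (16.98) is then

$\left( \frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} + \Gamma^p{}_{m\lambda} \Gamma_{pn\kappa} \right) e^{-1} \delta ( e\,e^{mn\kappa\lambda} ) = \tfrac12 R_{\kappa\lambda mn}\,e^{-1} \delta ( e\,e^{m\kappa} e^{n\lambda} ) = \left( R_{\kappa m} - \tfrac12 e_{m\kappa} R \right) \delta e^{m\kappa} = G_{km}\,e^k{}_\kappa\,\delta e^{m\kappa}$ , (16.103)
where $G_{km} \equiv R_{km} - \tfrac12 \gamma_{km} R$ is the tetrad-frame Einstein tensor. The $\tfrac12 \gamma_{km} R$ part of the Einstein tensor
comes from variation of the vierbein determinant, equation (16.102).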
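The determinant identity (16.102), $\delta \ln e = -e_{m\kappa}\,\delta e^{m\kappa}$, is what produces the $\tfrac12\gamma_{km}R$ part of the Einstein tensor, and it can be checked with finite differences. The following is a rough numerical sketch under stated assumptions: the tetrad metric is taken to be Minkowski with signature $-+++$, $e$ is taken to be the determinant of $e^m{}_\kappa$, and the helper names (`inverse_and_det`, `vierbein_and_det`) are hypothetical.

```python
import math
import random

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]  # eta_{mn} = eta^{mn}
R4 = range(4)

def inverse_and_det(a):
    """Gauss-Jordan elimination with partial pivoting: returns (inverse, determinant)."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det
        scale = aug[col][col]
        det *= scale
        aug[col] = [v / scale for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [vr - factor * vc for vr, vc in zip(aug[row], aug[col])]
    return [row[n:] for row in aug], det

def vierbein_and_det(inv_vierbein):
    """From e_m^kappa (rows m) return e^m_kappa (rows m) and its determinant e."""
    transposed = [list(col) for col in zip(*inv_vierbein)]
    vierbein, det_inverse = inverse_and_det(transposed)
    return vierbein, 1.0 / det_inverse

rng = random.Random(0)
inv_e = [[float(m == k) + 0.1 * rng.uniform(-1, 1) for k in R4] for m in R4]  # e_m^kappa
d_inv_e = [[1e-6 * rng.uniform(-1, 1) for _ in R4] for _ in R4]               # small variation

vierbein, det0 = vierbein_and_det(inv_e)
_, det1 = vierbein_and_det([[inv_e[m][k] + d_inv_e[m][k] for k in R4] for m in R4])

# e_{m kappa} = eta_{mn} e^n_kappa and delta e^{m kappa} = eta^{mn} delta e_n^kappa
e_lower = [[sum(ETA[m][n] * vierbein[n][k] for n in R4) for k in R4] for m in R4]
de_upper = [[sum(ETA[m][n] * d_inv_e[n][k] for n in R4) for k in R4] for m in R4]

lhs = math.log(det1) - math.log(det0)                                  # delta ln e
rhs = -sum(e_lower[m][k] * de_upper[m][k] for m in R4 for k in R4)     # -e_{m kappa} delta e^{m kappa}
print(lhs, rhs)
```

The two printed numbers agree to first order in the variation; the residual difference is second order in the perturbation size.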

The substitutions (16.99)-(16.103) bring the variation (16.98) of the gravitational action to

$8\pi\,\delta S_g = \oint e^{mn\kappa\lambda}\,\delta\Gamma_{mn\lambda}\,d^3x_\kappa + \int \left[ \left( -e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa} + 2\,\Gamma^{[m}{}_{p\kappa}\,e^{n]p\kappa\lambda} \right) \delta\Gamma_{mn\lambda} + G_{\kappa m}\,\delta e^{m\kappa} \right] d^4x$ . (16.104)

The surface term vanishes provided that the connections $\Gamma_{mn\lambda}$ are held fixed on the boundary of integration,
so that their variation $\delta\Gamma_{mn\lambda}$ vanishes on the boundary. Hamilton’s equations follow from extremizing the
remaining integral. Extremizing the action (16.104) with respect to the variation $\delta e^{m\kappa}$ of the vierbein yields
Einstein’s equations in vacuo,
$G_{km} = 0$ . (16.105)
Extremizing the action with respect to the variation $\delta\Gamma_{mn\lambda}$ of the Lorentz connections gives
$e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa} = 2\,\Gamma^{[m}{}_{p\kappa}\,e^{n]p\kappa\lambda}$ . (16.106)
Abbreviate the left hand side of equation (16.106) by
$f^{\lambda mn} \equiv e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa}$ , (16.107)
which is antisymmetric in its last two indices, $f_{lmn} = f_{l[mn]}$. In terms of the vierbein derivatives $d_{lmn}$ defined
by equation (11.33), the quantities $f_{lmn}$ defined by equation (16.107) are


fimn = dimn] — imd" fen] + Vind" ikm] - (16.108 
Inverting equation (16.106) yields the tetrad-frame connections $\Gamma_{mnl}$ in terms of $f_{lmn}$,


Inserting the expression (16.108) into equations (16.109) yields the standard torsion-free expression (11.54)
for the tetrad-frame connection $\Gamma_{mnl}$ in terms of vierbein derivatives $d_{lmn}$,


o 


Dmnl = Pmni = 2ditmn| = 3diimn] : (16.110 


The expression for the Ricci scalar in the Hilbert Lagrangian (16.88) is valid with or without torsion, but 
extremization of the action in vacuo has yielded the torsion-free connection. There remains the possibility 
that torsion could be generated by matter, §16.11. 


16.9 Trading coordinates and momenta 


In the Hamiltonian approach, the coordinates and momenta appear on an equal footing. A Lagrangian in
Hamiltonian form $L = p\,\partial q - H$ can be replaced by an alternative Lagrangian $L' = -q\,\partial p - H$ which differs
from the original by a total derivative, $L' = L - \partial(pq)$, and thus yields identical equations of motion. The
alternative Lagrangian $L'$ is in Hamiltonian form with $q \rightarrow p$ and $p \rightarrow -q$.
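The statement that $L$ and $L'$ differ by a total derivative can be illustrated with a toy system with one degree of freedom. The sketch below is an assumption-laden illustration, not the gravitational case: it picks an arbitrary path $q(t) = \sin t$, $p(t) = t^2 + 1$ and a harmonic-oscillator Hamiltonian, and checks numerically that the two actions differ only by the boundary term $-[pq]$. All function names are hypothetical.

```python
import math

# An arbitrary (not necessarily on-shell) path through phase space.
def q(t): return math.sin(t)
def qdot(t): return math.cos(t)
def p(t): return t * t + 1.0
def pdot(t): return 2.0 * t

def hamiltonian(qv, pv):
    """Toy Hamiltonian; any H(q, p) would do, since it cancels in the difference."""
    return 0.5 * (pv * pv + qv * qv)

def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

t0, t1 = 0.0, 2.0
S = integrate(lambda t: p(t) * qdot(t) - hamiltonian(q(t), p(t)), t0, t1)       # L  = p dq/dt - H
S_alt = integrate(lambda t: -q(t) * pdot(t) - hamiltonian(q(t), p(t)), t0, t1)  # L' = -q dp/dt - H
boundary = p(t1) * q(t1) - p(t0) * q(t0)                                        # [p q] at the endpoints

print(S_alt - S, -boundary)
```

Since the difference depends only on the endpoint values of $pq$, variations that are held fixed on the boundary cannot distinguish the two actions, which is why the equations of motion coincide.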

Consider integrating the first term of the gravitational Lagrangian (16.91) by parts (this is essentially the
same integration by parts as (16.99), but with the connection $\Gamma_{mn\lambda}$ itself instead of the varied connection
$\delta\Gamma_{mn\lambda}$),

$e^{mn\kappa\lambda}\,\frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} = e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda}\,\Gamma_{mn\lambda} )}{\partial x^\kappa} - e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa}\,\Gamma_{mn\lambda}$ . (16.111)

Now $e^{mn\kappa\lambda}\,\Gamma_{mn\lambda}$ is a coordinate tensor but not a tetrad tensor. However, its variation with respect to any
infinitesimal Lorentz transformation is a tetrad tensor, §16.7.1. Therefore the variation of the first term
on the right hand side of equation (16.111) is a torsion-free covariant divergence
$\mathring{D}_\kappa\,\delta ( e^{mn\kappa\lambda}\,\Gamma_{mn\lambda} )$, which
can be discarded from the Lagrangian without changing the equations of motion. The resulting alternative
gravitational Lagrangian is

$8\pi L'_g = -e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa}\,\Gamma_{mn\lambda} + e^{mn\kappa\lambda}\,\Gamma^p{}_{m\lambda} \Gamma_{pn\kappa}$ . (16.112)

Again, this alternative Lagrangian is a coordinate scalar but not a tetrad scalar, but any variation of it is a
tetrad scalar, so is a satisfactory Lagrangian.
In this alternative Lagrangian (16.112), the coordinates are the vierbein $e^{m\kappa}$, and the corresponding
canonically conjugate momenta are


a 81 OL, MK 
Tr) = ade" ax") =e ero ae (16.113) 
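where $\pi_{nm\lambda}$ and $\Gamma_{nm\lambda}$ are related by
$\pi_{nm\lambda} \equiv \Gamma_{nm\lambda} - e_{n\lambda} \Gamma^p{}_{mp} + e_{m\lambda} \Gamma^p{}_{np}$ , $\quad \Gamma_{nm\lambda} = \pi_{nm\lambda} - \tfrac12 e_{n\lambda} \pi^p{}_{mp} + \tfrac12 e_{m\lambda} \pi^p{}_{np}$ . (16.114)

Like the tetrad connection $\Gamma_{nm\lambda}$, the covariant momentum $\pi_{nm\lambda}$ is antisymmetric in its first two indices
$nm$, and therefore has $6 \times 4 = 24$ independent components. The traces are related by $\pi^p{}_{mp} = -2\,\Gamma^p{}_{mp}$. The
alternative Lagrangian (16.112) is in Hamiltonian form $L'_g = p^\lambda \partial_\lambda q - H_g$ with coordinates $q = e^{m\kappa}$ and
momenta $p^\lambda = \pi_{m\kappa}{}^\lambda / 8\pi$,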
1 _ der 
Pa 
a z gg T^ ak o Hg l (16.115) 


and the same (super-)Hamiltonian (16.93) as before. 
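The invertible relation between the momentum $\pi_{nm\lambda}$ and the connection $\Gamma_{nm\lambda}$ can be checked numerically. The sketch below assumes that relation (16.114) reads $\pi_{nml} = \Gamma_{nml} - \eta_{nl}\Gamma^p{}_{mp} + \eta_{ml}\Gamma^p{}_{np}$ when all indices are referred to an orthonormal tetrad frame (so that $e_{n\lambda}$ becomes $\eta_{nl}$); this reading of the garbled source, the random test tensor, and the function names are all assumptions.

```python
import random

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of eta_{mn}; eta^{mn} has the same entries
R4 = range(4)

def trace(t):
    """t^p_{mp} = eta^{pq} t_{qmp}, returned as a list indexed by m."""
    return [sum(ETA[p] * t[p][m][p] for p in R4) for m in R4]

def shift(t, weight):
    """Return t_{nml} + weight * (-eta_{nl} t^p_{mp} + eta_{ml} t^p_{np})."""
    tr = trace(t)
    return [[[t[n][m][l]
              + weight * (-(ETA[n] if n == l else 0.0) * tr[m]
                          + (ETA[m] if m == l else 0.0) * tr[n])
              for l in R4] for m in R4] for n in R4]

def gamma_to_pi(gamma):
    return shift(gamma, 1.0)   # assumed first relation of (16.114)

def pi_to_gamma(pi):
    return shift(pi, 0.5)      # assumed second relation of (16.114)

# Random connection, antisymmetric in its first two indices: 6 x 4 = 24 numbers.
rng = random.Random(2)
gamma = [[[0.0] * 4 for _ in R4] for _ in R4]
for n in R4:
    for m in range(n + 1, 4):
        for l in R4:
            value = rng.uniform(-1.0, 1.0)
            gamma[n][m][l] = value
            gamma[m][n][l] = -value

pi = gamma_to_pi(gamma)
round_trip_error = max(abs(pi_to_gamma(pi)[n][m][l] - gamma[n][m][l])
                       for n in R4 for m in R4 for l in R4)
trace_error = max(abs(tp + 2.0 * tg) for tp, tg in zip(trace(pi), trace(gamma)))
print(round_trip_error, trace_error)
```

Both printed errors are at the level of floating-point rounding, consistent with the trace relation $\pi^p{}_{mp} = -2\,\Gamma^p{}_{mp}$ quoted in the text.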

Equations of motion come from varying the alternative action $S'_g$ with respect to the coordinates $e^{m\kappa}$ and
momenta $\pi_{mn\lambda}$. The coefficients of the variations $\delta e^{m\kappa}$ and $\delta\pi_{mn\lambda}$ are linear combinations of the
coefficients of $\delta e^{m\kappa}$ and $\delta\Gamma_{mn\lambda}$ in the varied action of equation (16.104). The end result is the same
equations of motion as before, equations (16.105) and (16.110). The only difference is that variation of the
alternative action gives a revised surface term,


87 ôS, = f Tame dr” + / as eq. (16.104) . (16.116) 


The surface term vanishes provided that the vierbein $e^{m\kappa}$ is held fixed on the boundary.


16.10 Matter energy-momentum and the Einstein equations with matter 


Einstein’s equations in vacuo, equation (16.105), emerged from varying the gravitational action with respect 
to the vierbein. Einstein’s equations including matter are obtained by including the variation of the matter 
action with respect to the vierbein. The variation of the matter action $S_m$ with respect to the vierbein defines
the energy-momentum tensor $T_{\kappa m}$ of matter,


$\delta S_m = -\int T_{\kappa m}\,\delta e^{m\kappa}\,d^4x$ . (16.117)


Adding the variation (16.104) of the gravitational action and the variation (16.117) of the matter action 
gives 


$8\pi\,(\delta S_g + \delta S_m) = \int ( G_{\kappa m} - 8\pi\,T_{\kappa m} )\,\delta e^{m\kappa}\,d^4x$ , (16.118)


extremization of which implies Einstein’s equations in the presence of matter


$G_{km} = 8\pi\,T_{km}$ . (16.119)


The Einstein equations (16.119) constitute a set of 16 equations. Conditions on the energy-momentum 
imposed by the invariance of the matter action under local Lorentz transformations and under coordinate 
transformations are discussed in §§16.11.1 and 16.11.2 below. 

If the matter action is $S_m = \int L_m\,d^4x$, then the matter energy-momentum is the sum of a part from the
variation of the matter Lagrangian $L_m$, and a part from the variation of the vierbein determinant in the
scalar volume element $d^4x = e\,d^4x^{0123}$,

$T_{\kappa m} = -\frac{\partial L_m}{\partial e^{m\kappa}} + e_{m\kappa}\,L_m$ . (16.120)


16.11 Spin angular-momentum 


In the standard $U_Y(1) \times SU(2) \times SU(3)$ model of physics, the connections associated with the gauge groups
are dynamical fields, the gauge bosons, which include photons, weak gauge bosons, and gluons. As has been
seen above, the gauge symmetries of general relativity include not only coordinate transformations, encoded
in the vierbein $e^{m\kappa}$, but also Lorentz transformations, encoded in the Lorentz connection $\Gamma_{mn\lambda}$. Treating
the vierbein as a dynamical field leads to Einstein’s equations (16.119) and standard general relativity. If the
Lorentz connection is treated similarly as a dynamical field, as it surely should be, then the inevitable
consequence is the extension of general relativity to include torsion, which is called Einstein-Cartan theory.

Einstein-Cartan theory follows general relativity in taking the Lagrangian to be the Hilbert Lagrangian, the
only difference being that the Lorentz connections $\Gamma_{mn\lambda}$ in the Riemann tensor are allowed to have torsion.
The Riemann tensor with torsion equals the torsion-free Riemann tensor plus extra terms depending on the 
contortion, equation (15.49). Since torsion is a tensor, it is possible to include additional torsion-dependent 
terms in the Lagrangian (Hammond, 2002; Hehl, 2012; Blagojević and Hehl, 2013), but the various possible 
extensions go beyond the scope of this book. 

As shown below, in Einstein-Cartan theory, torsion vanishes in empty space, and it does not propagate as 
a wave, unlike the (trace-free, Weyl part of the) Riemann curvature. Consequently conventional experimental 
tests of gravity do not rule out torsion. The gravitational force is intrinsically much weaker than the other 
three forces of the standard model. It makes itself felt only because gravity is long-ranged, and cumulative
with mass. Since torsion in Einstein-Cartan theory is local, it is hard to detect.

Just as the variation of the matter action with respect to the vierbein $e^{m\kappa}$ defines the energy-momentum
tensor $T_{km}$, so also the variation of the matter action with respect to the Lorentz connections $\Gamma_{mn\lambda}$ defines
the spin angular-momentum tensor $\Sigma^{\lambda mn}$,

$\delta S_m = \frac12 \int \Sigma^{\lambda mn}\,\delta\Gamma_{mn\lambda}\,d^4x$ (16.121)

(implicitly summed over both indices $m$ and $n$). The spin angular-momentum tensor $\Sigma^{\lambda mn}$ is so-called because
it is sourced by the spin of fermionic fields such as Dirac fields, Exercise 16.5. The spin angular-momentum
vanishes for gauge fields such as the electromagnetic field, Exercise 16.4. Like the torsion tensor $S_{\lambda mn}$, the
spin angular-momentum $\Sigma^{\lambda mn}$ is antisymmetric in its last two indices $mn$. Adding the variation (16.104) of
the gravitational action and the variation (16.121) of the matter action gives


—1 MNKAÀ 
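$8\pi\,(\delta S_g + \delta S_m) = \int \left( -e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa} + 2\,\Gamma^{[m}{}_{p\kappa}\,e^{n]p\kappa\lambda} + 4\pi\,\Sigma^{\lambda mn} \right) \delta\Gamma_{mn\lambda}\,d^4x$ , (16.122)

extremization of which implies
$e^{-1}\,\frac{\partial ( e\,e^{mn\kappa\lambda} )}{\partial x^\kappa} = 2\,\Gamma^{[m}{}_{p\kappa}\,e^{n]p\kappa\lambda} + 4\pi\,\Sigma^{\lambda mn}$ . (16.123)
Inverting equation (16.123) along the lines of equations (16.106)-(16.110) recovers the usual expression (11.55)
for the torsion-full tetrad connection $\Gamma_{mn\lambda}$ as a sum of the torsion-free connection $\mathring{\Gamma}_{mn\lambda}$ given by
equation (16.110), and a contortion tensor $K_{mn\lambda}$,

$\Gamma_{mn\lambda} = \mathring{\Gamma}_{mn\lambda} + K_{mn\lambda}$ , (16.124)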


with the contortion tensor Kmnı being related to the spin angular-momentum Yymn by 


Dot =8r7 ( LImn 4 3 Omen “Vifm=” np) . (16.125) 


The contortion Kmnı is related to the torsion Smnı by equations (11.56). Equation (16.125) implies that the 
torsion S\mn is related to the spin angular-momentum mn by 


Sry = 8m (Dn + em Dh) (16.126) 


Equation (16.126) inverts to 


Shn + 2€[m* Shy, = 8TEmn | - (16.127) 


Equation (16.127) relating the torsion to the spin angular-momentum is the analogue of Einstein’s equa- 
tions (16.119) relating the Einstein tensor to the matter energy-momentum. Whereas the Einstein equa- 
tions (16.119) determine only 10 of the 20 components of the Riemann tensor (for vanishing torsion) leav- 
ing 10 components (the Weyl tensor) to describe tidal forces and gravitational waves, the torsion equa- 
tions (16.127) determine all 24 components of the torsion tensor in terms of the 24 components of the spin 
angular-momentum. Thus, at least in this vanilla version of general relativity with torsion, torsion vanishes
in empty space, and it cannot propagate as a wave. 

An equivalent spin angular-momentum tensor Amn is obtained by varying the matter action with respect 
tO Tmna in place of Pinna, 


Sn = i [Sm OT mn» dx . (16.128) 


The relation between the torsion SÀ 


An and the modified spin angular-momentum D is 


(16.129) 


Comparing equation (16.129) to equation (16.126) shows that the modified and original spin angular- 
momenta >, and Xà p differ by a trace term, 


Ehn = Ehn tem Diro Ehn = Ehn + 2em” Be (16.130) 


As seen above, the torsion, contortion, and spin angular-momentum tensors are all invertibly related to each 
other. The relations between them are conceptually clearer when decomposed into irreducible parts. Each is 
a 24-component tensor that decomposes into a 4-component trace part, a 4-component totally antisymmetric 
part, and a remaining 16-component trace-free antisymmetry-free part. The torsion $S_{lmn}$, contortion $K_{lmn}$,
and spin angular-momentum $\Sigma_{lmn}$ package these parts with different weights. The three parts are related
by


st, =K, =—4nuk, =8nbk, trace part. , (16.131a) 
Sttmn] = 2K¢mni = 87 Upmn] = 81S tmn| totally antisymmetric part , (16.131b) 
Stmn =—Kmnt =8t2imn = 8tXimn trace-free, antisymmetry-free part . (16.131c) 
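The component counts quoted in this section follow from elementary combinatorics. As a quick sanity check (a sketch only, with hypothetical function names), the following counts the irreducible parts of a rank-3 tensor antisymmetric in two indices, and the split of the torsion-free Riemann tensor into Ricci and Weyl parts, in $n = 4$ dimensions.

```python
from math import comb

def torsion_like_parts(n=4):
    """Counts for a rank-3 tensor antisymmetric in its last two indices."""
    total = n * comb(n, 2)             # free index times antisymmetric pair: 4 x 6 = 24
    trace_part = n                     # one contraction leaves a vector: 4
    totally_antisymmetric = comb(n, 3) # 3-form part: 4
    remainder = total - trace_part - totally_antisymmetric  # trace-free, antisymmetry-free: 16
    return total, trace_part, totally_antisymmetric, remainder

def riemann_parts(n=4):
    """Counts for the torsion-free Riemann tensor and its Ricci and Weyl pieces."""
    riemann = n * n * (n * n - 1) // 12  # 20 in four dimensions
    ricci = n * (n + 1) // 2             # symmetric Ricci (or Einstein) tensor: 10
    weyl = riemann - ricci               # trace-free remainder: 10
    return riemann, ricci, weyl

print(torsion_like_parts())  # (24, 4, 4, 16)
print(riemann_parts())       # (20, 10, 10)
```

The contrast is that the 24 torsion components are all fixed algebraically by the 24 spin components, whereas only 10 of the 20 Riemann components are fixed by Einstein's equations.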


16.11.1 Conservation of angular-momentum and the symmetry of the 
energy-momentum tensor 


The action $S_m$ of any matter field is invariant under Lorentz transformations. Symmetry under Lorentz
transformations implies a conservation law (16.136) of angular-momentum. If torsion vanishes, the conservation
law (16.136) implies that the energy-momentum tensor $T^{mn}$ of the field is symmetric, equation (16.137).
I thank Prof. Fred Hehl for pointing out that the antisymmetric part of the energy-momentum tensor can
be interpreted consistently as half the divergence of orbital angular-momentum, §19(c) of Corson (1953),
so that equation (16.136) can be interpreted as a conservation law of total angular momentum, spin plus
orbital.

Equation (16.95) gives the variation of a tetrad vector under a local Lorentz transformation generated by
the infinitesimal antisymmetric tensor $\epsilon_{mn}$. Under such an infinitesimal Lorentz transformation, the vierbein
tensor $e^{m\kappa}$ varies as


$\delta e^{m\kappa} = \epsilon^m{}_n\,e^{n\kappa} = \epsilon^{m\kappa}$ . (16.132)
Equation (16.96) gives the variation of the Lorentz connection under an infinitesimal Lorentz transformation
generated by $\epsilon_{mn}$,

$\delta\Gamma_{mn\lambda} = D_\lambda \epsilon_{nm}$ . (16.133)

The coefficients of the variation $\delta S_m$ of the matter action with respect to $\delta e^{m\kappa}$ and $\delta\Gamma_{mn\lambda}$ are by
definition the energy-momentum and spin angular-momentum of the matter, equations (16.117) and (16.121).
Inserting the variations (16.132) and (16.133) with respect to Lorentz transformations yields the variation of the
matter action under a Lorentz transformation,


Sm = — I (12A Dyemn + Dome) d'z . (16.134) 
An integration by parts brings the variation to 
Sm = -$ iDa emn de> + / (4D, + rin) Emn d'a . (16.135) 


Requiring that the matter action be invariant under Lorentz transformations imposes that the variation 
(16.135) must vanish under arbitrary variations of the antisymmetric Lorentz generators €mn, subject to the 
generators being fixed on the initial and final hypersurfaces of integration. Therefore the integrand of the 
rightmost integral in equation (16.135) must vanish, implying the conservation law 


SDE 4 r™ =0]. (16.136) 


If the spin angular-momentum of the matter component vanishes, $\Sigma^{\lambda mn} = 0$, then the energy-momentum
tensor of the matter component is symmetric,


$T^{mn} = T^{nm}$ . (16.137)


16.11.2 Conservation of energy-momentum 


The action $S_m$ of any matter field is also invariant under coordinate transformations. Symmetry under
coordinate transformations implies a conservation law (16.145) for the energy-momentum $T^{mn}$ of the field.

Under a coordinate transformation generated by the coordinate shift $\delta x^\kappa = \epsilon^\kappa$, the variation of any
quantity is given by minus its Lie derivative $\mathcal{L}_\epsilon$ with respect to the coordinate shift $\epsilon^\kappa$, equation (7.125).
The Lie derivative of a coordinate tensor is given by equation (7.153), and this equation continues to hold 
for tensors that are tetrad as well as coordinate tensors, the tetrad components being treated as coordinate 
scalars (because tetrad components are unchanged under a coordinate transformation). However, a difficulty 
arises because the Lie derivative of a tetrad tensor is not a tetrad tensor (see Concept Question 26.2). 
Consequently, although the vierbein is a coordinate and tetrad tensor, its Lie derivative is a coordinate 
tensor but not a tetrad tensor. The solution to the difficulty is pointed out at the beginning of §5.2.1 of Hehl 
et al. (1995): the Lagrangian is a Lorentz scalar, so its coordinate derivative is also its Lorentz-covariant 
derivative. Thus in varying the Lagrangian, the coordinate derivative of any tetrad tensor can be replaced
by its Lorentz-covariant derivative. The Lorentz-covariant Lie derivative Lre of the vierbein is 


. Oc” Oe™* . 
i.e" oe" er ( + re" 


Ox Ox* 
— Dre" = €,k™> 
== DPE «9° , (16.138) 


which differs from equation (26.18) in that the derivative of $e^{m\kappa}$ on the right hand side of the first line
is covariant with respect to the tetrad index $m$. The expressions on the second and third lines of
equations (16.138) are equivalent; the second line is in terms of the torsion-free covariant derivative $\mathring{D}$, while the
third line is in terms of the torsion-full covariant derivative $D$. Thus the vierbein tensor $e^{m\kappa}$ varies under a
coordinate transformation as, equation (26.18),


be™ = —Lpee™" = De + e,Kmer : (16.139) 


The Lorentz connection $\Gamma_{mn\lambda}$ is not a tetrad-frame tensor, so the usual formula for the Lie derivative
does not apply. Rather, the variation $\delta\Gamma_{mn\lambda}$ of the Lorentz connection follows from a difference of covariant
derivatives,


Dyan — Dy day = (an — TRV Am) — (aðan — rph fam) = —(OT Jam - (16.140) 
Thus the variation of the Lorentz connection under a coordinate transformation by $\epsilon^\kappa$ satisfies


Oc” 


_[{ O y 
(6D mn,)a”™ = Lye(D) an) — Dy LreGn =g" ——D)ay = T7 Dyam + (D,,Qn)—> = D(t Dran) 
Ox" Ox 
= 6° a Ryne (16.141) 
Equation (16.141) is true for arbitrary a™, so 
bP mn = e" Rycmn . (16.142) 


Inserting the variations (16.139) and (16.142) of the vierbein and Lorentz connection into the variations
(16.117) and (16.121) of the matter action yields the variation of the matter action under a coordinate
transformation by $\epsilon^\kappa$,

Sm = / |- Ton (ore + km) +y” e Rana d'r . (16.143) 
An integration by parts brings the variation of the matter action to 


TAE f Toye" Bar + J (O Ein HTK mne HEE Bona) e d'z. (16.144) 


Invariance of the action under coordinate transformations requires that the variation (16.144) vanish for
arbitrary coordinate shifts $\epsilon^\kappa$ that vanish on the boundary. Therefore the integrand of the rightmost integral
in equation (16.144) must vanish, implying the law of conservation of energy-momentum,


D" Tem +T" Kman +12” ee = 0 ||: (16.145) 


Since the contortion $K_{mn\kappa}$ is antisymmetric in its first two indices $mn$, the second term of the conservation
law (16.145) depends only on the antisymmetric part $T^{[mn]}$ of the energy-momentum tensor.
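
The step can be spelled out in one line. This is a short worked equation that uses only the antisymmetry $K_{mn\kappa} = -K_{nm\kappa}$ stated above; the split of $T^{mn}$ into symmetric and antisymmetric parts, $T^{mn} = T^{(mn)} + T^{[mn]}$, is standard notation assumed here.

```latex
T^{(mn)} K_{mn\kappa}
  = T^{(nm)} K_{nm\kappa}      % relabel the dummy indices m <-> n
  = - T^{(mn)} K_{mn\kappa}    % symmetry of T^{(mn)}, antisymmetry of K_{mn\kappa}
\;\;\Longrightarrow\;\;
T^{(mn)} K_{mn\kappa} = 0 ,
\qquad\text{hence}\qquad
T^{mn} K_{mn\kappa} = T^{[mn]} K_{mn\kappa} .
```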

If the spin angular-momentum of the matter component vanishes, $\Sigma^{\lambda mn} = 0$, then its matter
energy-momentum tensor $T^{mn}$ is symmetric, equation (16.136), and the energy-momentum conservation
equation (16.145) of the matter component simplifies to


$$ \mathring{D}^m T_{\kappa m} = 0 . \tag{16.146} $$


Concept question 16.1. Can the coordinate metric be Minkowski in the presence of torsion? 
Can the coordinate metric be the Minkowski metric $g_{\mu\nu} = \eta_{\mu\nu}$ over a finite region of spacetime where torsion
does not vanish? Answer. As discussed in Concept Question 2.5, yes, torsion could technically be finite even 
in flat (Minkowski) space. In practice, no, because torsion at any point of spacetime is determined by the 
spin angular-momentum of matter there, which contributes energy-momentum that ensures that the metric 
is not Minkowski over the finite region (of course, the metric can always be made locally Minkowski). 


Concept question 16.2. What kinds of metric or vierbein admit torsion? Answer. Any kind. 
Coordinate derivatives of the metric or vierbein determine torsion-free connections, placing no constraint on 
torsion. 


Concept question 16.3. Why the names matter energy-momentum and spin angular-momentum? 
What is the justification for calling $T_{\kappa m}$ the matter energy-momentum and $\Sigma^{\lambda mn}$ the spin angular-momentum?
Answer. In flat spacetime, conservation of energy and momentum are associated with translation symmetry 
with respect to time and space. Conservation of angular momentum is associated with rotational symme- 
try of space. In general relativity, these global symmetries are replaced by local symmetries. Translation 
symmetry is replaced by symmetry under coordinate transformations; rotational symmetry is replaced by 
symmetry under local Lorentz transformations (which include Lorentz boosts as well as spatial rotations). 
The matter energy-momentum tensor $T_{\kappa m}$ satisfies a conservation law (16.145) that arises as a result of
symmetry under coordinate transformations. The spin angular-momentum tensor $\Sigma^{\lambda mn}$ satisfies a conservation
law (16.136) that arises as a result of symmetry under local Lorentz transformations. The reason for
the adjective “spin” is that, as seen in Exercises 16.4 and 16.5, spin angular-momentum vanishes for bosonic 
fields such as electromagnetism, but is non-vanishing for fermionic (half-integral spin) fields. 


Exercise 16.4. Energy-momentum and spin angular-momentum of the electromagnetic field. De- 
rive the energy-momentum and spin angular-momentum of the electromagnetic field. The energy-momentum 
and spin angular-momentum of a field are defined by equations (16.117) and (16.121). 

Solution.
1. Energy-momentum of the electromagnetic field. The Lagrangian of the electromagnetic field is,
equation (16.28),


$$ L = - \frac{1}{16\pi} \, g^{\kappa\mu} g^{\lambda\nu} F_{\kappa\lambda} F_{\mu\nu} , \tag{16.147} $$


where the inverse metric is in terms of the vierbein,

$$ g^{\kappa\lambda} = \eta^{mn} e_m{}^\kappa e_n{}^\lambda . \tag{16.148} $$
The fact that the Lagrangian depends on the vierbein only in the symmetrized combination constituting 
the inverse metric guarantees that the energy-momentum tensor is symmetric. The variation of the 
electromagnetic Lagrangian (16.147) with respect to the vierbein is 


$$ \delta L = - \frac{1}{4\pi} \, F_\kappa{}^\lambda F_{m\lambda} \, \delta e^{m\kappa} . \tag{16.149} $$


An additional contribution to the energy-momentum comes from variation of the vierbein determinant
in the volume element, equation (16.120). The resulting tetrad-frame energy-momentum tensor $T_{kl}$ of
the electromagnetic field is the symmetric tensor


$$ T_{kl} = \frac{1}{4\pi} \left( F_{km} F_l{}^m - \frac{1}{4} \eta_{kl} F_{mn} F^{mn} \right) . \tag{16.150} $$


The factor $1/4\pi$ is for Gaussian units, and is not present in Heaviside units.

2. Spin angular-momentum of the electromagnetic field. The Lagrangian of the electromagnetic 
field depends on the torsion-free curl of the electromagnetic potential, so does not involve any Lorentz 
connections. Therefore the spin angular-momentum of the electromagnetic field is zero, 


$$ \Sigma_{lmn} = 0 . \tag{16.151} $$
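
The energy-momentum tensor (16.150) of part 1 can be checked numerically. The following is a minimal sketch, not part of the original solution: it assumes a flat tetrad metric of signature −+++, Gaussian units, and the common convention $F_{0i} = -E_i$, $F_{ij} = \epsilon_{ijk} B_k$ (an assumption made here for illustration). The function names `field_tensor` and `energy_momentum` are hypothetical.

```python
import math

# Minkowski tetrad metric eta_{mn}, signature -+++.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def field_tensor(E, B):
    """Covariant F_{mn}, assuming F_{0i} = -E_i and F_{ij} = eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]
        F[i + 1][0] = E[i]
    F[1][2], F[2][1] = B[2], -B[2]
    F[2][3], F[3][2] = B[0], -B[0]
    F[3][1], F[1][3] = B[1], -B[1]
    return F


def energy_momentum(F):
    """T_{kl} = (1/4pi) (F_{km} F_l^m - (1/4) eta_{kl} F_{mn} F^{mn})."""
    # eta is diagonal and its own inverse, so raising an index multiplies
    # by the corresponding diagonal entry of ETA.
    F_squared = sum(
        F[m][n] * F[m][n] * ETA[m][m] * ETA[n][n]
        for m in range(4)
        for n in range(4)
    )
    T = [[0.0] * 4 for _ in range(4)]
    for k in range(4):
        for l in range(4):
            contraction = sum(F[k][m] * F[l][m] * ETA[m][m] for m in range(4))
            T[k][l] = (contraction - 0.25 * ETA[k][l] * F_squared) / (4 * math.pi)
    return T


E, B = (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)
T = energy_momentum(field_tensor(E, B))
energy_density = T[0][0]                            # equals (E^2 + B^2) / (8 pi)
trace = sum(ETA[k][k] * T[k][k] for k in range(4))  # vanishes for electromagnetism
```

Under these assumptions the tensor comes out symmetric and traceless, and its time-time component reproduces the familiar energy density $(E^2 + B^2)/8\pi$.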


Exercise 16.5. Energy-momentum and spin angular-momentum of a Dirac field. Find the energy- 
momentum and spin angular-momentum of a Dirac spinor field. 
Solution. 

1. Energy-momentum of a Dirac field. The Lagrangian of a Dirac field is, equation (41.4), 


L= hy: (òD +m) y iy (Dm), (16.152) 


where the (torsion-full) covariant derivative is $D_k = \partial_k + \tfrac{1}{4} \Gamma_{mnk} \, \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n$ (implicit sum over both indices
$m$ and $n$). The two terms in the Lagrangian (16.152) are complex conjugates of each other, ensuring
that the Lagrangian is real. Variation with respect to the vierbein $e^{k\kappa}$ yields the energy-momentum
tensor $T_{lk} = e_l{}^\kappa T_{\kappa k}$, which is not symmetric in $lk$,


Tix = 4  WDib — Did. (16.153) 


The fact that the energy-momentum tensor of a Dirac field is not symmetric is associated with the fact that
the spin angular-momentum of the field does not vanish, §16.11.1. The Dirac Lagrangian $L$ vanishes on
the equations of motion, so the contribution to the energy-momentum, equation (16.120), arising from
variation of the vierbein determinant in the scalar volume element $d^4x \equiv e \, d^4x^{0123}$ vanishes. Again, the
two terms in the energy-momentum (16.153) are complex conjugates of each other, ensuring that the
energy-momentum is real.

2. Spin angular-momentum of a Dirac field. Variation with respect to the connection $\Gamma_{mn\lambda}$ yields the
spin angular-momentum $\Sigma_{lmn} = e_l{}^\lambda \Sigma_{\lambda mn}$, which is a trivector current totally antisymmetric in $lmn$,


The possible vector current contribution cancels between the two terms on the right hand side of 
equation (16.152). 


Exercise 16.6. Electromagnetic field in the presence of torsion. Does torsion affect the propagation 
of the electromagnetic field? 

Solution. No. The electromagnetic field equations involve only torsion-free derivatives, so the propagation 
of the electromagnetic field is unaffected by torsion. 


Exercise 16.7. Dirac spinor field in the presence of torsion. How does torsion affect the propagation
of a massive Dirac spin-$\tfrac{1}{2}$ field? Assume for simplicity that the background metric is Minkowski, that the
spinor field is uniform (a plane wave) and at rest, and that the spin angular-momentum $\Sigma_{mnk}$ is uniform.
Solution. The torsion-free part of the connection vanishes for a Minkowski metric, so the only non-vanishing
part of the connection is the contortion $K_{mnk}$. If the spin angular-momentum is uniform, then so is the
contortion. The equation of motion of a Dirac spinor field of rest mass $m$ is


[y (On + {Kma YAY") +m) vy =0. (16.155) 


For simplicity, go to the rest frame of the spinor field, where the particle is in a time-up and spin-up eigenstate
$\psi \propto \epsilon_{\Uparrow\uparrow}$, equation (14.108), which means that the particle is a particle, not an antiparticle, and its spin is
along the positive 3-direction. The only Dirac $\gamma$-matrices that are non-vanishing when acting on a spinor $\psi$
in this state are $\boldsymbol{\gamma}_0$ and $\boldsymbol{\gamma}_1 \wedge \boldsymbol{\gamma}_2$. Thus the equation of motion in the rest frame is


(80 + 3Kjo1q) + Koa +m) Y =0. (16.156) 
The solutions are 
oe ETE (16.157) 
where the mass change 6m is 
dm = $ Kjo + Koa = 4TG (Zo1q] — Nba) > (16.158) 


the contortion being related to the spin angular-momentum $\Sigma_{mnk}$ by equations (16.131). Thus the effect of
torsion is to change the effective mass $m$ of the spinor particle. The trace part of the spin angular-momentum
produces a mass change that has opposite signs for particles and antiparticles, but is independent of the
direction of the spin of the particle, while the totally antisymmetric part of the spin angular-momentum
produces a mass change that depends on the direction of the spin of the particle. As seen in Exercise 16.5, a
Dirac spinor field produces only a totally antisymmetric spin angular-momentum $\Sigma_{[mnk]}$. This antisymmetric
component is directional, so tends to cancel if the spins of the background system of spinor particles are
pointed in random directions. The antisymmetric spin angular-momentum is significant only if the spins
of the background particles are aligned. Whatever the case, since the gravitational coupling $G$ is so weak
compared to typical electromagnetic couplings, the resulting change in the mass of a spinor is typically tiny.


16.12 Lagrangian as opposed to Hamiltonian formulation 


In the Lagrangian approach to the least action principle, as opposed to the Hamiltonian approach followed
above, the Lagrangian is required to be a function of the coordinates and velocities, as opposed to the
momenta. For gravity, the coordinates are the vierbein $e^{m\kappa}$, and the velocities are their coordinate derivatives
$\partial e^{m\kappa} / \partial x^\lambda$. In the Lagrangian approach, the Lorentz connections $\Gamma_{mn\lambda}$ are not independent coordinates, but
rather are taken to be given in terms of the coordinates and velocities $e^{m\kappa}$ and $\partial e^{m\kappa} / \partial x^\lambda$. In other words,
the Lorentz connections are assumed to satisfy the equations of motion that in the Hamiltonian approach
are derived by varying the action with respect to the connections.

The Hilbert Lagrangian depends not only on the vierbein and its first derivatives, but also on its second 
derivatives. To bring the Hilbert Lagrangian to a form that depends only on the first, not second, derivatives 
of the vierbein, the Hilbert action must be integrated by parts. This is precisely the integration by parts 
that was carried out in the previous section §16.9. In the Lagrangian approach, the alternative Lagrangian 
$L'_g$ given by equation (16.112) provides a satisfactory Lagrangian, once the connections $\Gamma_{mn\lambda}$ are expressed
in terms of the vierbein $e^{m\kappa}$ and its first derivatives.


16.12.1 Quadratic gravitational Lagrangian 


The derivative term on the right hand side of the expression (16.112) for the Lagrangian $L'_g$ was previously
determined by Hamilton’s equations to be given by equation (16.106), in which the connection proved to
be the torsion-free connection. Substituting equation (16.106) (with torsion-free connection $\mathring{\Gamma}_{mn\lambda}$) brings the
alternative Lagrangian (16.112) to


mr 


8r Li, = e™rà (- OT? P pne + aw) = (- al met RaT) ; (16.159) 


the last step of which follows from expanding the torsion-full connection as a sum of the torsion-free
connection and the contortion tensor, $\Gamma_{mn\kappa} = \mathring{\Gamma}_{mn\kappa} + K_{mn\kappa}$, equation (11.55). The torsion-free connections
$\mathring{\Gamma}_{mn\kappa} = e^k{}_\kappa \mathring{\Gamma}_{mnk}$ here are given by expression (16.110) (same as equation (11.54)), which are functions of the
vierbein, linear in its first derivatives. The Lagrangian (16.159) is quadratic in the torsion-free connections,
and therefore quadratic in the first derivatives of the vierbein, but independent of any second derivatives.
If torsion vanishes, as general relativity assumes, then


8r Liy = — eA gaa (16.160) 


Thus, for vanishing torsion, the first (“surface”) term in the original alternative Lagrangian (16.112) equals
minus twice the second (“quadratic”) term. Padmanabhan (2010) has termed this property of the Hilbert
Lagrangian “holographic,” and has suggested that it points to profound consequences.


16.12.2 A quick way to derive the quadratic gravitational Lagrangian 


There is a quick way to derive the quadratic gravitational Lagrangian (16.159) that seems like it should not
work, but it does. Suppose, incorrectly, that the Lorentz connections $\Gamma_{mn\lambda}$ formed a coordinate and tetrad
tensor. Then contracting the Riemann tensor would give the Ricci scalar in the form


R=2 Dp (AT mna) Be ay ae) (16.161) 


Discarding the torsion-free covariant divergence recovers the quadratic gravitational Lagrangian (16.159). 
Why does this work? The answer is that, as discussed in §16.12, although $\Gamma_{mn\lambda}$ is not a tetrad tensor, it is a
tetrad tensor with respect to infinitesimal tetrad transformations about the value that satisfies the equations 
of motion. In the Lagrangian formalism, the connections are assumed to satisfy their equations of motion. 
Since least action invokes only infinitesimal variations of the coordinates and tetrad, for the purposes of 
applying least action, the argument e™”*^I mna of the covariant divergence can be treated as a tensor, and 
the covariant divergence thus discarded legitimately. 


16.13 Gravitational action in multivector notation 


The derivation of the gravitational equations of motion from the Hilbert action can be translated into 
multivector language. Translating into multivector language does not make calculations any easier, but, 
by removing some of the blizzard of indices, it makes the structure of the gravitational Lagrangian more 
manifest. The multivector approach followed in this section 16.13 is a stepping stone to the even more 
compact, abstract, and powerful notation of multivector-valued differential forms, dealt with starting from 
§16.14. 


16.13.1 Multivector gravitational Lagrangian 


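In multivector notation, the Hilbert Lagrangian (16.88) is


$$ L_g \equiv \frac{1}{16\pi} \, (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \boldsymbol{R}_{\kappa\lambda}
 = \frac{1}{16\pi} \, (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \left( \frac{\partial \boldsymbol{\Gamma}_\lambda}{\partial x^\kappa} - \frac{\partial \boldsymbol{\Gamma}_\kappa}{\partial x^\lambda} + \frac{1}{2} [\boldsymbol{\Gamma}_\kappa , \boldsymbol{\Gamma}_\lambda] \right) , \tag{16.162} $$


implicitly summed over both indices $\kappa$ and $\lambda$. In equation (16.162), $\boldsymbol{e}^\kappa = e_m{}^\kappa \boldsymbol{\gamma}^m$ are the usual coordinate
(co)tangent vectors, equation (11.6), and the bivectors $\boldsymbol{\Gamma}_\lambda$ and $\boldsymbol{R}_{\kappa\lambda}$ are given by equations (15.20) and
(15.25). The dot in equation (16.162) signifies the multivector dot product, equation (13.35), which here is
a scalar product of bivectors. The order of $\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa$ is flipped to cancel a minus sign from taking a scalar
product of bivectors.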




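Applying the multivector triple-product relation (13.39) to the derivative term in the rightmost expression
of equation (16.162) brings the Hilbert Lagrangian to


$$ L_g = \frac{1}{16\pi} \left( 2 \, \boldsymbol{e}^\kappa \cdot ( \boldsymbol{\partial} \cdot \boldsymbol{\Gamma}_\kappa ) + \frac{1}{2} (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot [\boldsymbol{\Gamma}_\kappa , \boldsymbol{\Gamma}_\lambda] \right) , \tag{16.163} $$


where $\boldsymbol{\partial} \equiv \boldsymbol{e}^\kappa \, \partial / \partial x^\kappa$. The form of the Lagrangian (16.163) indicates that the “velocities” corresponding
to the “coordinates” $\boldsymbol{\Gamma}_\kappa$ are $\boldsymbol{\partial} \cdot \boldsymbol{\Gamma}_\kappa$. The Lagrangian (16.163) is in (super-)Hamiltonian form with bivector
coordinates $\boldsymbol{\Gamma}_\kappa$, vector velocities $\boldsymbol{\partial} \cdot \boldsymbol{\Gamma}_\kappa$, and vector momenta $\boldsymbol{e}^\kappa / 8\pi$,


$$ L_g = \frac{1}{8\pi} \, \boldsymbol{e}^\kappa \cdot ( \boldsymbol{\partial} \cdot \boldsymbol{\Gamma}_\kappa ) - H_g , \tag{16.164} $$


and (super-)Hamiltonian $H_g(\boldsymbol{\Gamma}_\kappa , \boldsymbol{e}^\kappa)$ (compare (16.93))


$$ H_g = - \frac{1}{32\pi} \, (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot [\boldsymbol{\Gamma}_\kappa , \boldsymbol{\Gamma}_\lambda] . \tag{16.165} $$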


Whereas in tensor notation the gravitational coordinates and momenta appeared to be objects of different
types, with different numbers of indices, in multivector notation the coordinates and momenta are all
multivectors, albeit of different grades. In multivector notation, the number of coordinates $\boldsymbol{\Gamma}_\kappa$ and momenta $\boldsymbol{e}^\kappa$
is the same, 4.
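
As a small counting aid (an illustrative sketch; the helper `count_components` is a hypothetical name, not something defined in the text), the four bivectors $\boldsymbol{\Gamma}_\kappa$ carry 4 × 6 = 24 components, while the four vectors $\boldsymbol{e}^\kappa$ carry 4 × 4 = 16. This is the mismatch in index counts that tensor notation displays and that multivector notation packages into four objects of each kind.

```python
from math import comb

N = 4  # number of spacetime dimensions


def count_components(grade, n_fields=N, dim=N):
    """Independent components of n_fields multivectors of the given grade in dim dimensions."""
    return n_fields * comb(dim, grade)


connection_components = count_components(grade=2)  # four bivectors Gamma_kappa
vierbein_components = count_components(grade=1)    # four vectors e^kappa
print(connection_components, vierbein_components)  # 24 16
```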


16.13.2 Variation of the multivector gravitational Lagrangian 


In multivector notation, the fields to be varied in the gravitational Lagrangian are the Lorentz connection
bivectors $\boldsymbol{\Gamma}_\lambda$ and the coordinate vectors $\boldsymbol{e}^\kappa$. In multivector notation, when the fields are varied, it is the
coefficients $\Gamma_{kl\lambda}$ and $e_k{}^\kappa$ that are varied, the tetrad basis vectors $\boldsymbol{\gamma}_k$ being considered fixed. Thus the variation
$\delta\boldsymbol{\Gamma}_\lambda$ of the Lorentz connections is


$$ \delta\boldsymbol{\Gamma}_\lambda = \tfrac{1}{2} (\delta\Gamma_{kl\lambda}) \, \boldsymbol{\gamma}^k \wedge \boldsymbol{\gamma}^l \tag{16.166} $$


(implicitly summed over all indices; the factor of $\tfrac{1}{2}$ would disappear if the sum were over distinct
antisymmetric indices $kl$). The variation $\delta\boldsymbol{e}^\kappa$ of the coordinate vectors is


$$ \delta\boldsymbol{e}^\kappa = (\delta e_k{}^\kappa) \, \boldsymbol{\gamma}^k . \tag{16.167} $$


As remarked in §16.8, when the vierbein are varied, the variation of the determinant $e$ of the vierbein that
goes into the scalar volume element $d^4x \equiv e \, d^4x^{0123}$ must be taken into account. The variation of the vierbein
determinant is related to the variation $\delta\boldsymbol{e}^\kappa$ of the coordinate vectors by


$$ \delta \ln e = - e^k{}_\kappa \, \delta e_k{}^\kappa = - \boldsymbol{e}_\kappa \cdot \delta\boldsymbol{e}^\kappa . \tag{16.168} $$


The variation of the gravitational action with the multivector Lagrangian (16.162) is


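$$ \delta S_g = \frac{1}{8\pi} \int \left[ (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \left( \frac{\partial \delta\boldsymbol{\Gamma}_\lambda}{\partial x^\kappa} + \frac{1}{4} \delta [\boldsymbol{\Gamma}_\kappa , \boldsymbol{\Gamma}_\lambda] \right) + \frac{1}{2} e^{-1} \delta ( e \, \boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa ) \cdot \boldsymbol{R}_{\kappa\lambda} \right] d^4x . \tag{16.169} $$
The first term in the integrand of equation (16.169) integrates by parts to


$$ (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \frac{\partial \delta\boldsymbol{\Gamma}_\lambda}{\partial x^\kappa}
 = \mathring{D}_\kappa \bigl( (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \delta\boldsymbol{\Gamma}_\lambda \bigr)
 - e^{-1} \frac{\partial ( e \, \boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa )}{\partial x^\kappa} \cdot \delta\boldsymbol{\Gamma}_\lambda . \tag{16.170} $$
The second term in the integrand of equation (16.169) is
$$ \tfrac{1}{4} (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \delta [\boldsymbol{\Gamma}_\kappa , \boldsymbol{\Gamma}_\lambda]
 = \tfrac{1}{2} (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot [\boldsymbol{\Gamma}_\kappa , \delta\boldsymbol{\Gamma}_\lambda]
 = \tfrac{1}{2} [\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa , \boldsymbol{\Gamma}_\kappa] \cdot \delta\boldsymbol{\Gamma}_\lambda , \tag{16.171} $$


the last step of which follows from the multivector triple-product relation (13.39) and the fact that (half)
the commutator of two bivectors is the bivector part of their geometric product. The third term in the
integrand of equation (16.169) is
$$ \begin{aligned}
\boldsymbol{R}_{\kappa\lambda} \cdot e^{-1} \delta ( e \, \boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa )
 &= 2 \, \boldsymbol{R}_{\kappa\lambda} \cdot ( \boldsymbol{e}^\lambda \wedge \delta\boldsymbol{e}^\kappa ) - \bigl( \boldsymbol{R}_{\kappa\lambda} \cdot ( \boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa ) \bigr) \, \boldsymbol{e}_\mu \cdot \delta\boldsymbol{e}^\mu \\
 &= ( 2 \, \boldsymbol{R}_{\kappa\lambda} \cdot \boldsymbol{e}^\lambda - R \, \boldsymbol{e}_\kappa ) \cdot \delta\boldsymbol{e}^\kappa \\
 &= 2 \, \boldsymbol{G}_\kappa \cdot \delta\boldsymbol{e}^\kappa ,
\end{aligned} \tag{16.172} $$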


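where the second line again follows from the multivector triple-product relation (13.39), and $\boldsymbol{G}_\kappa$ is the
Einstein vector


$$ \boldsymbol{G}_\kappa \equiv \boldsymbol{R}_{\kappa\lambda} \cdot \boldsymbol{e}^\lambda - \tfrac{1}{2} R \, \boldsymbol{e}_\kappa = \bigl( R_{\kappa m} - \tfrac{1}{2} R \, e_{\kappa m} \bigr) \, \boldsymbol{\gamma}^m . \tag{16.173} $$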


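The manipulations (16.170)–(16.172) bring the variation (16.169) of the action to


$$ \delta S_g = \frac{1}{8\pi} \oint (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \delta\boldsymbol{\Gamma}_\lambda \, d^3x_\kappa
 + \frac{1}{8\pi} \int \left[ \left( - e^{-1} \frac{\partial ( e \, \boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa )}{\partial x^\kappa} + \frac{1}{2} [\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa , \boldsymbol{\Gamma}_\kappa] \right) \cdot \delta\boldsymbol{\Gamma}_\lambda + \boldsymbol{G}_\kappa \cdot \delta\boldsymbol{e}^\kappa \right] d^4x . \tag{16.174} $$
The surface term vanishes provided that $\boldsymbol{\Gamma}_\lambda$ is held fixed on the boundaries of integration. Extremizing the
action (16.174) with respect to the variation $\delta\boldsymbol{e}^\kappa$ of coordinate vectors yields the Einstein equations in vacuo,


$$ \boldsymbol{G}_\kappa = 0 . \tag{16.175} $$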


Extremizing the action (16.174) with respect to the variation $\delta\boldsymbol{\Gamma}_\lambda$ of the Lorentz connections yields the
multivector equivalent of equation (16.106),


e 10(ee* Ae”) 
ox" 
The left hand side of equation (16.176) is 
ed(ee* ne") 
ox" 


The “velocities” of the coordinate vectors are their curls 8 A^ eò, 


= $[e*Ae* Ty] - (16.176) 


= JdONe—e*A(e,:(OAe")) . (16.177) 


Oe* -d 


ONe\=e"A ae nn ee (16.178) 


à a = e”dimn in equation (16.178) are the vierbein 


mn = 


implicitly summed over both indices m and n. The d
derivatives defined by equation (11.33). Equation (16.176) solves to yield the torsion-free relation between
the connections $\boldsymbol{\Gamma}_\lambda$ and the velocities $\boldsymbol{\partial} \wedge \boldsymbol{e}^\lambda$ of the coordinate vectors,


M=P=d\e-e. (e, (OAe")), ane = Page", (e, AT) ; (16.179) 


16.13.3 Alternative multivector gravitational action 


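As in §16.9, since the Lagrangian (16.162) is in Hamiltonian form, the coordinates $\boldsymbol{\Gamma}_\kappa$ and momenta $\boldsymbol{e}^\kappa$ can
be traded without changing the equations of motion. Integrating the Lagrangian (16.162) by parts gives


$$ L_g = \frac{1}{8\pi} \left( e^{-1} \frac{\partial \bigl( e \, (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \boldsymbol{\Gamma}_\lambda \bigr)}{\partial x^\kappa}
 - e^{-1} \frac{\partial ( e \, \boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa )}{\partial x^\kappa} \cdot \boldsymbol{\Gamma}_\lambda
 + \frac{1}{4} (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot [\boldsymbol{\Gamma}_\kappa , \boldsymbol{\Gamma}_\lambda] \right) . \tag{16.180} $$


As in §16.9, the connection $\boldsymbol{\Gamma}_\lambda$ is not a tetrad tensor, but any infinitesimal variation of it is, §16.7.1,
so the variation of the first term on the right hand side of equation (16.180) is a covariant divergence
$\mathring{D}_\kappa \, \delta \bigl( (\boldsymbol{e}^\lambda \wedge \boldsymbol{e}^\kappa) \cdot \boldsymbol{\Gamma}_\lambda \bigr)$, which can be discarded from the Lagrangian without changing the equations of
motion.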

The middle term on the right hand side of equation (16.180) can be written 


e 10(ee* Ae”) 


Ta. TF =m: (8 ^e`), (16.181) 


where $\boldsymbol{\partial} \wedge \boldsymbol{e}^\kappa$ is given by equation (16.178), and $\boldsymbol{\pi}_\kappa$ is the trace-modified Lorentz connection bivector
my =r- e, Ale"? Ty), Mhm- ter Alet- Tma), (16.182) 
with components 
na = E nma YAT" . (16.183) 


The components 77) are as given by equation (16.114). 
Discarding the torsion-free divergence from the Lagrangian (16.180) yields the alternative Lagrangian


$$ L'_g = \frac{1}{8\pi} \, \boldsymbol{\pi}_\kappa \cdot ( \boldsymbol{\partial} \wedge \boldsymbol{e}^\kappa ) - H_g , \tag{16.184} $$
with the same (super-)Hamiltonian (16.165) as before. The alternative Lagrangian (16.184) is in Hamiltonian
form with coordinates $\boldsymbol{e}^\kappa$, velocities $\boldsymbol{\partial} \wedge \boldsymbol{e}^\kappa$, and corresponding canonically conjugate momenta $\boldsymbol{\pi}_\kappa / (8\pi)$. As
with the alternative Lagrangian (16.112) in index notation, the alternative Lagrangian (16.184) in multivector
notation is not a tetrad scalar because the Lorentz connection is not a tetrad tensor, but any infinitesimal
variation of it is a (coordinate and) tetrad tensor, so the alternative Lagrangian (16.184) is satisfactory
despite not being a tetrad scalar.


16.13.4 Einstein equations with matter, in multivector notation


In multivector notation, Einstein’s equations including matter are obtained by including the variation of the
matter action with respect to the variation $\delta\boldsymbol{e}^\kappa$ of the coordinate vectors. The variation defines the matter
energy-momentum vector $\boldsymbol{T}_\kappa$,


$$ \delta S_m \equiv - \int \boldsymbol{T}_\kappa \cdot \delta\boldsymbol{e}^\kappa \, d^4x , \tag{16.185} $$
with 


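The combined variation of the gravitational and matter actions with respect to $\delta\boldsymbol{e}^\kappa$ is
$$ 8\pi ( \delta S_g + \delta S_m ) = \int ( \boldsymbol{G}_\kappa - 8\pi \boldsymbol{T}_\kappa ) \cdot \delta\boldsymbol{e}^\kappa \, d^4x , \tag{16.187} $$


extremization of which yields Einstein’s equations with matter,


$$ \boldsymbol{G}_\kappa = 8\pi \boldsymbol{T}_\kappa . \tag{16.188} $$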


16.13.5 Spin angular-momentum in multivector notation 


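Just as the variation of the matter action with respect to the variation $\delta\boldsymbol{e}^\kappa$ of the coordinate vectors
defines the matter energy-momentum vector $\boldsymbol{T}_\kappa$, so also the variation of the matter action with respect to
the variation $\delta\boldsymbol{\Gamma}_\lambda$ of the Lorentz connection bivectors defines the spin angular-momentum bivector $\boldsymbol{\Sigma}^\lambda$,


$$ \delta S_m = \int \boldsymbol{\Sigma}^\lambda \cdot \delta\boldsymbol{\Gamma}_\lambda \, d^4x , \tag{16.189} $$
with (the minus sign is introduced for the same reason as the minus in equation (15.27))
$$ \boldsymbol{\Sigma}^\lambda \equiv - \tfrac{1}{2} \Sigma^\lambda{}_{mn} \, \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n , \tag{16.190} $$


implicitly summed over both indices $m$ and $n$. As in §16.11, the usual expression (15.46) for the torsion-full
tetrad connections $\boldsymbol{\Gamma}_\lambda$ as a sum of the torsion-free connection $\mathring{\boldsymbol{\Gamma}}_\lambda$ and the contortion $\boldsymbol{K}_\lambda$ is recovered,


$$ \boldsymbol{\Gamma}_\lambda = \mathring{\boldsymbol{\Gamma}}_\lambda + \boldsymbol{K}_\lambda , \tag{16.191} $$
provided that the torsion bivector $\boldsymbol{S}^\lambda \equiv - \tfrac{1}{2} S^\lambda{}_{mn} \, \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n$ is related to the spin angular-momentum bivector
$\boldsymbol{\Sigma}^\lambda$ by

$$ \boldsymbol{S}^\lambda = 8\pi \bigl( \boldsymbol{\Sigma}^\lambda - \tfrac{1}{2} \boldsymbol{e}^\lambda \wedge ( \boldsymbol{e}_\kappa \cdot \boldsymbol{\Sigma}^\kappa ) \bigr) , \qquad
 \boldsymbol{S}^\lambda - \boldsymbol{e}^\lambda \wedge ( \boldsymbol{e}_\kappa \cdot \boldsymbol{S}^\kappa ) = 8\pi \boldsymbol{\Sigma}^\lambda . \tag{16.192} $$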
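
The two relations in (16.192), $\boldsymbol{S}^\lambda = 8\pi(\boldsymbol{\Sigma}^\lambda - \tfrac12 \boldsymbol{e}^\lambda \wedge (\boldsymbol{e}_\kappa \cdot \boldsymbol{\Sigma}^\kappa))$ and $\boldsymbol{S}^\lambda - \boldsymbol{e}^\lambda \wedge (\boldsymbol{e}_\kappa \cdot \boldsymbol{S}^\kappa) = 8\pi \boldsymbol{\Sigma}^\lambda$, are inverses of each other. A short check, offered as a sketch: it assumes 4 spacetime dimensions, so that $\boldsymbol{e}_\lambda \cdot \boldsymbol{e}^\lambda = 4$, and introduces the shorthand $\boldsymbol{v} \equiv \boldsymbol{e}_\kappa \cdot \boldsymbol{\Sigma}^\kappa$ (a vector), which is notation adopted here only for the check.

```latex
% Contraction of a vector into a wedge product, in 4 dimensions:
\boldsymbol{e}_\lambda \cdot ( \boldsymbol{e}^\lambda \wedge \boldsymbol{v} )
  = ( \boldsymbol{e}_\lambda \cdot \boldsymbol{e}^\lambda ) \, \boldsymbol{v}
    - \boldsymbol{e}^\lambda ( \boldsymbol{e}_\lambda \cdot \boldsymbol{v} )
  = 4 \boldsymbol{v} - \boldsymbol{v} = 3 \boldsymbol{v} .

% Trace of the first relation:
\boldsymbol{e}_\lambda \cdot \boldsymbol{S}^\lambda
  = 8\pi \bigl( \boldsymbol{v} - \tfrac{3}{2} \boldsymbol{v} \bigr)
  = - 4\pi \, \boldsymbol{v} .

% Substituting back recovers the second relation:
\boldsymbol{S}^\lambda - \boldsymbol{e}^\lambda \wedge ( \boldsymbol{e}_\kappa \cdot \boldsymbol{S}^\kappa )
  = 8\pi \boldsymbol{\Sigma}^\lambda
    - 4\pi \, \boldsymbol{e}^\lambda \wedge \boldsymbol{v}
    + 4\pi \, \boldsymbol{e}^\lambda \wedge \boldsymbol{v}
  = 8\pi \boldsymbol{\Sigma}^\lambda .
```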
16.14 Gravitational action in multivector forms notation


Especially in the mathematical literature, actions are often written in the even more compact notation of 
differential forms. The reward, if you can get over the language barrier, is a succinct picture of the structure 
of the gravitational action and equations of motion. For example, forms notation facilitates the intricate 
problem of executing a satisfactory 3+1 split of the gravitational equations, §16.15. If you aspire to a deeper 
understanding of numerical relativity or of quantum gravity, you would do well to understand forms. 

As seen in §16.7, the Hilbert action is most insightful when the local Lorentz symmetry of general relativity,
encoded in the tetrad $\boldsymbol{\gamma}_m$, is kept distinct from the symmetry with respect to coordinate transformations,
encoded in the tangent vectors $\boldsymbol{e}_\mu$. The distinction can be retained in forms language by considering
multivector-valued forms. Local Lorentz transformations transform the multivectors while keeping the forms
unchanged, while coordinate transformations transform the forms while keeping the multivectors unchanged.

To avoid conflict between multivector and form notations, it is convenient to reserve the wedge sign $\wedge$ to
signify a wedge product of multivectors, not of forms. No ambiguity results from omitting the wedge sign for
forms, since there is only one way to multiply forms, the exterior product². Similarly, it is convenient to reserve
the Hodge duality symbol $*$ to signify the dual of a form, equation (15.79), not the dual of a multivector, and
to write $I\boldsymbol{a}$ for the Hodge dual of a multivector $\boldsymbol{a}$, equation (13.24). The form dual of a $p$-form $\boldsymbol{a} = \boldsymbol{a}_\Lambda \, d^p x^\Lambda$
with multivector coefficients $\boldsymbol{a}_\Lambda$ is the multivector $q$-form ${}^*\boldsymbol{a}$ given by (this is equation (15.79) generalized
to allow multivector coefficients)


*a = (*a)n d'a" = (—)P!aq “dia^ = ena a^ d'a" , (16.193) 


implicitly summed over distinct sequences A and II of respectively p and q = N — p (in N dimensional 
spacetime) coordinate indices. The dual (16.193) is a form dual, not a multivector dual. If a is a multivector 
of grade n (not necessarily equal to p or q), the dual form *a remains a multivector of the same grade n. 
The double dual of a multivector form a, both a multivector dual and a form dual, crops up often enough 
to merit its own notation, a double-asterisk overscript **, 


ie ie (16.194) 


In this section 16.14 and in the remainder of this Chapter, unless otherwise stated, implicit sums are over 
distinct antisymmetric sequences of indices, since this removes the ubiquitous factorial factors that otherwise 
appear. For example, the wedge product of two multivectors a and b is 


$$ \boldsymbol{a} \wedge \boldsymbol{b} = ( a_k \boldsymbol{\gamma}^k ) \wedge ( b_l \boldsymbol{\gamma}^l ) = 2 \, a_{[k} b_{l]} \, \boldsymbol{\gamma}^k \wedge \boldsymbol{\gamma}^l = 2 \, a_k b_l \, \boldsymbol{\gamma}^k \wedge \boldsymbol{\gamma}^l , \tag{16.195} $$


implicitly summed over distinct antisymmetric pairs of indices $kl$. In any expression for a multivector form
in components, it can be helpful to think of the multivector and form indices as each carrying an implicit
antisymmetrization symbol $[...]$, as in the example $a_k b_l$ of equation (16.195). The antisymmetrization symbol
will usually not be made explicit, both for brevity and to avoid a certain awkwardness of notation.
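
The factor of 2 in equation (16.195) can be checked by brute force. The sketch below is illustrative only (the function names are hypothetical): it computes the coefficients of $\boldsymbol{\gamma}^k \wedge \boldsymbol{\gamma}^l$ once by summing $a_k b_l$ over all ordered pairs, and once as $2 a_{[k} b_{l]} = a_k b_l - a_l b_k$ over distinct pairs $k < l$, and compares the two.

```python
from itertools import combinations


def wedge_all_pairs(a, b):
    """Coefficients of gamma^k ^ gamma^l obtained by summing a_k b_l over ALL ordered pairs."""
    n = len(a)
    coeff = {pair: 0 for pair in combinations(range(n), 2)}
    for k in range(n):
        for l in range(n):
            if k == l:
                continue  # gamma^k ^ gamma^k = 0
            if k < l:
                coeff[(k, l)] += a[k] * b[l]
            else:
                # gamma^k ^ gamma^l = - gamma^l ^ gamma^k
                coeff[(l, k)] -= a[k] * b[l]
    return coeff


def wedge_distinct_pairs(a, b):
    """Same coefficients written as 2 a_[k b_l] = a_k b_l - a_l b_k for each distinct pair k < l."""
    return {
        (k, l): a[k] * b[l] - a[l] * b[k]
        for k, l in combinations(range(len(a)), 2)
    }


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
same = wedge_all_pairs(a, b) == wedge_distinct_pairs(a, b)
```

Summing over distinct pairs with the antisymmetrized coefficient reproduces the full double sum, which is why the factorial factors drop out under the convention adopted in this section.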

2 This is not true. The entire apparatus of multivectors can be translated into forms language. However, I take the point of 
view that, since multivectors are easier to manipulate than forms, there is not much to be gained from such a translation. 


The only occasion I find that necessitates introducing a dot product of forms is in deriving the law of conservation of
energy-momentum in multivector forms language, equation (16.299).

It is convenient to adopt the convention that the commutator of a multivector $p$-form $\boldsymbol{a}$ with a multivector
$q$-form $\boldsymbol{b}$ is commuting if $p$ and $q$ are both odd, anticommuting otherwise,


$$ [\boldsymbol{a} , \boldsymbol{b}] = \begin{cases} [\boldsymbol{b} , \boldsymbol{a}] & p \text{ and } q \text{ odd} , \\ - [\boldsymbol{b} , \boldsymbol{a}] & \text{otherwise} . \end{cases} \tag{16.196} $$
The advantage of this convention is that the contribution of the Lorentz connection to the covariant derivative
of any multivector form $\boldsymbol{a}$ is always the commutator $\tfrac{1}{2} [\boldsymbol{\Gamma} , \boldsymbol{a}]$. For example, the expression (15.26) for the
Riemann tensor in terms of the commutator of the covariant derivative carries through to the language of
multivector-valued forms, equation (16.208). The anticommutation of the multivectors is deemed to cancel
the anticommutation of forms when $p$ and $q$ are both odd. For example, if $\boldsymbol{a}$ and $\boldsymbol{b}$ are two 1-forms, then
their commutator is

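$$ [\boldsymbol{a} , \boldsymbol{b}] = [\boldsymbol{a}_\kappa , \boldsymbol{b}_\lambda] \, d^2x^{\kappa\lambda} = ( \boldsymbol{a}_\kappa \boldsymbol{b}_\lambda - \boldsymbol{b}_\lambda \boldsymbol{a}_\kappa ) \, d^2x^{\kappa\lambda} = ( \boldsymbol{a}_\kappa \boldsymbol{b}_\lambda + \boldsymbol{b}_\kappa \boldsymbol{a}_\lambda ) \, d^2x^{\kappa\lambda} = [\boldsymbol{b} , \boldsymbol{a}] , \tag{16.197} $$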
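implicitly summed over distinct antisymmetric indices $\kappa\lambda$. As a corollary, the (anti-)commutator of a $p$-form
$\boldsymbol{a}$ with itself vanishes if $p$ is (odd) even,

$$ \{ \boldsymbol{a} , \boldsymbol{a} \} = 0 \quad p \text{ odd} , \tag{16.198a} $$
$$ [ \boldsymbol{a} , \boldsymbol{a} ] = 0 \quad p \text{ even} . \tag{16.198b} $$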


Exercise 16.8. Commutation of multivector forms. 
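1. Argue that if $\boldsymbol{a} = a_{K\Lambda} \, \boldsymbol{\gamma}^K d^p x^\Lambda$ is a multivector form of grade $k$ and form index $p$, and $\boldsymbol{b} = b_{L\Pi} \, \boldsymbol{\gamma}^L d^q x^\Pi$
is a multivector form of grade $l$ and form index $q$, then the grade $k + l - 2n$ component of their product
$\boldsymbol{a}\boldsymbol{b}$ commutes or anticommutes as


$$ ( \boldsymbol{a} \boldsymbol{b} )_{k+l-2n} = (-)^{kl - n + pq} \, ( \boldsymbol{b} \boldsymbol{a} )_{k+l-2n} . \tag{16.199} $$

As particular cases of equation (16.199), conclude that
$$ \boldsymbol{a} \cdot \boldsymbol{a} = 0 \quad p \text{ odd} , \tag{16.200a} $$
$$ \boldsymbol{a} \wedge \boldsymbol{a} = 0 \quad k + p \text{ odd} . \tag{16.200b} $$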


2. What is the form index of the product ab of multivector forms a and b of form index p and q? 


Solution. 
1. This is a combination of equations (13.28) and (15.61). 
2. A product of forms is always their exterior product, so the form index of the product ab is p+ q. 


16.14.1 Interval, connection 


In multivector notation, the gravitational coordinates and momenta proved to be $\boldsymbol{\Gamma}_\kappa$ and $\boldsymbol{e}^\kappa$ (or vice versa).
In forms notation, the corresponding coordinates and momenta are the Lorentz connection bivector 1-form
$\boldsymbol{\Gamma}$ and the line interval vector 1-form $\boldsymbol{e}$ defined by


$$ \boldsymbol{\Gamma} \equiv \boldsymbol{\Gamma}_\kappa \, dx^\kappa = \Gamma_{kl\kappa} \, \boldsymbol{\gamma}^k \wedge \boldsymbol{\gamma}^l \, dx^\kappa , \tag{16.201a} $$


$$ \boldsymbol{e} \equiv \boldsymbol{e}_\kappa \, dx^\kappa = e_{k\kappa} \, \boldsymbol{\gamma}^k \, dx^\kappa , \tag{16.201b} $$


with, for $\boldsymbol{\Gamma}$, implicit summation over distinct antisymmetric sets of indices $kl$. The Lorentz connection
1-form $\boldsymbol{\Gamma}$ and coordinate interval 1-form $\boldsymbol{e}$ are abstract coordinate and tetrad gauge-invariant objects, whose
components in any coordinate and tetrad frame constitute the Lorentz connection $\Gamma_{kl\kappa}$ and the vierbein $e_{k\kappa}$
in the mixed coordinate-tetrad basis.

The line interval $\boldsymbol{e}$ is essentially the same as the object $d\boldsymbol{x}$ first introduced in this book in equation (2.19).
I contemplated using the symbol $d\boldsymbol{x}$ in place of $\boldsymbol{e}$ everywhere in this Chapter, to emphasize that using forms
language does not require switching to a whole new set of symbols. But $\boldsymbol{e}$ is the symbol for the line-interval
form conventionally used in the literature; and the symbol $d\boldsymbol{x}$ risks being misinterpreted as a composition of
$d$ and $\boldsymbol{x}$ (for example, an exterior derivative of $\boldsymbol{x}$), as opposed to the single holistic object $d\boldsymbol{x}$ that it really
is. Moreover, if the dot product $\boldsymbol{e} \cdot \boldsymbol{e}$ is defined (as here) to be a form, then that dot product is not the same
as the scalar spacetime interval squared $ds^2 = d\boldsymbol{x} \cdot d\boldsymbol{x}$, equation (2.25) (see Concept Question 16.9).

It is convenient to use the symbol $\boldsymbol{e}^p$ to denote the normalized $p$-volume element,


$$ \boldsymbol{e}^p \equiv \frac{1}{p!} \, \underbrace{\boldsymbol{e} \wedge \cdots \wedge \boldsymbol{e}}_{p \text{ terms}} , \tag{16.202} $$


which is both a $p$-form and a grade-$p$ multivector. The factor of $1/p!$ compensates for the multiple counting
of distinct indices, and ensures that $\boldsymbol{e}^p$ correctly measures the $p$-volume element.
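
The multiple counting that the factor $1/p!$ removes can be made explicit with a short enumeration. This is an illustrative sketch only (the function name is hypothetical): expanding a product of $p$ copies of a 1-form produces every ordered sequence of $p$ distinct indices, and each distinct set of indices appears in $p!$ orderings.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial


def orderings_per_index_set(p, dim=4):
    """Count how many ordered sequences of p distinct indices share each sorted index set."""
    return Counter(tuple(sorted(seq)) for seq in permutations(range(dim), p))


counts = orderings_per_index_set(p=3)
distinct_sets = len(counts)                 # number of independent 3-volume components
multiplicities = set(counts.values())       # every set is counted the same number of times
```

For $p = 3$ in 4 dimensions there are `comb(4, 3)` = 4 distinct index sets, each counted `factorial(3)` = 6 times, so dividing by $p!$ leaves each independent component counted once.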


Concept question 16.9. Scalar product of the interval form e. In Chapter 2, the scalar product of
the line interval with itself defined its scalar length squared, $ds^2 = d\boldsymbol{x} \cdot d\boldsymbol{x} = g_{\mu\nu} \, dx^\mu dx^\nu$, equation (2.25).
Is this still true in multivector forms language? Answer. No. A differential $p$-form represents physically a
$p$-volume element, and as such is always a sum of antisymmetrized products of $p$ intervals. The scalar product
of the interval 1-form $\boldsymbol{e}$ with itself is


$$ \boldsymbol{e} \cdot \boldsymbol{e} = 2 \, g_{\mu\nu} \, d^2x^{\mu\nu} = 0 \tag{16.203} $$

(implicitly summed over distinct antisymmetric sequences $\mu\nu$, hence the factor 2). The scalar product
vanishes because of the symmetry of the metric $g_{\mu\nu}$ and the antisymmetry of the area element $d^2x^{\mu\nu}$.

A different version of a dot product of forms (not much used in this book) can be defined in precise analogy
to a dot product of multivectors to yield a form of smaller form index, equation (16.284). This form dot
product of the interval 1-form $\boldsymbol{e}$ with itself yields the 0-form


$$ \boldsymbol{e} \cdot \boldsymbol{e} = e_m{}^\kappa e_{n\kappa} \, \boldsymbol{\gamma}^m \boldsymbol{\gamma}^n = \eta_{mn} \, \boldsymbol{\gamma}^m \boldsymbol{\gamma}^n = 4 , \tag{16.204} $$


which again differs from the scalar product $ds^2 = d\boldsymbol{x} \cdot d\boldsymbol{x}$.
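
The vanishing of the scalar product (16.203) is the general statement that a symmetric array contracted with an antisymmetric one gives zero. A minimal numerical illustration follows; the integer entries are arbitrary values chosen for this sketch and have no physical meaning.

```python
def contract(sym, antisym):
    """Full contraction sym_{mu nu} antisym^{mu nu} over all index pairs."""
    dim = len(sym)
    return sum(sym[mu][nu] * antisym[mu][nu] for mu in range(dim) for nu in range(dim))


# A symmetric stand-in for the metric g_{mu nu}.
g = [
    [-3, 1, 4, 1],
    [1, 5, 9, 2],
    [4, 9, 6, 5],
    [1, 2, 5, 3],
]

# An antisymmetric stand-in for the area element d^2 x^{mu nu}.
area = [
    [0, 2, -7, 1],
    [-2, 0, 8, 2],
    [7, -8, 0, -8],
    [-1, -2, 8, 0],
]

print(contract(g, area))  # 0
```

Each off-diagonal pair contributes $g_{\mu\nu} A^{\mu\nu} + g_{\nu\mu} A^{\nu\mu} = 0$, and the diagonal of the antisymmetric array is zero, so the contraction vanishes term by term.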
16.14.2 Curvature and torsion forms
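The Riemann bivector 2-form $\boldsymbol{R}$ is defined by
$$ \boldsymbol{R} \equiv \boldsymbol{R}_{\kappa\lambda} \, d^2x^{\kappa\lambda} = R_{\kappa\lambda mn} \, \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n \, d^2x^{\kappa\lambda} , \tag{16.205} $$


again implicitly summed over distinct antisymmetric indices $mn$ and $\kappa\lambda$. The exterior derivative of the
Lorentz connection 1-form is, equation (15.70), the 2-form


$$ d\boldsymbol{\Gamma} = \left( \frac{\partial \boldsymbol{\Gamma}_\lambda}{\partial x^\kappa} - \frac{\partial \boldsymbol{\Gamma}_\kappa}{\partial x^\lambda} \right) d^2x^{\kappa\lambda}
 = \left( \frac{\partial \Gamma_{mn\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma_{mn\kappa}}{\partial x^\lambda} \right) \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n \, d^2x^{\kappa\lambda} , \tag{16.206} $$


implicitly summed over distinct antisymmetric indices $mn$ and $\kappa\lambda$. The commutator $\tfrac{1}{4} [\boldsymbol{\Gamma} , \boldsymbol{\Gamma}]$ of the 1-form $\boldsymbol{\Gamma}$
with itself is the bivector 2-form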


LEE Hsia bale = (I2 lone — oa YAY ee (16.207) 


again implicitly summed over distinct antisymmetric indices $mn$ and $\kappa\lambda$. The commutator $[\boldsymbol{\Gamma} , \boldsymbol{\Gamma}]$ of the
bivector 1-form $\boldsymbol{\Gamma}$ is symmetric, the anticommutation of multivectors cancelling against the anticommutation
of 1-forms, equation (16.196). Equations (16.206) and (16.207) imply that the Riemann 2-form $\boldsymbol{R}$ is related
to the Lorentz connection 1-form $\boldsymbol{\Gamma}$ by


$$ \boldsymbol{R} \equiv d\boldsymbol{\Gamma} + \tfrac{1}{4} [\boldsymbol{\Gamma} , \boldsymbol{\Gamma}] . \tag{16.208} $$


Equation (16.208) is Cartan’s second equation of structure. It constitutes the definition of Riemann
curvature $\boldsymbol{R}$ in terms of the Lorentz connection $\boldsymbol{\Gamma}$.
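The torsion vector 2-form S is defined by (the minus sign ensures that Cartan’s equation (16.212) takes
conventional form, given the definition (11.48) of the components $S^m{}_{\kappa\lambda}$)

$$ S \equiv - S_{m\kappa\lambda}\, \gamma^m\, d^2x^{\kappa\lambda} , \tag{16.209} $$

implicitly summed over distinct antisymmetric indices $\kappa\lambda$. The exterior derivative of the line interval 1-form
e is the 2-form

$$ de = \left( \frac{\partial e_\lambda}{\partial x^\kappa} - \frac{\partial e_\kappa}{\partial x^\lambda} \right) d^2x^{\kappa\lambda} = \left( \frac{\partial e_{m\lambda}}{\partial x^\kappa} - \frac{\partial e_{m\kappa}}{\partial x^\lambda} \right) \gamma^m\, d^2x^{\kappa\lambda} = \left( d_{m\kappa\lambda} - d_{m\lambda\kappa} \right) \gamma^m\, d^2x^{\kappa\lambda} , \tag{16.210} $$

again implicitly summed over distinct antisymmetric indices $\kappa\lambda$. The $d_{m\kappa\lambda}$ in the rightmost expression of
equations (16.210) are the vierbein derivatives defined by equation (11.32). The commutator $\tfrac{1}{2}[\Gamma, e]$ of the
1-forms $\Gamma$ and e is the vector 2-form

$$ \tfrac{1}{2}[\Gamma, e] = \tfrac{1}{2} \left( [\Gamma_\kappa, e_\lambda] - [\Gamma_\lambda, e_\kappa] \right) d^2x^{\kappa\lambda} = - \left( \Gamma_{m\kappa\lambda} - \Gamma_{m\lambda\kappa} \right) \gamma^m\, d^2x^{\kappa\lambda} , \tag{16.211} $$

implicitly summed over distinct antisymmetric indices $\kappa\lambda$. The fundamental relation (11.49), or equivalently
(15.29), between the torsion and the vierbein derivatives and Lorentz connections translates in multivector
forms language to, from equations (16.209)-(16.211),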


$$ S \equiv de + \tfrac{1}{2}[\Gamma, e] . \tag{16.212} $$

Equation (16.212) is Cartan’s first equation of structure. Cartan’s equations of structure (16.212)
and (16.208), introduced by Cartan (1904), are not equations of motion; rather, they are compact and 
elegant expressions of the definition (11.58) of torsion and curvature. Equations of motion (16.250) for the 
torsion and curvature are obtained from extremizing the Hilbert action. 


16.14.3 Area and volume forms 


The other factor in the gravitational Lagrangian (16.162) is $e^\kappa \wedge e^\lambda$. The 2-form corresponding to $e^\kappa \wedge e^\lambda$ is
the element of area $e^2$ defined by equation (16.202),

$$ e^2 \equiv \tfrac{1}{2}\, e \wedge e = e_\kappa \wedge e_\lambda\, d^2x^{\kappa\lambda} = \left( e_{m\kappa} e_{n\lambda} - e_{m\lambda} e_{n\kappa} \right) \gamma^m \wedge \gamma^n\, d^2x^{\kappa\lambda} , \tag{16.213} $$

implicitly summed over distinct antisymmetric pairs mn and $\kappa\lambda$ of indices.

The exterior derivative $de^p$ of the p-volume element is

$$ de^p = (-)^{p-1}\, e^{p-1} \wedge de . \tag{16.214} $$

The 1/p! factor in the definition (16.202) of the p-volume element absorbs the factor of p from differentiating
p products of e. The $(-)^{p-1}$ sign comes from commuting de past $e^{p-1}$.

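The form dual of the p-volume $e^p$ is the dual q-volume, ${}^*(e^p) = {}^*e^q$, which in turn equals the pseudoscalar
I times the q-volume, equation (15.80),

$$ {}^*(e^p) = {}^*e^q = I e^q . \tag{16.215} $$

The p-volume and its q-form dual are both multivectors of grade p. For example, the form dual, equation (16.193),
of the area element $e^2$ is the dual area element ${}^*e^2$,

$$ {}^*e^2 = \varepsilon_{\kappa\lambda\mu\nu}\, e^\mu \wedge e^\nu\, d^2x^{\kappa\lambda} = 2\, \varepsilon_{\kappa\lambda\mu\nu}\, e_m{}^\mu e_n{}^\nu\, \gamma^m \wedge \gamma^n\, d^2x^{\kappa\lambda} = 2\, \varepsilon_{klmn}\, e^k{}_\kappa e^l{}_\lambda\, \gamma^m \wedge \gamma^n\, d^2x^{\kappa\lambda} , \tag{16.216} $$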


implicitly summed over distinct antisymmetric indices $\kappa\lambda$, $\mu\nu$, kl, and mn. The exterior derivative $d\,{}^*e^q$ of
the dual q-volume element is

$$ d\,{}^*e^q = (-)^{N-1} I\, de^q = (-)^p I \left( e^{q-1} \wedge de \right) = (-)^p \left( I e^{q-1} \right) \cdot de = (-)^p\, {}^*e^{q-1} \cdot de , \tag{16.217} $$

the third equality following from the duality relation (13.41).


Exercise 16.10. Triple products involving products of the interval form e. Let a be a multivector
form of grade n and any form index.
1. Show that

$$ e \wedge (e \cdot a) = e \cdot (e \wedge a) . \tag{16.218} $$

2. Conclude that if $n \geq q$ then

$$ e^p \wedge (e^q \cdot a) = e^q \cdot (e^p \wedge a) . \tag{16.219} $$

3. Prove that the grade $p + n - q$ part of the multivector form $e^{p+q} a$ is

$$ (e^{p+q} a)_{p+n-q} = e^q \cdot (e^p \wedge a) . \tag{16.220} $$

Equation (16.220) is equivalent to

$$ (e^{p+q} a)_{p+n-q} = \left( e^q\, (e^p a)_{p+n} \right)_{p+n-q} . \tag{16.221} $$

The proof below of equations (16.220) or (16.221) uses the triple-product relation (13.40). The proof
demonstrates along the way the triple-product relation

$$ \left( e^p\, (e^q a)_{q+n-2l} \right)_{p+q+n-2k-2l} = \frac{(k+l)!\, (p+q-k-l)!}{k!\, l!\, (p-k)!\, (q-l)!}\, (e^{p+q} a)_{p+q+n-2k-2l} . \tag{16.222} $$
Solution.
1. Equation (16.218) can be proved by expanding the multivector forms e and a in components. Equation (16.218)
remains true in the special case where a is a scalar (grade 0), in which case $e \cdot a = 0$ by
definition (13.36), and $e \cdot e = 0$, equation (16.203).

2. Equation (16.219) follows from successive application of equation (16.218).


3. Equation (16.221) can be proved by induction. Certainly equation (16.221) holds for p = 0 or q = 0, in
which case the equation becomes an identity. Recall that, in view of the way that volume elements $e^p$
are normalized, equation (16.202),

$$ e^{p+q} = \frac{p!\, q!}{(p+q)!}\, e^p \wedge e^q . \tag{16.223} $$

The triple-product relation (13.40), along with the fact that $e \cdot e = 0$, implies

$$ (e^{p+q} a)_{p+q+n-2m} = \frac{p!\, q!}{(p+q)!} \sum_{k+l=m} \left( e^p\, (e^q a)_{q+n-2l} \right)_{p+q+n-2k-2l} . \tag{16.224} $$


Assume that equation (16.220) is inductively true up to some p and q. The inductive hypothesis (16.220)
implies

$$ (e^q a)_{q+n-2l} = e^l \cdot (e^{q-l} \wedge a) \tag{16.225} $$

subject to the conditions that $q - l$, l, n, and $q + n - 2l$ are all non-negative integers. Inserting the
hypothesis (16.225), and a similar one for $(e^p \ldots)$, into equation (16.224) implies that the summand on
the right hand side of equation (16.224) is

$$ \left( e^p\, (e^q a)_{q+n-2l} \right)_{p+q+n-2k-2l} = e^k \cdot \left( e^{p-k} \wedge \left( e^l \cdot (e^{q-l} \wedge a) \right) \right) . \tag{16.226} $$


The sum in equation (16.224) is over non-negative integers k and l satisfying $k + l = m$, $k \leq p$, and
$l \leq q$. By equation (16.219), the summand (16.226) rearranges as

$$ \begin{aligned} \left( e^p\, (e^q a)_{q+n-2l} \right)_{p+q+n-2k-2l} &= (e^k \wedge e^l) \cdot (e^{p-k} \wedge e^{q-l} \wedge a) \\ &= \frac{(k+l)!\, (p+q-k-l)!}{k!\, l!\, (p-k)!\, (q-l)!}\, e^{k+l} \cdot (e^{p+q-k-l} \wedge a) \\ &= \frac{m!\, (p+q-m)!}{k!\, l!\, (p-k)!\, (q-l)!}\, e^m \cdot (e^{p+q-m} \wedge a) . \end{aligned} \tag{16.227} $$

Equation (16.224) thus reduces to

$$ (e^{p+q} a)_{p+q+n-2m} = e^m \cdot (e^{p+q-m} \wedge a)\, \frac{p!\, q!}{(p+q)!} \sum_{k+l=m,\; k \leq p,\; l \leq q} \frac{m!\, (p+q-m)!}{k!\, l!\, (p-k)!\, (q-l)!} . \tag{16.228} $$


The summed term on the right hand side of equation (16.228) equals (p + q)!/(p! q!), cancelling the
prefactor p! q!/(p + q)!. To prove this, it suffices to restrict to p = 1 or q = 1, with $m \geq 1$ (the result is
trivial for m = 0), and then the general result follows by induction. For p = 1 the sum is over k = 0 and
1, while for q = 1 the sum is over l = 0 and 1. For example, for q = 1,

$$ \sum_{k+l=m} \frac{m!\, (p+1-m)!}{k!\, l!\, (p-k)!\, (1-l)!} = (p + 1 - m) + m = p + 1 . \tag{16.229} $$

Thus, at least for p = 1 or q = 1, equation (16.228) reduces to

$$ (e^{p+q} a)_{p+q+n-2m} = e^m \cdot (e^{p+q-m} \wedge a) , \tag{16.230} $$

reproducing the to-be-proved equation (16.220). The result for p = 1 or q = 1 establishes the desired
result (16.220) inductively for all p and q. Equations (16.227) and (16.230) together imply equation (16.222).
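
The combinatorial claim used in the solution, that the sum in equation (16.228) equals (p + q)!/(p! q!) for every allowed m, can be spot-checked by brute force. The sketch below is only a numerical sanity check, not a replacement for the inductive argument; the function name `weight_sum` is an illustrative choice. Each summand factors as C(m, k) C(p+q-m, p-k), so the identity being tested is the Vandermonde convolution.

```python
from math import comb, factorial

def weight_sum(p, q, m):
    """Sum over k + l = m, 0 <= k <= p, 0 <= l <= q of
    m! (p+q-m)! / (k! l! (p-k)! (q-l)!), the sum appearing in equation (16.228)."""
    total = 0
    for k in range(0, p + 1):
        l = m - k
        if 0 <= l <= q:
            total += (factorial(m) * factorial(p + q - m)
                      // (factorial(k) * factorial(l) * factorial(p - k) * factorial(q - l)))
    return total

# Compare against (p+q)!/(p! q!) = C(p+q, p) for a range of small p, q, m.
checks = [weight_sum(p, q, m) == comb(p + q, p)
          for p in range(6) for q in range(6) for m in range(p + q + 1)]
print(all(checks))  # prints True
```

The integer division is exact because each summand is a product of two binomial coefficients.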


16.14.4 Gravitational Lagrangian 4-form 


Recall that the scalar volume element $d^4x$ that goes into the action is really the dual scalar 4-volume ${}^*d^4x$,
equation (15.80). To convert to forms language, the Hodge dual must be transferred from the volume element
to the integrand. In multivector language, the required result is equation (16.55), invoked previously
to convert the electromagnetic Lagrangian to forms language. Translated back into forms language,
equation (16.55) says that a “scalar product” of 2-forms a and b over a dual scalar volume element is the 4-form
equal to the exterior product of the dual form ${}^*a$ with the form b.

The gravitational action thus becomes

$$ S_g = \int L_g , \tag{16.231} $$

where $L_g$ is the gravitational Lagrangian scalar 4-form corresponding to the Lagrangian (16.162),


$$ L_g \equiv - \frac{1}{8\pi}\, {}^*e^2 \cdot R = - \frac{1}{8\pi}\, {}^*e^2 \cdot \left( d\Gamma + \tfrac{1}{4}[\Gamma, \Gamma] \right) . \tag{16.232} $$

The dot in ${}^*e^2 \cdot R$ signifies a scalar product of the bivectors ${}^*e^2$ and R. The minus sign comes from taking
a scalar product of bivectors. The product ${}^*e^2 \cdot R$ is an exterior product of two 2-forms, hence a 4-form. As
remarked at the beginning of this section 16.14, the wedge sign for the exterior product of forms is suppressed
because it conflicts with the wedge sign for a multivector product, and because it is unnecessary, there being
only one way to multiply forms. The Lagrangian 4-form (16.232) is in Hamiltonian form $L_g = p \cdot dq - H_g$
with coordinates $q = \Gamma$ and momenta $p = -{}^*e^2/(8\pi)$, and (super-)Hamiltonian 4-form

$$ H_g = \frac{1}{32\pi}\, {}^*e^2 \cdot [\Gamma, \Gamma] . \tag{16.233} $$


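The Lagrangian scalar 4-form (16.232) can be written elegantly, from the expression (16.215) for the dual
volume ${}^*e^2$ and the duality relation (13.41),

$$ L_g = - \frac{I}{8\pi}\, e^2 \wedge R = - \frac{I}{8\pi}\, e^2 \wedge \left( d\Gamma + \tfrac{1}{4}[\Gamma, \Gamma] \right) . \tag{16.234} $$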


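Expanded in components, the gravitational Lagrangian 4-form (16.232) or (16.234) is

$$ L_g = - \frac{1}{8\pi}\, \varepsilon_{\kappa\lambda\mu\nu}\, (e^\mu \wedge e^\nu) \cdot R_{\pi\rho}\, d^4x^{\kappa\lambda\pi\rho} = - \frac{I}{8\pi}\, e_\kappa \wedge e_\lambda \wedge R_{\pi\rho}\, d^4x^{\kappa\lambda\pi\rho} , \tag{16.235} $$

implicitly summed over distinct antisymmetric indices $\kappa\lambda$, $\mu\nu$, and $\pi\rho$. The Lagrangian 4-form (16.234)
is in Hamiltonian form $L_g = I (p \wedge dq) - H_g$ with coordinates $q = \Gamma$ and momenta $p = -e^2/(8\pi)$, and
(super-)Hamiltonian scalar 4-form

$$ H_g = \frac{I}{32\pi}\, e^2 \wedge [\Gamma, \Gamma] . \tag{16.236} $$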


16.14.5 Variation of the gravitational action in multivector forms notation 


Equations of motion for the gravitational field are obtained by varying the action with respect to the Lorentz
connection $\Gamma$ and the line-element e. In forms notation, when the fields are varied, it is the coefficients $\Gamma_{kl\kappa}$
and $e_{k\kappa}$ that are varied, the tetrad $\gamma^k$ and the line interval $dx^\kappa$ being considered fixed. Thus the variation
$\delta\Gamma$ of the Lorentz connection is

$$ \delta\Gamma = (\delta\Gamma_\kappa)\, dx^\kappa = (\delta\Gamma_{kl\kappa})\, \gamma^k \wedge \gamma^l\, dx^\kappa , \tag{16.237} $$

implicitly summed over distinct antisymmetric indices kl, and the variation $\delta e$ of the line interval is

$$ \delta e = (\delta e_\kappa)\, dx^\kappa = (\delta e_{k\kappa})\, \gamma^k\, dx^\kappa . \tag{16.238} $$

The variation $\delta e^p$ of the p-volume element defined by equation (16.202) is

$$ \delta e^p = e^{p-1} \wedge \delta e . \tag{16.239} $$
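The variation $\delta\,{}^*e^q$ of the dual q-volume element is

$$ \delta\,{}^*e^q = I\, \delta e^q = I \left( e^{q-1} \wedge \delta e \right) = \left( I e^{q-1} \right) \cdot \delta e = {}^*e^{q-1} \cdot \delta e , \tag{16.240} $$

the third equality following from the duality relation (13.41).

The variation of the action with gravitational Lagrangian (16.234) with respect to the fields $\Gamma$ and e is

$$ \delta S_g = - \frac{I}{8\pi} \int e^2 \wedge d(\delta\Gamma) + \tfrac{1}{4}\, e^2 \wedge \delta[\Gamma, \Gamma] + \delta(e^2) \wedge R . \tag{16.241} $$

The first term integrates by parts to

$$ e^2 \wedge d(\delta\Gamma) = d(e^2 \wedge \delta\Gamma) - d(e^2) \wedge \delta\Gamma . \tag{16.242} $$

The second term in the integrand of (16.241) is

$$ \tfrac{1}{4}\, e^2 \wedge \delta[\Gamma, \Gamma] = \tfrac{1}{2}\, e^2 \wedge [\Gamma, \delta\Gamma] = \tfrac{1}{2}\, [e^2, \Gamma] \wedge \delta\Gamma = - \tfrac{1}{2}\, [\Gamma, e^2] \wedge \delta\Gamma , \tag{16.243} $$

the second step of which applies the multivector triple-product relation (13.39). The coefficients of the $\wedge\, \delta\Gamma$
terms in equations (16.242) and (16.243) combine to

$$ - d(e^2) - \tfrac{1}{2}[\Gamma, e^2] = e \wedge S , \tag{16.244} $$

the torsion S being defined by equation (16.212). To switch between commutators $[\Gamma, e]$ and commutators
$[\Gamma, e^2]$, use the result (16.218) along with the fact that $e \cdot a = \tfrac{1}{2}[e, a]$ for any bivector form a. The third
term in the integrand of (16.241) is

$$ \delta(e^2) \wedge R = \delta e \wedge e \wedge R = \delta e \wedge \overset{**}{G} , \tag{16.245} $$

where $\overset{**}{G} \equiv e \wedge R$ is the double dual, equation (16.194), of the Einstein vector 1-form $G \equiv G_{k\kappa}\, \gamma^k dx^\kappa$,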


G=I*(e\ R) 
= a aad (N R pumn >" da” 
= —3! elk nen ue y) R” mn Yk dz" 
= (Rik — Repr) y" dx” , (16.246) 
implicitly summed over distinct antisymmetric sequences mn and $\mu\nu$, and over all k, l, $\kappa$, and $\lambda$. The factor of
3! on the third line of equations (16.246) is the number of permutations of the indices of a 3-form. Combining
equations (16.242)-(16.245) brings the variation (16.241) of the gravitational action to

$$ \delta S_g = - \frac{I}{8\pi} \oint e^2 \wedge \delta\Gamma - \frac{I}{8\pi} \int (e \wedge S) \wedge \delta\Gamma + \delta e \wedge e \wedge R . \tag{16.247} $$


The variation of the matter action $S_m$ with respect to $\delta\Gamma$ and $\delta e$ defines the spin angular-momentum $\Sigma$
(compare equation (16.121)), and the matter energy-momentum T (compare equation (16.117)),


isa =- | Eor+seT=1 | ŽA + ien, (16.248) 


where $\overset{**}{\Sigma}$ and $\overset{**}{T}$ are the double duals, equation (16.194), of the spin angular-momentum $\Sigma$ and
energy-momentum T of the matter. The components of the spin angular-momentum bivector 1-form $\Sigma$ and the
energy-momentum vector 1-form T are (the minus sign in the definition of $\Sigma$ conforms to the convention for
the definition of torsion S, equation (16.209))
E = 5, dr" = —Epm Y! AY” dz" , (16.249a) 
T = T; dz" = Tem Y” dx" , (16.249b) 
with, for $\Sigma$, implicit summation over distinct antisymmetric sets of indices kl. Extremizing the combined
gravitational and matter actions with respect to $\delta\Gamma$ and $\delta e$ yields the torsion and Einstein equations of
motion in the form

$$ e \wedge S = 8\pi\, \overset{**}{\Sigma} , \tag{16.250a} $$

$$ e \wedge R = 8\pi\, \overset{**}{T} . \tag{16.250b} $$

The torsion equation of motion (16.250a) is a bivector 3-form with 6 × 4 = 24 components, while the
Einstein equation of motion (16.250b) is a pseudovector 3-form with 4 × 4 = 16 components. The Einstein
equation (16.250b) is equivalent to the traditional expression

$$ G = 8\pi T . \tag{16.251} $$

The contracted Bianchi identities (16.406) enforce conservation laws for the total spin angular-momentum
$\Sigma$ and total matter energy-momentum T, §16.14.8 and §16.14.9.

Notice that if the area element $e^2$ had been taken to be the momentum conjugate to $\Gamma$, rather than the
line element e, all the components of the area element $e^2$ being considered to be independent degrees of
freedom, then the variation (16.241) of the gravitational action with respect to $e^2$ would have yielded an
equation of motion for the Riemann tensor R rather than for the Einstein tensor G, and the theory would
not be general relativity. To recover general relativity, it is necessary to treat the area element as a wedge
product $e^2 = \tfrac{1}{2}\, e \wedge e$ of the line interval e.
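
The component counts quoted in this subsection (24 for the torsion equation, 16 for the Einstein equation) follow from multiplying the number of independent multivector components of a given grade by the number of independent form components of a given form index. A small sketch of that bookkeeping, assuming N = 4 spacetime dimensions and using a hypothetical helper `n_components`:

```python
from math import comb

def n_components(grade, form_index, dim=4):
    """Independent components of a multivector form in dim dimensions:
    C(dim, grade) multivector components times C(dim, form_index) form components."""
    return comb(dim, grade) * comb(dim, form_index)

counts = {
    "torsion equation (bivector 3-form)": n_components(2, 3),        # 6 x 4 = 24
    "Einstein equation (pseudovector 3-form)": n_components(3, 3),   # 4 x 4 = 16
    "Lorentz connection (bivector 1-form)": n_components(2, 1),      # 6 x 4 = 24
    "line interval (vector 1-form)": n_components(1, 1),             # 4 x 4 = 16
}
for name, count in counts.items():
    print(f"{name}: {count}")
```

The same function gives 24 for a pseudovector (trivector) 2-form and 4 for a pseudoscalar 3-form.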


16.14.6 Alternative gravitational action in multivector forms notation 


As in §16.9 and §16.13.3, the coordinates and momenta can be traded without changing the equations of
motion. Integrating the $-e^2 \wedge d\Gamma$ term in the Lagrangian (16.234) by parts gives

$$ \begin{aligned} - e^2 \wedge d\Gamma &= - d(e^2 \wedge \Gamma) + de \wedge (e \wedge \Gamma) \\ &= d\vartheta + \pi \wedge de , \end{aligned} \tag{16.252} $$

where $\pi$ is the momentum conjugate to e, a trivector 2-form with 24 components,

$$ \pi \equiv - e \wedge \Gamma , \tag{16.253} $$

and $\vartheta$ is the expansion, the contraction of $\pi$, a pseudoscalar 3-form with 4 components,


$$ \vartheta \equiv \tfrac{1}{2}\, e \wedge \pi = - e^2 \wedge \Gamma . \tag{16.254} $$

The double dual of the expansion is the scalar 1-form
Ü =r”, de”. (16.255) 
The transpose (16.271) of the double dual of the expansion is 
DT =r, yk (16.256) 


whose tetrad time component Tým is what is commonly called the expansion, §18.3, justifying the nomen- 
clature. 

The total exterior derivative term $d\vartheta$ in equation (16.252) is the Gibbons-Hawking-York boundary term
(York, 1972; Gibbons and Hawking, 1977). Discarding the boundary term $d\vartheta$ yields the alternative Lagrangian

$$ L'_g \equiv L_g - \frac{I}{8\pi}\, d\vartheta = \frac{I}{8\pi}\, \pi \wedge de - H_g = \frac{I}{8\pi}\, \pi \wedge \left( de + \tfrac{1}{4}[\Gamma, e] \right) , \tag{16.257} $$

with $H_g$ the same (super-)Hamiltonian as before, equation (16.236). The alternative Lagrangian (16.257)
is in Hamiltonian form with coordinates e and momenta $\pi/(8\pi)$ (the numerator is the momentum form $\pi$, the
denominator the number $\pi$).

The Lorentz connection $\Gamma$, which is a bivector 1-form, and the momentum $\pi$, which is a pseudovector
2-form, both have the same number of components, 24. The components are invertibly related to each other,
the Lorentz connection $\Gamma$ being given in terms of the momentum $\pi$ by


T= +ead", (16.258) 


where $\top$ denotes the transpose operation (16.271).
Variation of the gravitational action $S'_g$ with the alternative Lagrangian (16.257) yields


I I 


where the curvature pseudovector 3-form $\Pi$ is defined to be

$$ \Pi \equiv e \wedge R - S \wedge \Gamma = d\pi + \tfrac{1}{2}[\Gamma, \pi] - \tfrac{1}{4}\, e \wedge [\Gamma, \Gamma] . \tag{16.260} $$


Previously, variation of the matter action $S_m$ with respect to $\delta\Gamma$ and $\delta e$ defined the (double duals of the) spin
angular-momentum $\Sigma$ and matter energy-momentum T, equation (16.248). Variation of the matter action
$S_m$ instead with respect to $\delta\pi$ and $\delta e$ defines modified versions $\tilde{\Sigma}$ and $\tilde{T}$ of the spin angular-momentum and
energy-momentum,


Sy = | -önn S+ Parse. (16.261) 


where the vector 2-form $\tilde{\Sigma}$ is (the minus sign conforms to the convention for the torsion S and spin
angular-momentum $\Sigma$, equations (16.209) and (16.249a))
È = -ymn Y” Aq” da? . (16.262) 


The original $\Sigma$ and modified $\tilde{\Sigma}$ spin angular-momenta are invertibly related to each other (similarly to the
way that the momentum $\pi$ and connection $\Gamma$ are invertibly related, equations (16.253) and (16.258)), while
the (double dual of the) original T and modified $\tilde{T}$ energy-momenta differ by a term depending on the spin
angular-momentum,


S=eAD, =X -er , (16.263a) 
T=T-ZAT, (16.263b) 


ek 
where ø is the contraction of X, a pseudoscalar 3-form with 4 components, 


azten- AŤ, (16.264 
In components, the relation (16.263a) between the original © and modified È spin angular-momenta is 
equation (16.130). The components of the double dual & and its transpose are (compare equations (16.255 


and (16.256) for Ý and 97) 


@ =D de’ = idk de’, F =DE y" = inky. (16.265 


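The equations of motion for the torsion S and curvature $\Pi$ are

$$ S = 8\pi\, \tilde{\Sigma} , \tag{16.266a} $$

$$ \Pi = 8\pi\, \overset{**}{\tilde{T}} . \tag{16.266b} $$

More explicitly, the equations of motion (16.266) are

$$ de + \tfrac{1}{2}[\Gamma, e] = 8\pi\, \tilde{\Sigma} , \tag{16.267a} $$

$$ d\pi + \tfrac{1}{2}[\Gamma, \pi] - \tfrac{1}{4}\, e \wedge [\Gamma, \Gamma] = 8\pi\, \overset{**}{\tilde{T}} . \tag{16.267b} $$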
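The expansion $\vartheta$ is a pseudoscalar, so its exterior derivative equals its Lorentz-covariant exterior derivative,
$d\vartheta = d\vartheta + \tfrac{1}{2}[\Gamma, \vartheta]$, which is

$$ d\vartheta = \tfrac{1}{2} \left( de + \tfrac{1}{2}[\Gamma, e] \right) \wedge \pi - \tfrac{1}{2}\, e \wedge \left( d\pi + \tfrac{1}{2}[\Gamma, \pi] \right) , \tag{16.268} $$

which rearranges to

$$ d\vartheta + \tfrac{1}{4}\, e^2 \wedge [\Gamma, \Gamma] = - e^2 \wedge R + e \wedge S \wedge \Gamma . \tag{16.269} $$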
The first term on the right hand side of equation (16.269) is proportional to the double-dual of the trace G
of the Einstein tensor, $e^2 \wedge R = \overset{**}{G}$. If $e^2 \wedge R$ and $e \wedge S$ are replaced by their matter energy-momentum and
spin angular-momentum sources in accordance with equation (16.250), then the equation of motion (16.269)
for the expansion becomes

$$ d\vartheta + \tfrac{1}{4}\, e^2 \wedge [\Gamma, \Gamma] = 8\pi \left( - \overset{**}{T} + \overset{**}{\Sigma} \wedge \Gamma \right) . \tag{16.270} $$

16.14.7 Transpose of a multivector form

The transpose $a^\top$ of a multivector form $a \equiv a_{K\Lambda}\, \gamma^K d^p x^\Lambda$ of grade n and form index p is defined to be the
multivector form, of grade p and form index n, with multivector and form indices transposed,

$$ a^\top \equiv \left( a_{K\Lambda}\, \gamma^K d^p x^\Lambda \right)^\top \equiv e^K{}_\Pi\, e_L{}^\Lambda\, a_{K\Lambda}\, \gamma^L d^n x^\Pi = a_{\Pi L}\, \gamma^L d^n x^\Pi , \tag{16.271} $$
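implicitly summed over distinct sequences K, L, $\Lambda$, $\Pi$ of indices. For example, the transpose of a vector
2-form is the bivector 1-form

$$ \left( a_{k\kappa\lambda}\, \gamma^k\, d^2x^{\kappa\lambda} \right)^\top = e^k{}_\pi\, e_l{}^\kappa\, e_m{}^\lambda\, a_{k\kappa\lambda}\, \gamma^l \wedge \gamma^m\, dx^\pi = a_{\pi lm}\, \gamma^l \wedge \gamma^m\, dx^\pi . \tag{16.272} $$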


The transpose of a symmetric tensor a, one satisfying $a_{k\kappa} = a_{kl}\, e^l{}_\kappa = a_{lk}\, e^l{}_\kappa = a_{\kappa k}$, is itself,

$$ a^\top = \left( a_{k\kappa}\, \gamma^k dx^\kappa \right)^\top = a_{\kappa k}\, \gamma^k dx^\kappa = a_{k\kappa}\, \gamma^k dx^\kappa = a . \tag{16.273} $$

As a particular example, the vierbein is symmetric in this sense, because the tetrad metric is symmetric,
$e_{k\kappa} = \eta_{kl}\, e^l{}_\kappa$, so the transpose of the line interval e is itself,

$$ e^\top = \left( e_{k\kappa}\, \gamma^k dx^\kappa \right)^\top = e^k{}_\pi\, e_l{}^\kappa\, e_{k\kappa}\, \gamma^l dx^\pi = e_{l\pi}\, \gamma^l dx^\pi = e . \tag{16.274} $$

The transpose of a wedge product of multivector forms a and b is the wedge product of their transposes,

$$ (a \wedge b)^\top = a^\top \wedge b^\top . \tag{16.275} $$

The transpose of the double dual of a multivector form a is the double dual of its transpose,

$$ \left( \overset{**}{a} \right)^\top = \overset{**}{\left( a^\top \right)} . \tag{16.276} $$

Equations (16.275) and (16.276) say that the operation of transposition commutes both with taking the
wedge product and with taking the double dual. Note however that the operations of taking the wedge
product and taking the double dual do not commute.


16.14.8 Conservation of angular momentum in multivector forms language 


The action Sm of any individual matter field is Lorentz invariant. Lorentz symmetry implies a conservation 
law (16.281) of angular momentum. 

Under an infinitesimal Lorentz transformation generated by the bivector $\epsilon = \epsilon_{kl}\, \gamma^k \wedge \gamma^l$, any multivector
form a whose multivector components transform like a tensor varies as, equation (16.95),

$$ \delta a = \tfrac{1}{2}[\epsilon, a] . \tag{16.277} $$

In particular, since the vierbein $e_{k\kappa}$ is a tetrad vector, the variation of the line interval $e \equiv e_{k\kappa}\, \gamma^k dx^\kappa$ under
an infinitesimal Lorentz transformation is

$$ \delta e = \tfrac{1}{2}[\epsilon, e] . \tag{16.278} $$

The components of the Lorentz connection $\Gamma \equiv \Gamma_{kl\kappa}\, \gamma^k \wedge \gamma^l\, dx^\kappa$ do not constitute a tetrad tensor, so do
not transform like equation (16.277). Rather, the Lorentz connection transforms as equation (16.96), which
in multivector forms language translates to

$$ \delta\Gamma = - \left( d\epsilon + \tfrac{1}{2}[\Gamma, \epsilon] \right) . \tag{16.279} $$


Inserting the variations (16.278) and (16.279) of the line interval e and Lorentz connection $\Gamma$ into the
variation (16.248) of the matter action yields the variation of the matter action under an infinitesimal
Lorentz transformation generated by the bivector $\epsilon$,

$$ \begin{aligned} \delta S_m &= I \int - \overset{**}{\Sigma} \wedge \left( d\epsilon + \tfrac{1}{2}[\Gamma, \epsilon] \right) + \tfrac{1}{2}[\epsilon, e] \wedge \overset{**}{T} \\ &= I \oint \overset{**}{\Sigma} \wedge \epsilon - I \int \left( d\overset{**}{\Sigma} + \tfrac{1}{2}[\Gamma, \overset{**}{\Sigma}] - e \cdot \overset{**}{T} \right) \wedge \epsilon . \end{aligned} \tag{16.280} $$

Invariance of the action under local Lorentz transformations requires that the variation (16.280) must vanish
for arbitrary choices of the bivector $\epsilon$ vanishing on the initial and final hypersurfaces. Consequently the spin
angular-momentum $\overset{**}{\Sigma}$ must satisfy the covariant conservation equation

$$ d\overset{**}{\Sigma} + \tfrac{1}{2}[\Gamma, \overset{**}{\Sigma}] - e \cdot \overset{**}{T} = 0 . \tag{16.281} $$

Equation (16.281) is the same as the conservation equation (16.136) derived previously in index notation.
Equation (16.281) is consistent with the contracted torsion Bianchi identity (16.406a), which enforces the
angular-momentum conservation equation (16.281) summed over all species. If the spin angular-momentum
of a matter component vanishes, then the conservation equation (16.281) implies that the energy-momentum
tensor of that matter component is symmetric,

$$ e \cdot \overset{**}{T} = 0 . \tag{16.282} $$


16.14.9 Conservation of energy-momentum in multivector forms language 


The action Sm of any individual matter field is invariant under coordinate transformations. Symmetry under 
coordinate transformations implies a conservation law (16.299) for the energy-momentum of the field. 
Any infinitesimal 1-form $\epsilon \equiv \epsilon_\kappa\, dx^\kappa$ generates an infinitesimal coordinate transformation

$$ x^\kappa \rightarrow x'^\kappa = x^\kappa + \epsilon^\kappa . \tag{16.283} $$

As discussed in §7.34, the variation of any quantity a with respect to an infinitesimal coordinate transformation
$\epsilon$ is, by construction, minus its Lie derivative, $-\mathcal{L}_\epsilon a$, with respect to the vector $\epsilon^\kappa$. The Lie derivative
of a form is written most elegantly in terms of a dot product of forms. As usual, algebraic operations with
forms are derived most easily by translating from multivector language into forms language. Thus the dot
product of a 1-form $\epsilon$ with a p-form $a \equiv a_\Lambda\, d^p x^\Lambda$ is, mirroring the multivector dot product (13.35) (the form
dot $\bullet$ is written slightly larger than the multivector dot $\cdot$ to distinguish the two),

$$ \epsilon \bullet a = p\, \epsilon^\kappa a_{\kappa\Lambda}\, d^{p-1}x^\Lambda , \tag{16.284} $$

implicitly summed over distinct antisymmetric sets of indices $\kappa\Lambda$. The form dot product of a 1-form $\epsilon$ and
a 0-form is zero, consistent with the convention (13.36) for multivectors. A useful result is that for any two
multivector forms a and b with product ab (a geometric product of multivectors and an exterior product of
forms), the dot product of the 1-form $\epsilon$ with the product ab is

$$ \epsilon \bullet (ab) = (\epsilon \bullet a)\, b + (-)^p\, a\, (\epsilon \bullet b) , \tag{16.285} $$

where p is the form index of a.
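
The product rule (16.285) can be checked numerically in the simplest setting. The sketch below assumes ordinary scalar-valued forms with constant integer coefficients in four dimensions, and takes the form dot to be the standard interior product of a vector with a form; the multivector-valued case of the text is not modelled. All function names are illustrative.

```python
import itertools
import random

def add(a, b, scale=1):
    """Return a + scale*b for forms stored as {sorted index tuple: coefficient}."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + scale * c
    return {k: v for k, v in out.items() if v != 0}

def wedge(a, b):
    """Exterior product of two constant-coefficient forms."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            if set(ia) & set(ib):
                continue  # a repeated dx^i makes the term vanish
            merged = ia + ib
            # Sign of the permutation that sorts the merged index list.
            inversions = sum(1 for x, y in itertools.combinations(merged, 2) if x > y)
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def form_dot(eps, a):
    """Interior product of the vector eps^i with the form a (zero on a 0-form)."""
    out = {}
    for idx, c in a.items():
        for pos, i in enumerate(idx):
            key = idx[:pos] + idx[pos + 1:]
            out[key] = out.get(key, 0) + (-1) ** pos * eps[i] * c
    return {k: v for k, v in out.items() if v != 0}

def random_form(p, rng, dim=4):
    """Random integer p-form in dim dimensions."""
    return {idx: rng.randint(-5, 5) for idx in itertools.combinations(range(dim), p)}

def leibniz_holds(p, q, rng, dim=4):
    """Check eps.(a b) = (eps.a) b + (-)^p a (eps.b) for a random p-form a and q-form b."""
    eps = [rng.randint(-5, 5) for _ in range(dim)]
    a, b = random_form(p, rng, dim), random_form(q, rng, dim)
    lhs = form_dot(eps, wedge(a, b))
    rhs = add(wedge(form_dot(eps, a), b), wedge(a, form_dot(eps, b)), scale=(-1) ** p)
    return lhs == rhs

rng = random.Random(0)
print(all(leibniz_holds(p, q, rng) for p in range(4) for q in range(4)))  # prints True
```

Storing each form as a dictionary keyed by sorted index tuples corresponds to the text's convention of summing over distinct antisymmetric sequences of indices.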

From the definition (7.151) of the Lie derivative of a coordinate tensor, it can be shown (Exercise 16.11
asks you to do this) that the Lie derivative of a p-form a with respect to a 1-form $\epsilon$ is given by the elegant
expression

$$ \mathcal{L}_\epsilon a = \epsilon \bullet (da) + d(\epsilon \bullet a) , \tag{16.286} $$

which is known as Cartan’s magic formula. Cartan’s magic formula (16.286), along with the vanishing of
the exterior derivative squared, $d^2 = 0$, implies that, acting on forms, the Lie derivative $\mathcal{L}_\epsilon$ commutes with
the exterior derivative d,

$$ \mathcal{L}_\epsilon\, da - d\, \mathcal{L}_\epsilon a = 0 , \tag{16.287} $$

since both $\mathcal{L}_\epsilon\, da$ and $d\, \mathcal{L}_\epsilon a$ reduce to $d(\epsilon \bullet da)$.


Cartan’s magic formula (16.286) holds also for multivector-valued forms, since multivectors are coordinate
scalars (they are unchanged by coordinate transformations). However, a difficulty arises because the Lie
derivative of a tetrad tensor is not a tetrad tensor (see Concept Question 26.2). Consequently the Lie
derivative of neither the line interval nor the Lorentz connection is a tetrad tensor. However, as pointed
out at the beginning of §5.2.1 of Hehl et al. (1995), the Lagrangian is a Lorentz scalar, so in varying the
Lagrangian 4-form L, the exterior derivative can be replaced by the Lorentz-covariant exterior derivative,
$da \rightarrow D_\Gamma a \equiv da + \tfrac{1}{2}[\Gamma, a]$ (see §16.17.1),

$$ \mathcal{L}_\epsilon L = \mathcal{L}_{\Gamma\epsilon} L \equiv \epsilon \bullet (D_\Gamma L) + D_\Gamma (\epsilon \bullet L) . \tag{16.288} $$

Thus the variation of the Lagrangian under a coordinate transformation can be carried out using the
Lorentz-covariant Lie derivative

$$ \mathcal{L}_{\Gamma\epsilon}\, a \equiv \epsilon \bullet (D_\Gamma a) + D_\Gamma (\epsilon \bullet a) \tag{16.289} $$

in place of the usual Lie derivative (16.286). The advantage of this replacement is that the Lorentz-covariant
Lie derivatives of the line interval and Lorentz connection are then (coordinate and tetrad) tensors, and
the resulting law of conservation of energy-momentum is manifestly tensorial, as it should be. The
Lorentz-covariant derivative $D_\Gamma$ is torsion-free acting on coordinate indices, but torsion-full acting on multivector
(Lorentz) indices. An alternative version of the covariant magic formula (16.289) in terms of the torsion-free
exterior derivative $\mathring{D}$ and the contortion K is

$$ \mathcal{L}_{\Gamma\epsilon}\, a = \epsilon \bullet (\mathring{D} a) + \mathring{D} (\epsilon \bullet a) + \tfrac{1}{2}[\epsilon \bullet K, a] , \tag{16.290} $$


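which follows from $\Gamma = \mathring{\Gamma} + K$ and the relation (16.285). As a particular example of the Lorentz-covariant
magic formula (16.289), the variation $\delta e$ of the line interval under an infinitesimal coordinate
transformation (16.283) generated by $\epsilon$ is

$$ \delta e = - \mathcal{L}_{\Gamma\epsilon}\, e = - \epsilon \bullet (D_\Gamma e) - D_\Gamma (\epsilon \bullet e) = - d(\epsilon \bullet e) - \tfrac{1}{2}[\Gamma, \epsilon \bullet e] - \epsilon \bullet S . \tag{16.291} $$

Alternatively, in terms of the torsion-free exterior derivative $\mathring{D}$,

$$ \delta e = - \mathcal{L}_{\Gamma\epsilon}\, e = - \mathring{D}(\epsilon \bullet e) - \tfrac{1}{2}[\epsilon \bullet K, e] = - d(\epsilon \bullet e) - \tfrac{1}{2}[\mathring{\Gamma}, \epsilon \bullet e] - \tfrac{1}{2}[\epsilon \bullet K, e] , \tag{16.292} $$

which is the forms version of equation (16.139) derived earlier in index notation.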

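The Lorentz connection $\Gamma$ is a coordinate tensor but not a tetrad tensor, so the covariant magic
formula (16.289) does not apply to the Lorentz connection. Rather, the variation $\delta\Gamma$ of the Lorentz connection
follows from the difference

$$ \delta D_\Gamma a - D_\Gamma\, \delta a = \delta \left( da + \tfrac{1}{2}[\Gamma, a] \right) - \left( d\delta a + \tfrac{1}{2}[\Gamma, \delta a] \right) = \tfrac{1}{2}[\delta\Gamma, a] . \tag{16.293} $$

Thus the variation $\delta\Gamma$ of the Lorentz connection under an infinitesimal coordinate transformation generated
by $\epsilon$ satisfies

$$ \begin{aligned} \tfrac{1}{2}[\delta\Gamma, a] &= - \mathcal{L}_{\Gamma\epsilon} D_\Gamma a + D_\Gamma \mathcal{L}_{\Gamma\epsilon}\, a = - \epsilon \bullet (D_\Gamma D_\Gamma a) + D_\Gamma D_\Gamma (\epsilon \bullet a) \\ &= - \tfrac{1}{2}\, \epsilon \bullet [R, a] + \tfrac{1}{2}[R, \epsilon \bullet a] = - \tfrac{1}{2}[\epsilon \bullet R, a] , \end{aligned} \tag{16.294} $$

where R is the Riemann curvature bivector 2-form. Equation (16.293) holds for all multivector forms a, so

$$ \delta\Gamma = - \mathcal{L}_{\Gamma\epsilon}\, \Gamma = - \epsilon \bullet R , \tag{16.295} $$

which is the forms version of equation (16.142) derived earlier in index notation.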

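Inserting the variations (16.291) and (16.295) of the line interval e and Lorentz connection $\Gamma$ into the
variation (16.248) of the matter action yields the variation of the action under an infinitesimal coordinate
transformation (16.283) generated by the 1-form $\epsilon$,

$$ \delta S_m = - I \int \left( d(\epsilon \bullet e) + \tfrac{1}{2}[\Gamma, \epsilon \bullet e] + \epsilon \bullet S \right) \wedge \overset{**}{T} + \overset{**}{\Sigma} \wedge (\epsilon \bullet R) . \tag{16.296} $$

Integrating the $d(\epsilon \bullet e) \wedge \overset{**}{T}$ term by parts, and rearranging the $\tfrac{1}{2}[\Gamma, \epsilon \bullet e] \wedge \overset{**}{T}$ term using the multivector
triple-product relation (13.39), yields

$$ \delta S_m = - I \oint (\epsilon \bullet e) \wedge \overset{**}{T} + I \int (\epsilon \bullet e) \wedge \left( d\overset{**}{T} + \tfrac{1}{2}[\Gamma, \overset{**}{T}] \right) - (\epsilon \bullet S) \wedge \overset{**}{T} - \overset{**}{\Sigma} \wedge (\epsilon \bullet R) . \tag{16.297} $$

Invariance of the action under coordinate transformations requires that the variation (16.297) must vanish
for arbitrary choices of the 1-form $\epsilon$ vanishing on the initial and final hypersurfaces. Consequently the matter
energy-momentum $\overset{**}{T}$ must satisfy the conservation equation

$$ (\epsilon \bullet e) \wedge \left( d\overset{**}{T} + \tfrac{1}{2}[\Gamma, \overset{**}{T}] \right) - (\epsilon \bullet S) \wedge \overset{**}{T} - \overset{**}{\Sigma} \wedge (\epsilon \bullet R) = 0 . \tag{16.298} $$

Equivalently, in terms of the torsion-free connection $\mathring{\Gamma}$ and the contortion K,

$$ (\epsilon \bullet e) \wedge \left( d\overset{**}{T} + \tfrac{1}{2}[\mathring{\Gamma}, \overset{**}{T}] \right) - \tfrac{1}{2}[\epsilon \bullet K, e] \wedge \overset{**}{T} - \overset{**}{\Sigma} \wedge (\epsilon \bullet R) = 0 . \tag{16.299} $$

I don’t know a way to recast equations (16.298) or (16.299) in multivector forms notation with the arbitrary
1-form $\epsilon$ factored out, but in components equation (16.299) reduces to equation (16.145) derived earlier.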


If the spin angular-momentum of a matter component vanishes, $\overset{**}{\Sigma} = 0$, then the energy-momentum
conservation law (16.299) for that matter component simplifies to

$$ d\overset{**}{T} + \tfrac{1}{2}[\mathring{\Gamma}, \overset{**}{T}] = 0 . \tag{16.300} $$

If the energy-momentum conservation law (16.298) is summed over all matter components, and the total
spin angular-momentum $\overset{**}{\Sigma}$ and energy-momentum $\overset{**}{T}$ eliminated in favour of torsion S and curvature R
using Hamilton’s equations (16.250), then the law of conservation of total energy-momentum becomes

$$ (\epsilon \bullet e) \wedge \left( d(e \wedge R) + \tfrac{1}{2}[\Gamma, e \wedge R] \right) - (\epsilon \bullet S) \wedge e \wedge R - e \wedge S \wedge (\epsilon \bullet R) = 0 , \tag{16.301} $$

which by the relation (16.285) rearranges to

$$ (\epsilon \bullet e) \wedge \left( d(e \wedge R) + \tfrac{1}{2}[\Gamma, e \wedge R] - S \wedge R \right) = 0 . \tag{16.302} $$

Equation (16.302) is true for arbitrary infinitesimal $\epsilon$, so the law of conservation of total energy-momentum
is

$$ d(e \wedge R) + \tfrac{1}{2}[\Gamma, e \wedge R] - S \wedge R = 0 , \tag{16.303} $$

which agrees with the contracted Bianchi identity (16.406b).


Exercise 16.11. Lie derivative of a form. Confirm from the definition (7.151) that the Lie derivative of 
a p-form is indeed given by Cartan’s magic formula (16.286). 


16.15 Space+time (3+1) split in multivector forms notation 


As discussed in §16.5.8, when applied to fields, the super-Hamiltonian approach does not yield equal numbers 
of coordinates and momenta. The problem arises because symmetry under general coordinate transformations 
means that different configurations of fields are symmetrically equivalent. To permit manifest covariance, 
the super-Hamiltonian formalism is forced to admit more fields than there are physical degrees of freedom. 
As found previously with the electromagnetic field, §16.6.6, the solution to the problem is to break general 
covariance by splitting spacetime into separate space and time coordinates. 

Executing a 3+1 split of the gravitational equations successfully, in the sense of achieving a balanced 
number of coordinates and momenta with the right number of physical degrees of freedom, is, unsurprisingly, 
a more complicated challenge than splitting the electromagnetic equations. 

In splitting a multivector form a into time and space components, it is convenient to adopt the notation
of §16.6.6, generalized to multivector-valued forms. A multivector p-form a splits into a component $a_{\bar{t}}$
(subscripted $\bar{t}$) that represents all the coordinate time t parts of the form, and a component $a_{\bar{\alpha}}$ (subscripted
$\bar{\alpha}$) that represents the remaining spatial-coordinate components. The $\bar{t}$ and $\bar{\alpha}$ subscripts should be interpreted
as labels, not indices. Thus a multivector p-form a splits as

$$ a = a_{\bar{t}} + a_{\bar{\alpha}} = a_{K tA}\, \gamma^K d^p x^{tA} + a_{KA}\, \gamma^K d^p x^{A} , \tag{16.304} $$

implicitly summed over distinct antisymmetric sequences of indices. Note that only the coordinates are being
split: the Lorentz indices are not split into time and space parts. The option of also splitting the Lorentz
indices is explored further in §16.16, equation (16.328).

The time component of a product (geometric product of multivectors, exterior product of forms) of any
two multivector forms a and b satisfies

$$ (ab)_{\bar{t}} = a_{\bar{t}}\, b_{\bar{\alpha}} + a_{\bar{\alpha}}\, b_{\bar{t}} , \tag{16.305} $$

with no minus signs (minus signs from the antisymmetry of form indices cancel minus signs from commuting
dt through a spatial form). The space component of a product of two multivector forms a and b satisfies

$$ (ab)_{\bar{\alpha}} = a_{\bar{\alpha}}\, b_{\bar{\alpha}} . \tag{16.306} $$


16.15.1 3+1 split of the gravitational Lagrangian in multivector forms notation 


Consider first a 3+1 split of the standard gravitational Lagrangian (16.234). The gravitational coordinates
in this case are the Lorentz connections $\Gamma$, and their conjugate momenta are the components of the line
interval e. Actually, the momentum canonically conjugate to the Lorentz connection $\Gamma$ in the standard
gravitational Lagrangian (16.234) is the area element $e^2$, but as remarked at the end of §16.14.5, if all
components of the area element are considered independent, then variation of the action with respect to all
those components does not lead to general relativity. The fix is to consider the area element to be a product
$e^2 = \tfrac{1}{2}\, e \wedge e$, in which the physical degrees of freedom are contained in the line element e.

After the space+time 3+1 split, the coordinates are the spatial components of the Lorentz connection $\Gamma_{\bar{\alpha}}$,
which is a bivector 1-form with 6 × 3 = 18 components, and the momentum is the spatial line interval $e_{\bar{\alpha}}$,
which is a vector 1-form with 4 × 3 = 12 components,

$$ \Gamma_{\bar{\alpha}} \equiv \Gamma_\alpha\, dx^\alpha = \Gamma_{kl\alpha}\, \gamma^k \wedge \gamma^l\, dx^\alpha , \tag{16.307a} $$

$$ e_{\bar{\alpha}} \equiv e_\alpha\, dx^\alpha = e_{k\alpha}\, \gamma^k\, dx^\alpha . \tag{16.307b} $$


The mismatch between the number 18 of components of the spatial connection $\Gamma_{\bar\alpha}$ and the number 12 
of components of the spatial line interval $e_{\bar\alpha}$ is problematic. Despite the mismatch, it is useful to pursue 
the approach further, because it leads to a set of constraint equations commonly called the Gaussian and 
Hamiltonian constraints. These constraints are analogous to the electromagnetic constraint (16.77a), which 
has the property that, if it is satisfied initially, then conservation of electric charge guarantees it 
thereafter. Conservation of electric charge is a consequence of electromagnetic gauge symmetry. The Gaussian 
and Hamiltonian constraints are similarly constraint equations which, if satisfied initially, are guaranteed 
thereafter respectively by the conservation equations for spin angular-momentum and energy-momentum. 
These conservation equations are in turn a consequence of symmetries under Lorentz transformations and 
coordinate transformations. 
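
The electromagnetic analogy can be made explicit in ordinary 3-vector notation. The following is a sketch in Gaussian units with $c = 1$, a convention assumed here for illustration; the symbols $\boldsymbol{E}$, $\boldsymbol{B}$, $\rho$, $\boldsymbol{j}$ are the usual fields and sources rather than the multivector-forms notation of this chapter, and $C$ is a name introduced here for the violation of Gauss's law. Ampère's law and charge conservation together show that $C$ does not evolve:

```latex
\begin{align}
  C &\equiv \nabla\cdot\boldsymbol{E} - 4\pi\rho , \\
  \frac{\partial C}{\partial t}
    &= \nabla\cdot\frac{\partial\boldsymbol{E}}{\partial t}
       - 4\pi\frac{\partial\rho}{\partial t}
     = \nabla\cdot\left(\nabla\times\boldsymbol{B} - 4\pi\boldsymbol{j}\right)
       - 4\pi\frac{\partial\rho}{\partial t}
     = -4\pi\left(\frac{\partial\rho}{\partial t}
       + \nabla\cdot\boldsymbol{j}\right) = 0 .
\end{align}
```

So if $C = 0$ on the initial hypersurface it remains zero. In the gravitational case the identity $\nabla\cdot(\nabla\times\boldsymbol{B}) = 0$ is replaced by geometric identities, and charge conservation by conservation of spin angular-momentum and energy-momentum.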

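The equations of motion (16.309) and constraint equations (16.310) follow directly from splitting 
equations (16.250) into time and space parts, but they can be derived at a more fundamental level by splitting 
the variation $\delta S$ of the action into time and space parts. Splitting the variation $\delta S_g$ of the gravitational 
action, equation (16.247), into time and space parts gives 


$\delta S_g = -\frac{I}{8\pi} \oint (e^2 \wedge \delta\Gamma)_{\bar t} - \frac{I}{8\pi} \left[ \int e^2_{\bar\alpha} \wedge \delta\Gamma_{\bar\alpha} \right]_{t_{\rm i}}^{t_{\rm f}}$ 
$\qquad - \frac{I}{8\pi} \int (e \wedge S)_{\bar t} \wedge \delta\Gamma_{\bar\alpha} + \delta e_{\bar\alpha} \wedge (e \wedge R)_{\bar t} + (e \wedge S)_{\bar\alpha} \wedge \delta\Gamma_{\bar t} + \delta e_{\bar t} \wedge (e \wedge R)_{\bar\alpha}$ . (16.308) 


The two surface integrals are respectively over the timelike spatial boundary of the 4-volume from $t_{\rm i}$ to $t_{\rm f}$, 
and over the two spacelike caps of the 4-volume at $t_{\rm i}$ and $t_{\rm f}$. Variation of the combined gravitational and 
matter actions with respect to the variations $\delta\Gamma_{\bar\alpha}$ and $\delta e_{\bar\alpha}$ of the spatial coordinates and momenta yields 
the equations of motion, 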


18 equations of motion: $(e \wedge S)_{\bar t} = 8\pi\, \tilde\Sigma_{\bar t}$ , (16.309a) 
12 equations of motion: $(e \wedge R)_{\bar t} = 8\pi\, \tilde T_{\bar t}$ . (16.309b) 


These are just the coordinate time components of the equations of motion (16.250). Variation with respect 
to the variations $\delta\Gamma_{\bar t}$ and $\delta e_{\bar t}$ of the time components of the coordinates and momenta yields the Gaussian 
and Hamiltonian constraints, 


6 Gaussian constraints: $(e \wedge S)_{\bar\alpha} = 8\pi\, \tilde\Sigma_{\bar\alpha}$ , (16.310a) 
4 Hamiltonian constraints: $(e \wedge R)_{\bar\alpha} = 8\pi\, \tilde T_{\bar\alpha}$ . (16.310b) 


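These are the purely spatial coordinate components of the equations of motion (16.250). Whereas the 
equations of motion (16.309) involve derivatives with respect to time $t$, the constraint equations (16.310) involve 
no time derivatives. More explicitly, the equations of motion (16.309) are 


18 equations of motion: $e_{\bar\alpha} \wedge \left( d_{\bar t} e_{\bar\alpha} + \tfrac12 [\Gamma_{\bar t}, e_{\bar\alpha}] + d_{\bar\alpha} e_{\bar t} + \tfrac12 [\Gamma_{\bar\alpha}, e_{\bar t}] \right) + e_{\bar t} \wedge S_{\bar\alpha} = 8\pi\, \tilde\Sigma_{\bar t}$ , (16.311a) 
12 equations of motion: $e_{\bar\alpha} \wedge \left( d_{\bar t} \Gamma_{\bar\alpha} + d_{\bar\alpha} \Gamma_{\bar t} + \tfrac12 [\Gamma_{\bar\alpha}, \Gamma_{\bar t}] \right) + e_{\bar t} \wedge R_{\bar\alpha} = 8\pi\, \tilde T_{\bar t}$ . (16.311b) 


The exterior time derivative here is the 1-form $d_{\bar t} \equiv dt\, \partial/\partial t$. The equations of motion (16.311) are problematic 
not only because they remain unbalanced despite the 3+1 split, but also because the time derivative is not 
$d_{\bar t}$ but rather $e_{\bar\alpha} \wedge d_{\bar t}$. 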

Both $\Gamma_{\bar t}$ and $e_{\bar t}$ can be treated as gauge variables: the 6 components of $\Gamma_{\bar t}$ can be adjusted arbitrarily by a 
Lorentz transformation; and the 4 components of $e_{\bar t}$ can be adjusted arbitrarily by a coordinate 
transformation. Thus the Gaussian and Hamiltonian constraint equations (16.310) can be interpreted as representing 
conserved Noether charges. The spin angular-momentum $\tilde\Sigma_{\bar\alpha}$ on the right hand side of the Gaussian 
constraint equation (16.310a) satisfies the conservation law (16.281). The energy-momentum $\tilde T_{\bar\alpha}$ on the right 
hand side of the Hamiltonian constraint equation (16.310b) satisfies the conservation law (16.299). The left 
hand sides of the Gaussian and Hamiltonian constraints satisfy corresponding conservation laws enforced by 
the contracted Bianchi identities (16.406). The Gaussian and Hamiltonian constraints are constraint 
equations in the sense commonly used by relativists: if the equations are arranged to be satisfied on the initial 
spatial hypersurface of constant time, then the conservation equations ensure that the equations will continue 
to be satisfied thereafter. 


16.15.2 Conventional gravitational Hamiltonian 


The conventional Hamiltonian is not the same as the super-Hamiltonian. Whereas the super-Hamiltonian 
approach is fully covariant, and the super-Hamiltonian (16.236) is a pseudoscalar 4-form, the conventional 
Hamiltonian approach picks out the coordinate time dimension as special, and the conventional Hamiltonian 
is the time component of a different pseudoscalar 4-form, equation (16.318). 

Split into time and spatial components, the gravitational Lagrangian 4-form (16.234) is 


$L_g = -\frac{I}{8\pi} \left( e^2_{\bar\alpha} \wedge \left( d_{\bar t} \Gamma_{\bar\alpha} + d_{\bar\alpha} \Gamma_{\bar t} + \tfrac12 [\Gamma_{\bar\alpha}, \Gamma_{\bar t}] \right) + e_{\bar t} \wedge (e \wedge R)_{\bar\alpha} \right)$ . (16.312) 


The $e^2_{\bar\alpha} \wedge d_{\bar t} \Gamma_{\bar\alpha}$ term in the Lagrangian (16.312) indicates that the momentum conjugate to the 18-component 
spatial connection $\Gamma_{\bar\alpha}$ is the 18-component spatial area element $e^2_{\bar\alpha}$. But, as discussed in §16.15.1 above, 
the spatial area element $e^2_{\bar\alpha}$ has excess degrees of freedom compared to the 12-component line interval $e_{\bar\alpha}$. The 
fix adopted in §16.15.1 was to regard the spatial line interval $e_{\bar\alpha}$ rather than the spatial area element as the 
conjugate momentum. Indeed, if all 18 degrees of freedom of the area element were treated as independent, 
then the Einstein equation (16.309b) would be replaced by an equation for $R_{\bar\alpha}$ in place of $(e \wedge R)_{\bar\alpha}$, and 
the result would not be general relativity, contradicting observation and experiment. To treat $\Gamma_{\bar\alpha}$ and $e_{\bar\alpha}$ as 
conjugate variables, the $e^2_{\bar\alpha} \wedge d_{\bar t} \Gamma_{\bar\alpha}$ term may be rewritten 


$e^2_{\bar\alpha} \wedge d_{\bar t} \Gamma_{\bar\alpha} = \tfrac12\, e_{\bar\alpha} \wedge (e_{\bar\alpha} \wedge d_{\bar t} \Gamma_{\bar\alpha})$ . (16.313) 


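Equation (16.313) effectively replaces the time derivative $d_{\bar t}$ with $e_{\bar\alpha} \wedge d_{\bar t}$, consistent with the time derivative 
in the equations of motion (16.311). The remaining terms in the gravitational Lagrangian (16.312) rearrange 
as follows. The $d_{\bar\alpha} \Gamma_{\bar t}$ term integrates by parts to 


$e^2_{\bar\alpha} \wedge d_{\bar\alpha} \Gamma_{\bar t} = d_{\bar\alpha} (e^2_{\bar\alpha} \wedge \Gamma_{\bar t}) - (d e^2)_{\bar\alpha} \wedge \Gamma_{\bar t}$ . (16.314) 


The $\tfrac12 [\Gamma_{\bar\alpha}, \Gamma_{\bar t}]$ term rearranges by the multivector triple-product relation (13.39) to 


$\tfrac12\, e^2_{\bar\alpha} \wedge [\Gamma_{\bar\alpha}, \Gamma_{\bar t}] = \tfrac12\, [e^2_{\bar\alpha}, \Gamma_{\bar\alpha}] \wedge \Gamma_{\bar t} = -\tfrac12\, [\Gamma, e^2]_{\bar\alpha} \wedge \Gamma_{\bar t}$ . (16.315) 


The coefficients of the $\wedge\, \Gamma_{\bar t}$ terms in equations (16.314) and (16.315) are 


$\left( -d e^2 - \tfrac12 [\Gamma, e^2] \right)_{\bar\alpha} = (e \wedge S)_{\bar\alpha}$ , (16.316) 


where $S$ is the torsion defined by equation (16.212). The manipulations (16.313)–(16.316) bring the 
gravitational action to 


$S_g = -\frac{I}{8\pi} \oint e^2_{\bar\alpha} \wedge \Gamma_{\bar t} - \frac{I}{8\pi} \int \tfrac12\, e_{\bar\alpha} \wedge (e_{\bar\alpha} \wedge d_{\bar t} \Gamma_{\bar\alpha}) + (e \wedge S)_{\bar\alpha} \wedge \Gamma_{\bar t} + e_{\bar t} \wedge (e \wedge R)_{\bar\alpha}$ . (16.317) 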

With the surface term discarded, the gravitational action (16.317) is in conventional Hamiltonian form 
$L_g = I\, (p \wedge (e \wedge d_{\bar t} q)) - H_g$ with coordinates $q = \Gamma_{\bar\alpha}$ and momenta $p = -e_{\bar\alpha}/(16\pi)$, and a somewhat strange 
time derivative $e_{\bar\alpha} \wedge d_{\bar t}$. The conventional (not super-) Hamiltonian 4-form $H_g$ is 


$H_g = \frac{I}{8\pi} \left( (e \wedge S)_{\bar\alpha} \wedge \Gamma_{\bar t} + e_{\bar t} \wedge (e \wedge R)_{\bar\alpha} \right)$ . (16.318) 


The conventional Hamiltonian (16.318) is a sum of the Gaussian and Hamiltonian constraint variables 
$(e \wedge S)_{\bar\alpha}$ and $(e \wedge R)_{\bar\alpha}$, equations (16.310), wedged with the gauge variables $\Gamma_{\bar t}$ and $e_{\bar t}$. 

The Hamiltonian (16.318) is fine as a conventional Hamiltonian in which the coordinates and momenta are 
the 18-component spatial connection $\Gamma_{\bar\alpha}$ and the 12-component spatial line interval $e_{\bar\alpha}$. But the Hamiltonian 
cannot be satisfactory because it yields only 12 equations of motion (16.311b) for the 18 components of $\Gamma_{\bar\alpha}$, 
and because the time derivative in those equations is $e_{\bar\alpha} \wedge d_{\bar t}$ rather than $d_{\bar t}$. Ultimately, these problems stem 
from the fact that there remain redundant degrees of freedom in $\Gamma_{\bar\alpha}$ despite the 3+1 split. 


16.15.3 3+1 split of the alternative gravitational Lagrangian in multivector forms 
notation 


A 3+1 split of the alternative Lagrangian (16.257) yields a more promising result: a balanced set of 
equations of motion, and a time derivative that is just $d_{\bar t} = dt\, \partial/\partial t$ as opposed to $e_{\bar\alpha} \wedge d_{\bar t}$. In the alternative 
Lagrangian, the gravitational coordinates are the line interval $e$, and their conjugate momenta are $\pi$ defined 
by equation (16.253). After the 3+1 split, the coordinates are the spatial components $e_{\bar\alpha}$ of the line interval, 
which is a vector 1-form with 4 × 3 = 12 components, while the momenta are the spatial components $\pi_{\bar\alpha}$, 
which is a trivector 2-form also with 4 × 3 = 12 components. 

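Once again, the equations of motion (16.320) and constraints and identities (16.324) follow directly from 
splitting equations (16.266) into time and space parts, but they can be derived more fundamentally by 
splitting the variation $\delta S$ of the action into time and space parts. Splitting the variation $\delta S'_g$ of the alternative 
gravitational action (16.259) into time and space parts gives 


$\delta S'_g = \frac{I}{8\pi} \oint (\pi \wedge \delta e)_{\bar t} + \frac{I}{8\pi} \left[ \int \pi_{\bar\alpha} \wedge \delta e_{\bar\alpha} \right]_{t_{\rm i}}^{t_{\rm f}} + \frac{I}{8\pi} \int \delta\pi_{\bar\alpha} \wedge S_{\bar t} - \Pi_{\bar\alpha} \wedge \delta e_{\bar t} + \delta\pi_{\bar t} \wedge S_{\bar\alpha} - \delta e_{\bar\alpha} \wedge \Pi_{\bar t}$ . (16.319) 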


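Variation of the combined gravitational and matter actions with respect to the variations $\delta e_{\bar\alpha}$ and $\delta\pi_{\bar\alpha}$ of 
the spatial coordinates and momenta yields 12 + 12 = 24 equations of motion involving time derivatives, 


12 equations of motion: $S_{\bar t} = 8\pi\, \Sigma_{\bar t}$ , (16.320a) 
12 equations of motion: $\Pi_{\bar t} = 8\pi\, \tilde T_{\bar t}$ . (16.320b) 


Variation of the action with respect to the variations $\delta e_{\bar t}$ and $\delta\pi_{\bar t}$ of the time components of the coordinates 
and momenta yields 6 identities and 10 constraint equations involving only spatial derivatives, 


6 Gaussian constraints and 6 identities: $S_{\bar\alpha} = 8\pi\, \Sigma_{\bar\alpha}$ , (16.321a) 
4 Hamiltonian constraints: $\Pi_{\bar\alpha} = 8\pi\, \tilde T_{\bar\alpha}$ . (16.321b) 


The Gaussian constraints are the subset of equations (16.321a) comprising 


6 Gaussian constraints: $(e \wedge S)_{\bar\alpha} = 8\pi\, (e \wedge \Sigma)_{\bar\alpha}$ . (16.322) 


More explicitly, the equations of motion (16.320) are 


12 equations of motion: $d_{\bar t} e_{\bar\alpha} + \tfrac12 [\Gamma_{\bar t}, e_{\bar\alpha}] + d_{\bar\alpha} e_{\bar t} + \tfrac12 [\Gamma_{\bar\alpha}, e_{\bar t}] = 8\pi\, \Sigma_{\bar t}$ , (16.323a) 
12 equations of motion: $d_{\bar t} \pi_{\bar\alpha} + \tfrac12 [\Gamma_{\bar t}, \pi_{\bar\alpha}] + d_{\bar\alpha} \pi_{\bar t} + \tfrac12 [\Gamma_{\bar\alpha}, \pi_{\bar t}] - \tfrac14 (e \wedge [\Gamma, \Gamma])_{\bar t} = 8\pi\, \tilde T_{\bar t}$ , (16.323b) 


and the constraints and identities (16.321) are 


6 Gaussian constraints and 6 identities: $\left( de + \tfrac12 [\Gamma, e] \right)_{\bar\alpha} = 8\pi\, \Sigma_{\bar\alpha}$ , (16.324a) 
4 Hamiltonian constraints: $\left( d\pi + \tfrac12 [\Gamma, \pi] - \tfrac14\, e \wedge [\Gamma, \Gamma] \right)_{\bar\alpha} = 8\pi\, \tilde T_{\bar\alpha}$ . (16.324b) 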
The Gaussian constraints (16.322) are 
6 Gaussian constraints: (de? — e- n)a =8n(eAD)a . (16.325) 


Equations (16.323a) comprise 12 equations of motion for the 12 coordinates $e_{\bar\alpha}$, while equations (16.323b) 
comprise 12 equations of motion for the 12 momenta $\pi_{\bar\alpha}$. The equations of motion (16.323) do not suffer 
from the peculiarities of the earlier equations of motion (16.311): the time evolution operator is $d_{\bar t} = dt\, \partial/\partial t$; 
and the number of equations of motion matches the number of dynamical variables. 


16.15.4 Gravomagnetic field 


To solve the system of gravitational equations (16.323) and (16.324), it is necessary to isolate the 6 identities 
from the 6 Gaussian constraints in equation (16.324a). Whereas constraint equations can be discarded after 
being imposed in the initial conditions (because conservation laws ensure their ongoing satisfaction during 
subsequent evolution), identities must be calculated at each time step. The 6 identities are equations (16.342) 
below. 

The spatial Lorentz connection $\Gamma_{\bar\alpha}$ has 18 components, whereas its contraction the spatial momentum 
$\pi_{\bar\alpha} \equiv -(e \wedge \Gamma)_{\bar\alpha}$, equation (16.253), has only 12. The extra 6 components of the spatial Lorentz connection 
are redundant. The 6 identities (16.324a) can be interpreted as defining the 6 redundant components of 
the spatial Lorentz connection $\Gamma_{\bar\alpha}$ in terms of spatial exterior derivatives $d_{\bar\alpha} e_{\bar\alpha}$ of the spatial line interval 
$e_{\bar\alpha}$. These 6 redundant components, denoted $\slashed{\Gamma}_{\bar\alpha}$ (with a slash), can be called the gravomagnetic field, 
equation (16.334), since the situation is analogous to that in electromagnetism, where the 3-component 
magnetic field is redundant because it can be replaced by the spatial exterior derivative of the spatial 
components of the electromagnetic potential, equation (16.77b). Although the gravomagnetic field $\slashed{\Gamma}_{\bar\alpha}$ is 
redundant, it must still be calculated because the equations of motion (16.323) depend on it. 

To isolate the gravomagnetic field from the other components of the spatial Lorentz connection $\Gamma_{\bar\alpha}$, let $\boldsymbol{\gamma}^0$ 
denote the future-pointing tetrad vector normal to spatial hypersurfaces. The ADM formalism, Chapter 17, 
makes the gauge choice of imposing that the tetrad time vector $\boldsymbol{\gamma}^0$ be normal to spatial hypersurfaces. 
Choosing the vector $\boldsymbol{\gamma}^0$ to be normal to spatial hypersurfaces is equivalent to imposing the ADM conditions (17.7) 
on the vierbein and its inverse, 


$e^0{}_\alpha = e_a{}^t = 0$ . (16.326) 


However, in the present context the vector $\boldsymbol{\gamma}^0$ should be interpreted as the future-pointing normal to spatial 
hypersurfaces, regardless of whether it happens also to be the tetrad time vector. The normal to spatial 
hypersurfaces is related by a Lorentz boost to any arbitrary tetrad time vector. In what follows, the vector 
$\boldsymbol{\gamma}^0$ will be referred to as the tetrad time vector, and 0 as the tetrad time index, on the grounds that $\boldsymbol{\gamma}^0$ is 
timelike while the three vectors $\boldsymbol{\gamma}^a$, $a = 1,2,3$, orthogonal to it are spacelike, regardless of whether $\boldsymbol{\gamma}^0$ is or 
is not the chosen tetrad time axis. The point of requiring $\boldsymbol{\gamma}^0$ to be normal to spatial hypersurfaces is that 
spatial tetrad and coordinate indices can then be transformed freely between each other using the spatial 
vierbein $e^a{}_\alpha$ and its inverse $e_a{}^\alpha$, 


$e^a{}_\alpha\, a_a = e^k{}_\alpha\, a_k = a_\alpha$ , $\quad e_a{}^\alpha\, a_\alpha = e_a{}^\kappa\, a_\kappa = a_a$ . (16.327) 


The extension of the sum over 3 spatial indices $a$ (or $\alpha$) to 4 spacetime indices $k$ (or $\kappa$) in equations (16.327) 
is thanks to the conditions (16.326), which hold as long as $\boldsymbol{\gamma}^0$ is normal to the spatial hypersurface, as is 
being required. 
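
The role of the ADM conditions can be seen by writing the vierbein as a matrix. The block form below is a sketch; the names $A$, $b$, and $E$ for the blocks are introduced here purely for illustration. With rows labelled by the tetrad index and columns by the coordinate index, the condition $e^0{}_\alpha = 0$ makes the matrix block lower-triangular, and its inverse is then block lower-triangular as well, which is the condition $e_a{}^t = 0$:

```latex
\begin{equation}
  e^k{}_\kappa =
  \begin{pmatrix} e^0{}_t & e^0{}_\alpha \\ e^a{}_t & e^a{}_\alpha \end{pmatrix}
  =
  \begin{pmatrix} A & 0 \\ b & E \end{pmatrix}
  \quad\Longrightarrow\quad
  e_k{}^\kappa =
  \begin{pmatrix} e_0{}^t & e_a{}^t \\ e_0{}^\alpha & e_a{}^\alpha \end{pmatrix}
  =
  \begin{pmatrix} A^{-1} & 0 \\ -E^{-1} b\, A^{-1} & E^{-1} \end{pmatrix} .
\end{equation}
```

The spatial block $e_a{}^\alpha = E^{-1}$ of the inverse is the inverse of the spatial block $e^a{}_\alpha = E$ alone, so spatial indices can be converted with the 3 × 3 spatial vierbein without reference to the time components.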

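It is convenient to extend the 3+1 form-splitting notation (16.304) to a double 3+1 split in which tetrad 
(Lorentz) indices as well as coordinate indices are split out. Thus a multivector form $a$ splits into 4 
components $a_{\bar 0 \bar t}$, $a_{\bar 0 \bar\alpha}$, $a_{\bar a \bar t}$, and $a_{\bar a \bar\alpha}$ that represent respectively the time-time, time-space, space-time, and 
space-space components of the multivector form, 


$a = a_{\bar 0 \bar t} + a_{\bar 0 \bar\alpha} + a_{\bar a \bar t} + a_{\bar a \bar\alpha} = a_{0A\, t\Lambda}\, \boldsymbol{\gamma}^0 \wedge \boldsymbol{\gamma}^A\, d^p x^{t\Lambda} + a_{0A\, \Lambda}\, \boldsymbol{\gamma}^0 \wedge \boldsymbol{\gamma}^A\, d^p x^{\Lambda} + a_{A\, t\Lambda}\, \boldsymbol{\gamma}^A\, d^p x^{t\Lambda} + a_{A \Lambda}\, \boldsymbol{\gamma}^A\, d^p x^{\Lambda}$ , 
(16.328) 
implicitly summed over distinct antisymmetric sequences of tetrad and coordinate indices $A$ and $\Lambda$. In the 
notation (16.328), the ADM gauge condition (16.326) is 


$e_{\bar 0 \bar\alpha} = 0$ . (16.329) 

The 18-component spatial Lorentz connection $\Gamma_{\bar\alpha}$ splits into 9+9 components, 

$\Gamma_{\bar\alpha} = \Gamma_{\bar 0 \bar\alpha} + \Gamma_{\bar a \bar\alpha} = \left( \Gamma_{0b\alpha}\, \boldsymbol{\gamma}^0 + \Gamma_{ab\alpha}\, \boldsymbol{\gamma}^a \right) \wedge \boldsymbol{\gamma}^b\, dx^\alpha$ . (16.330) 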


The time 0 tetrad components $\Gamma_{0b\alpha}$ are part of what is commonly called the extrinsic curvature, $K_{b\alpha} \equiv 
\Gamma_{b0\alpha} = -\Gamma_{0b\alpha}$, §17.1.4, while the spatial $a$ tetrad components $\Gamma_{ab\alpha}$ are part of what is referred to elsewhere 
in this book as the restricted connection $\hat\Gamma_{ab\alpha} \equiv \Gamma_{ab\alpha}$, §17.1.5. The 9-component extrinsic curvature $\Gamma_{\bar 0 \bar\alpha}$ is 
invertibly related to the 9-component momentum $\pi_{\bar 0 \bar\alpha}$, 


$\pi_{\bar 0 \bar\alpha} = -e_{\bar a \bar\alpha} \wedge \Gamma_{\bar 0 \bar\alpha}$ , (16.331) 


which holds thanks to the ADM condition (16.329). The 9-component all-spatial Lorentz connection $\Gamma_{\bar a \bar\alpha}$ 
resolves into a 3-component spatial trace (16.333), and a 6-component trace-free part, the gravomagnetic 
field $\slashed{\Gamma}_{\bar a \bar\alpha}$, equation (16.334), 


$\Gamma_{\bar a \bar\alpha} = -\tfrac12\, e_{\bar a \bar\alpha} \wedge {\rm Tr}\, \Gamma_{\bar a \bar\alpha} + \slashed{\Gamma}_{\bar a \bar\alpha}$ . (16.332) 


The slashed notation $\slashed{\Gamma}_{\bar a \bar\alpha}$ for the 6-component gravomagnetic field symbolizes that it is trace-free, and also 
that it is the part of the 18-component spatial Lorentz connection $\Gamma_{\bar\alpha}$ not contained in the 12-component 
spatial momentum $\pi_{\bar\alpha}$. The 3-component spatial trace of the all-spatial Lorentz connection is 


Trl ga = eT tay’ =T Ry = —T3, (16.333) 


ba aa? 


where the vector 0-form $\pi^{**\top}_{\bar a \bar\alpha}$ is the transpose of the spatial double dual of the all-spatial momentum $\pi_{\bar a \bar\alpha}$. 

The spatial trace (16.333) is to be distinguished from the spacetime trace (16.256); the latter includes an 
additional contribution from $\Gamma_{\bar 0 \bar\alpha}$. The 3-component all-spatial momentum $\pi_{\bar a \bar\alpha}$ may be called the BSSN 
variable, because the equation of motion for this variable is the key equation that distinguishes the BSSN 
formalism, §16.16.2, from the ADM formalism, §16.16.1. 

The 6-component trace-free part of the all-spatial Lorentz connection defines the gravomagnetic field $\slashed{\Gamma}_{\bar a \bar\alpha}$, 
| = Taa + leza TAN Tla = Taba y’ Ag? dz“ = (Taba = €aal’y,,) y? Ay? dz“ : (16.334) 


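The 6 identities that define the gravomagnetic field $\slashed{\Gamma}_{\bar a \bar\alpha}$ are part of the 12-component expression (16.321a) 
for the spatial torsion $S_{\bar\alpha}$ in terms of the spatial spin angular-momentum $\Sigma_{\bar\alpha}$. The 12-component spatial 
torsion $S_{\bar\alpha}$ splits into 3+9 components (the minus sign conforms to Cartan’s convention, equation (16.209)), 


$S_{\bar\alpha} = S_{\bar 0 \bar\alpha} + S_{\bar a \bar\alpha} = -\left( S_{0\alpha\beta}\, \boldsymbol{\gamma}^0 + S_{a\alpha\beta}\, \boldsymbol{\gamma}^a \right) d^2 x^{\alpha\beta}$ . (16.335) 

The 3-component time 0 tetrad part $S_{\bar 0 \bar\alpha}$ is invertibly related to the 3-component $(e \wedge S)_{\bar 0 \bar\alpha}$, 

$(e \wedge S)_{\bar 0 \bar\alpha} = e_{\bar a \bar\alpha} \wedge S_{\bar 0 \bar\alpha}$ , (16.336) 


so the equation for $S_{\bar 0 \bar\alpha}$ is part of the Gaussian constraints (16.322). The 9-component all-spatial torsion $S_{\bar a \bar\alpha}$ 
resolves into a 3-component trace (16.338) and a 6-component trace-free part $\slashed{S}_{\bar a \bar\alpha}$ (the slashed notation 
symbolizing that it is trace-free), 


$S_{\bar a \bar\alpha} = \tfrac12\, e_{\bar a \bar\alpha} \wedge {\rm Tr}\, S_{\bar a \bar\alpha} + \slashed{S}_{\bar a \bar\alpha}$ . (16.337) 

The 3-component spatial trace of the all-spatial torsion is, compare equation (16.131a) (the minus signs are 
Cartan, again), 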


Tr Saa = -e Saag dz“ = — S$ d° = -3a , (16.338) 


where the scalar 1-form $s^{**\top}_{\bar a \bar\alpha}$ is the transpose of the spatial double dual of the spatial bivector 3-form $s_{\bar a \bar\alpha}$ 
defined by 

$s_{\bar a \bar\alpha} \equiv (e \wedge S)_{\bar a \bar\alpha}$ . (16.339) 

The equation for $s_{\bar a \bar\alpha}$ is part of the Gaussian constraints (16.322). The components of the double dual $s^{**}$ and 
its transpose $s^{**\top}$ are 


Saa = Syt, Ela =S, dr” . (16.340) 
The 6-component trace-free part $\slashed{S}_{\bar a \bar\alpha}$ of the all-spatial torsion is, equation (16.337), 
Baa = Saa — leza ^ Boo — = faang y| PrP = —(Saap + EaaS 3y) yl aP (16.341) 
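The 6 identities in equations (16.321a) are, finally, 

6 identities: $\slashed{S}_{\bar a \bar\alpha} = 8\pi\, \slashed{\Sigma}_{\bar a \bar\alpha}$ . (16.342) 

More explicitly, the 6 identities (16.342) are 


6 identities: $\left( \slashed{d} e + \tfrac12 [\slashed{\Gamma}, e] \right)_{\bar a \bar\alpha} = 8\pi\, \slashed{\Sigma}_{\bar a \bar\alpha}$ , (16.343) 


where $\slashed{d} e_{\bar a \bar\alpha}$ is the trace-free part of the all-spatial exterior derivative $d e_{\bar a \bar\alpha}$ of the line interval. 
Equation (16.342) defines the gravomagnetic field $\slashed{\Gamma}_{\bar a \bar\alpha}$ in terms of the spin angular-momentum $\slashed{\Sigma}_{\bar a \bar\alpha}$ and spatial 
derivatives of the line interval. Note that $[\slashed{\Gamma}, e]_{\bar a \bar\alpha}$ is invertibly related to $\slashed{\Gamma}_{\bar a \bar\alpha}$, 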


Y, eļaa = ae eaa = Vava(e’s x" = e'g 7’) PrP = —2F aag y* degel ; (16.344) 


Nie 


16.15.5 Alternative conventional Hamiltonian 


The conventional Hamiltonian is not the same as the super-Hamiltonian. The conventional Hamiltonian was 
discussed for the standard gravitational Lagrangian (16.234) in §16.15.2. The present section considers the 
conventional Hamiltonian for the alternative gravitational Lagrangian (16.257). 

Splitting the alternative Lagrangian $L'_g$, equation (16.257), into time and space components, and 
rearranging along lines similar to those leading to the gravitational action (16.317), brings the alternative gravitational 
action $S'_g$ to 


S’ : 


te I 
-2f mahert gs | TaAdiea + mA Sa — Ia ner. (16.345) 
E 8r Ji, 8r 


With the surface term discarded, the gravitational action (16.345) is in conventional Hamiltonian form with 
coordinates $e_{\bar\alpha}$ and momenta $\pi_{\bar\alpha}/(8\pi)$. The alternative conventional (not super-) Hamiltonian $H'_g$ is 


T 


Part of deriving equation (16.346) involves proving that 
(ea AT;) A(e ‘Dg = (eAT)a Alea g Tz) . (16.347) 


The alternative conventional Hamiltonian (16.346) is a sum of constraint and identities variables $S_{\bar\alpha}$ and $\Pi_{\bar\alpha}$, 
equations (16.321), wedged with time components $e_{\bar t}$ and $\pi_{\bar t}$ of the coordinates and momenta. Whereas the 
standard conventional Hamiltonian (16.318) depended only on constraint and gauge variables, the alternative 
conventional Hamiltonian (16.346) depends in addition on the gravomagnetic field $\slashed{\Gamma}_{\bar a \bar\alpha}$. 

In contrast to the conventional Hamiltonian (16.318), the alternative conventional Hamiltonian (16.346) 
accomplishes the goal of a balanced number, 12 each, of coordinates and momenta. 


16.15.6 WEBB formalism 


The system of 24 Hamiltonian equations of motion (16.323) is a set of coupled first-order partial differential 
equations. The system is integrable, but integrability does not guarantee that its numerical integration is 
stable. A set of coupled partial differential equations is numerically stable if it is strongly hyperbolic, as 
described in §17.7.1. 
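
As a toy illustration of the distinction (not the analysis of §17.7.1, and not applied to the gravitational equations themselves), consider a constant-coefficient system $\partial u/\partial t + A\, \partial u/\partial x = 0$ in one spatial dimension. Under the standard definition, assumed here, such a system is strongly hyperbolic if $A$ has real eigenvalues and a complete set of eigenvectors, weakly hyperbolic if the eigenvalues are real but $A$ is not diagonalizable, and not hyperbolic otherwise. The sketch below classifies real 2 × 2 matrices; the function name `classify_2x2` is hypothetical.

```python
def classify_2x2(a11, a12, a21, a22, tol=1e-12):
    """Classify u_t + A u_x = 0 for a real 2x2 matrix A.

    Eigenvalues solve lambda^2 - tr*lambda + det = 0, so they are real
    exactly when the discriminant tr^2 - 4*det is non-negative.
    """
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = tr * tr - 4.0 * det
    if disc < -tol:
        return "not hyperbolic"        # complex eigenvalues
    if disc > tol:
        return "strongly hyperbolic"   # distinct real eigenvalues
    # Repeated eigenvalue lam: A is diagonalizable only if A = lam * I.
    lam = tr / 2.0
    is_multiple_of_identity = (
        abs(a11 - lam) <= tol and abs(a22 - lam) <= tol
        and abs(a12) <= tol and abs(a21) <= tol
    )
    return "strongly hyperbolic" if is_multiple_of_identity else "weakly hyperbolic"

# Wave equation written as a first-order system: eigenvalues +1 and -1.
print(classify_2x2(0, 1, 1, 0))    # strongly hyperbolic
# Jordan block: repeated real eigenvalue, only one eigenvector.
print(classify_2x2(1, 1, 0, 1))    # weakly hyperbolic
# Rotation generator: eigenvalues +i and -i.
print(classify_2x2(0, 1, -1, 0))   # not hyperbolic
```

The weakly hyperbolic case is the troublesome one in practice: the eigenvalues look harmless, but the missing eigenvector allows perturbations to grow with a rate that increases with wavenumber.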

The thing that complicates the analysis of the hyperbolicity of the equations of motion (16.323) is that 
they involve not only the coordinates and momenta $e_{\bar\alpha}$ and $\pi_{\bar\alpha}$ and their first derivatives, but also the 
gravomagnetic field $\slashed{\Gamma}_{\bar a \bar\alpha}$, which itself depends on spatial derivatives $d_{\bar\alpha}$ of the coordinates $e_{\bar\alpha}$, equation (16.343). 
The term $d_{\bar\alpha} \pi_{\bar t}$ in the Einstein equations (16.323b) then includes some second-order spatial derivatives of 
$e_{\bar\alpha}$, while the terms $\tfrac12 [\slashed{\Gamma}_{\bar\alpha}, \pi_{\bar t}]$ and $\tfrac14\, e_{\bar t} \wedge [\slashed{\Gamma}_{\bar\alpha}, \slashed{\Gamma}_{\bar\alpha}]$ include terms quadratic in spatial derivatives of $e_{\bar\alpha}$. 

The difficulty can be overcome by promoting the gravomagnetic field $\slashed{\Gamma}_{\bar a \bar\alpha}$ to a set of 6 independent 
variables governed by their own equation of motion. The operation of promoting derivatives of variables to 
independent variables and enlarging the system of differential equations is called prolongation. The system 
obtained by prolonging the gravomagnetic field is the WEBB formalism (Buchman and Bardeen, 2005), a 
system of tetrad-based equations proposed by Buchman and Bardeen (2003) based on the work of Estabrook, 
Robinson, and Wahlquist (1997). Buchman and Bardeen (2003) prove that the WEBB system is strongly 
hyperbolic for at least some prescriptions for the gauge variables $e_{\bar t}$ and $\Gamma_{\bar t}$. 
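
A standard illustration of prolongation, unrelated to gravity and offered here only as a sketch, is the scalar wave equation in one dimension. Promoting the derivatives of $u$ to independent variables $p$ and $q$ (names chosen here for illustration) turns one second-order equation into a first-order system plus a constraint $C = 0$:

```latex
\begin{align}
  \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}
  \quad &\longrightarrow \quad
  \frac{\partial u}{\partial t} = p , \quad
  \frac{\partial p}{\partial t} = \frac{\partial q}{\partial x} , \quad
  \frac{\partial q}{\partial t} = \frac{\partial p}{\partial x} , \\
  C \equiv q - \frac{\partial u}{\partial x}
  \quad &\Longrightarrow \quad
  \frac{\partial C}{\partial t}
  = \frac{\partial p}{\partial x} - \frac{\partial p}{\partial x} = 0 .
\end{align}
```

Before prolongation the relation $q = \partial u/\partial x$ is an identity that must be evaluated at every step; after prolongation it is a constraint that need only be imposed on the initial data, because the evolution equations preserve it.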


The 6 equations of motion governing the prolonged gravomagnetic field $\slashed{\Gamma}_{\bar a \bar\alpha}$ are 

6 equations of motion: $(d \slashed{S})_{\bar a \bar t} = 8\pi\, (d \slashed{\Sigma})_{\bar a \bar t}$ . (16.348) 


Since the second exterior derivative vanishes identically, $d^2 = 0$, and the trace-free all-spatial torsion is given 
by the left hand side of equation (16.343), the equations of motion (16.348) reduce to 

6 equations of motion: $\left( d\, \tfrac12 [e, \slashed{\Gamma}] \right)_{\bar a \bar t} = 8\pi\, (d \slashed{\Sigma})_{\bar a \bar t}$ . (16.349) 


The original 6 identities (16.342) become constraints, because although they must be arranged to be satisfied 
on the initial spatial hypersurface, they are guaranteed thereafter by the Bianchi identity (16.403a), 


6 gravomagnetic constraints: $\slashed{S}_{\bar a \bar\alpha} = 8\pi\, \slashed{\Sigma}_{\bar a \bar\alpha}$ . (16.350) 


In all, after the gravomagnetic field is prolonged, the original 40 Hamiltonian equations (16.320) and (16.321) 
become 46 equations, consisting of 30 equations of motion, 16 constraints, and zero identities. 


16.16 ADM gauge condition 


Section 16.15.1 rejected the possibility of treating the 18-component spatial connection $\Gamma_{\bar\alpha}$ as the gravitational 
coordinates, and the 18-component spatial area element $e^2_{\bar\alpha}$ as their conjugate momenta, on the grounds that 
the area element contains excess degrees of freedom compared to the 12-component spatial line interval $e_{\bar\alpha}$. 
However, the idea of working with $\Gamma_{\bar\alpha}$ and $e^2_{\bar\alpha}$, as opposed to $e_{\bar\alpha}$ and $\pi_{\bar\alpha}$, is attractive, firstly because in gauge 
theories such as electromagnetism the coordinates are the connections $A$, §16.6, and the (Lorentz) connections of 
gravity are $\Gamma$, and secondly because black hole thermodynamics points to area as the thing that is somehow 
quantized in general relativity. 

One way to reduce the excess degrees of freedom in the area element is to impose gauge conditions on the 
spatial line interval $e_{\bar\alpha}$. A natural strategy is to impose the 3-component ADM gauge condition $e_{\bar 0 \bar\alpha} = 0$, 
which was invoked earlier, equation (16.329), to separate out the 6 identities of the Hamiltonian system of 
equations, §16.15.4. The gauge choice (16.329) is the starting point of the ADM formalism, Chapter 17, and 
is carried over into the BSSN formalism, §17.8. The gauge choice (16.329) is also a basic ingredient of Loop 
Quantum Gravity, §??. 

The ADM gauge condition (16.329) reduces the number of degrees of freedom of the spatial line interval 
$e_{\bar\alpha}$ from 12 to 3 × 3 = 9, and of the spatial area element $e^2_{\bar\alpha}$ from 18 to the same number, 3 × 3 = 9. The 
9 components of the spatial line interval and spatial area element subject to the ADM gauge condition are 
invertibly related to each other. The spatial area element $e^2_{\bar a \bar\alpha}$ is the 9-component bivector 2-form 


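$e^2_{\bar a \bar\alpha} \equiv \tfrac12\, (e \wedge e)_{\bar a \bar\alpha} = 2\, e_{a\alpha}\, e_{b\beta}\, \boldsymbol{\gamma}^a \wedge \boldsymbol{\gamma}^b\, d^2 x^{\alpha\beta}$ . (16.351) 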
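
The invertibility can be checked numerically. If the 9 components $e_{a\alpha}$ are arranged as a 3 × 3 matrix, the 9 components of $\tfrac12\, e \wedge e$ are, up to index ordering and sign conventions, the 2 × 2 minors of that matrix, that is, its cofactor matrix. The pure-Python sketch below (helper names are illustrative) builds the cofactor matrix and recovers the original matrix from it. It assumes a positive determinant to fix the overall sign, since $e$ and $-e$ give the same area element.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cofactor3(m):
    """Cofactor matrix: entry (i, j) is the signed 2x2 minor of m."""
    c = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [k for k in range(3) if k != i]
            s = [k for k in range(3) if k != j]
            minor = m[r[0]][s[0]] * m[r[1]][s[1]] - m[r[0]][s[1]] * m[r[1]][s[0]]
            c[i][j] = (-1) ** (i + j) * minor
    return c

def line_from_area(c):
    """Invert cofactor3, assuming the original determinant was positive.

    For invertible 3x3 m: cof(cof(m)) = det(m) * m and det(cof(m)) = det(m)^2.
    """
    det_m = det3(c) ** 0.5
    cc = cofactor3(c)
    return [[cc[i][j] / det_m for j in range(3)] for i in range(3)]

e = [[2.0, 0.5, 0.0], [0.1, 1.5, 0.3], [0.0, 0.2, 1.0]]   # determinant 2.83 > 0
area = cofactor3(e)
recovered = line_from_area(area)
max_err = max(abs(recovered[i][j] - e[i][j]) for i in range(3) for j in range(3))
print(max_err < 1e-9)
```

The recovery fails only when the determinant vanishes, which is the degenerate case in which the spatial vierbein itself is not invertible.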


The momenta conjugate to the spatial area element $e^2_{\bar a \bar\alpha}$ are the 3 × 3 = 9 components of the spatial Lorentz 
connections $\Gamma_{\bar 0 \bar\alpha}$ with one Lorentz index the tetrad time index 0, also called (minus) the extrinsic curvature, 
§17.1.4, 


$\Gamma_{\bar 0 \bar\alpha} \equiv \Gamma_{0a\alpha}\, \boldsymbol{\gamma}^0 \wedge \boldsymbol{\gamma}^a\, dx^\alpha$ . (16.352) 


It looks as though the goal of having the coordinates and conjugate momenta be the connection $\Gamma_{\bar\alpha}$ and 
area element $e^2_{\bar\alpha}$ has been achieved, but notice this success has been won by trickery. The ADM gauge 
choice (16.329) is a condition on the line element $e$, not on the area element $e^2$. Imposing the ADM gauge 
condition still requires that the area element be a product $e^2 = \tfrac12\, e \wedge e$ of the line element $e$. 

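Double-splitting the variation $\delta S_g$ of the gravitational action, equation (16.247), into time and space parts 
gives 


$\delta S_g = -\frac{I}{8\pi} \oint_{t_{\rm i}}^{t_{\rm f}} (e^2 \wedge \delta\Gamma)_{\bar 0 \bar t} - \frac{I}{8\pi} \left[ \int (e^2 \wedge \delta\Gamma)_{\bar 0 \bar\alpha} \right]_{t_{\rm i}}^{t_{\rm f}}$ 
$\qquad - \frac{I}{8\pi} \int (e \wedge S)_{\bar 0 \bar t} \wedge \delta\Gamma_{\bar a \bar\alpha} + (e \wedge S)_{\bar a \bar t} \wedge \delta\Gamma_{\bar 0 \bar\alpha} + (e \wedge S)_{\bar 0 \bar\alpha} \wedge \delta\Gamma_{\bar a \bar t} + (e \wedge S)_{\bar a \bar\alpha} \wedge \delta\Gamma_{\bar 0 \bar t}$ 
$\qquad + \delta e_{\bar a \bar\alpha} \wedge (e \wedge R)_{\bar 0 \bar t} + \delta e_{\bar 0 \bar\alpha} \wedge (e \wedge R)_{\bar a \bar t} + \delta e_{\bar a \bar t} \wedge (e \wedge R)_{\bar 0 \bar\alpha} + \delta e_{\bar 0 \bar t} \wedge (e \wedge R)_{\bar a \bar\alpha}$ . (16.353) 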


Variation of the combined gravitational and matter actions with respect to the variations $\delta\Gamma_{\bar 0 \bar\alpha}$ and $\delta e_{\bar a \bar\alpha}$ 
yields 9 + 9 = 18 equations of motion for the area element $e^2_{\bar a \bar\alpha}$ and their conjugate momenta $\Gamma_{\bar 0 \bar\alpha}$, 


9 equations of motion: $(e \wedge S)_{\bar a \bar t} = 8\pi\, \tilde\Sigma_{\bar a \bar t}$ , (16.354a) 
9 equations of motion: $(e \wedge R)_{\bar 0 \bar t} = 8\pi\, \tilde T_{\bar 0 \bar t}$ , (16.354b) 

while variation with respect to $\delta e_{\bar 0 \bar\alpha}$ yields the 3 equations of motion 

3 equations of motion: $(e \wedge R)_{\bar a \bar t} = 8\pi\, \tilde T_{\bar a \bar t}$ . (16.355) 


Note that the ADM gauge condition $e_{\bar 0 \bar\alpha} = 0$ is a gauge condition, fixed after equations of motion are 
derived, so it is correct to vary $e_{\bar 0 \bar\alpha}$ in the action, leading to equation (16.355). Explicitly, the equations of 
motion (16.354) and (16.355) are, similarly to equations (16.311), 


9 eqs. of motion: — d:e2, + 5 (Var, e2 | — eaa A (daear + 5 [Vac ear]) + ear ^ Saa = 8n Dar , (16.356a) 


9 eqs. of motion: eaa ^ (dil Sa + [Laz Tos] + dolor + $[Taa, Toz) + eat ^ Roa = SrTor , (16.356b) 


3 eqs. of motion: eaa \(diTaa + doV et Laa, Vi) tear Raa =8rT gz. (16.3560) 


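Equation (16.355) is an equation of motion in the sense that it involves a time derivative $d_{\bar t} \Gamma_{\bar a \bar\alpha}$; but $\Gamma_{\bar a \bar\alpha}$ 
is not one of the momenta $\Gamma_{\bar 0 \bar\alpha}$ conjugate to the area element $e^2_{\bar a \bar\alpha}$, so equation (16.355) has a different 
status from the 9 + 9 equations of motion (16.354). In the ADM formalism, §16.16.1, equation (16.355) 
is discarded as redundant with the 3 momentum constraints (16.357d), on the grounds that the 
energy-momentum tensor is symmetric (for vanishing torsion). The BSSN formalism on the other hand, §16.16.2, 
retains equation (16.355) as a distinct equation of motion. 

The earlier equations (16.311) had the problem that the time derivative in the equation of motion for 
$\Gamma_{\bar\alpha}$ was $e_{\bar\alpha} \wedge d_{\bar t}$ as opposed to just $d_{\bar t}$. Equation (16.356b) seems to have the same difficulty, but here it 
is no longer a problem, because the 9-component trivector 3-form $e_{\bar a \bar\alpha} \wedge d_{\bar t} \Gamma_{\bar 0 \bar\alpha}$ is invertibly related to the 
9-component bivector 2-form $d_{\bar t} \Gamma_{\bar 0 \bar\alpha}$, so equation (16.356b) can be rearranged as an equation for $d_{\bar t} \Gamma_{\bar 0 \bar\alpha}$. 

Variation of the action with respect to $\delta\Gamma_{\bar a \bar\alpha}$, $\delta\Gamma_{\bar a \bar t}$, $\delta\Gamma_{\bar 0 \bar t}$, $\delta e_{\bar a \bar t}$ and $\delta e_{\bar 0 \bar t}$ yields 9 identities and 10 constraints 
involving only spatial derivatives 


9 identities: $(e \wedge S)_{\bar 0 \bar t} = 8\pi\, \tilde\Sigma_{\bar 0 \bar t}$ , (16.357a) 
3 Gaussian constraints: $(e \wedge S)_{\bar 0 \bar\alpha} = 8\pi\, \tilde\Sigma_{\bar 0 \bar\alpha}$ , (16.357b) 
3 Gaussian constraints: $(e \wedge S)_{\bar a \bar\alpha} = 8\pi\, \tilde\Sigma_{\bar a \bar\alpha}$ , (16.357c) 
3 momentum constraints: $(e \wedge R)_{\bar 0 \bar\alpha} = 8\pi\, \tilde T_{\bar 0 \bar\alpha}$ , (16.357d) 
1 Hamiltonian constraint: $(e \wedge R)_{\bar a \bar\alpha} = 8\pi\, \tilde T_{\bar a \bar\alpha}$ . (16.357e) 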


The 9 identities (16.357a) are not equations of motion (they involve no time derivatives), despite having a 
form index t. Explicitly, 


9 identities: 1 [Loz e24] + }[L0a, e24] = 81% . (16.358) 


16.16.1 ADM formalism 


The previous section 16.16 explored the form of the Hamiltonian system of equations when the ADM gauge 
condition (16.329) is imposed, and the coordinates and momenta are taken to be the extrinsic curvature $\Gamma_{\bar 0 \bar\alpha}$ 
and the spatial area element $e^2_{\bar a \bar\alpha}$. However, the traditional ADM formalism goes further than just imposing 
a gauge condition. The ADM formalism is pursued at length in Chapter 17 in traditional (coordinate and 
tetrad) index notation. Here it is useful to offer a few comments on the ADM formalism in the present 
context of multivector forms notation. 

The ADM formalism imposes the ADM gauge condition (16.329), $e_{\bar 0 \bar\alpha} = 0$, from the outset, reducing 
the degrees of freedom of the spatial line interval $e_{\bar\alpha}$ from 12 to 9. The ADM formalism further assumes 
from the outset that torsion vanishes. One of the consequences of vanishing torsion is that the 
energy-momentum tensor is symmetric, equation (16.282). This motivates the ADM strategy of simply discarding the 
6 antisymmetric components of the Einstein equations, the 6 antisymmetric components of the 12 equations 
of motion (16.323b). Discarding the antisymmetric Einstein equations seems innocent enough, until one 
realises that the antisymmetric part of the energy-momentum tensor is a source in the law of conservation of 
spin angular-momentum, equation (16.281), which law is responsible for the 6 Gaussian constraints (16.322). 
Thus discarding the 6 antisymmetric Einstein equations is equivalent to using up the 6 Gaussian constraints. 
As a corollary, the 6 Gaussian constraints can no longer be treated as constraints; rather, they must be 
treated as identities. 

Finally, the usual ADM strategy (though not a necessary one; see Chapter 17) is to work entirely with 
coordinate-frame quantities. An advantage of this approach is that all quantities are spatially Lorentz 
gauge-invariant (the ADM gauge choice (16.329) removes the gauge freedom of Lorentz boosts). In particular, the 
9 components $e_{a\alpha}$ of the spatial line interval reduce to the 6 components of the Lorentz gauge-invariant 
spatial metric $g_{\alpha\beta}$, and the 24 Lorentz connections are replaced by the 6 components of the symmetric (for 
vanishing torsion) extrinsic curvatures $\Gamma_{\alpha 0 \beta}$ together with the 3 × 6 = 18 torsion-free coordinate-frame spatial 
connections (Christoffel symbols) $\Gamma_{\alpha\beta\gamma}$. 

In all, in the ADM formalism there are 6 + 6 = 12 equations of motion for $g_{\alpha\beta}$ and $\Gamma_{\alpha 0 \beta}$, 18 identities for 
$\Gamma_{\alpha\beta\gamma}$, and 4 Hamiltonian constraints, a total of 34 equations altogether. The 6 equations lost compared to 
the 40 of the Hamiltonian system (16.320) and (16.321) are the 6 antisymmetric Einstein equations. 


16.16.2 BSSN formalism 


The BSSN formalism, discussed further in Chapter 17, §17.8, has gained popularity because it is strongly 
hyperbolic, and therefore has better numerical stability when applied to problems such as the merger of two 
black holes. 

The BSSN formalism follows ADM for the most part, in particular imposing the ADM gauge choice (16.329). 
However, instead of discarding all 6 antisymmetric components of the 12 Einstein equations (16.323b), BSSN 
retains the 3 antisymmetric components $\Pi_{\bar a \bar t}$, which govern the evolution of the 3-component all-spatial 
momentum $\pi_{\bar a \bar\alpha}$, equation (16.333). BSSN thereby keeps 3 Gaussian constraints, the ones governing the evolution 
of the 3-component all-spatial contracted torsion $(e \wedge S)_{\bar a \bar\alpha}$, equation (16.339). 

In all, the BSSN equations constitute 6 + 6 + 3 = 15 equations of motion for the spatial metric $g_{\alpha\beta}$, 
the (symmetric, for vanishing torsion) extrinsic curvature $\Gamma_{\alpha 0 \beta}$, and the BSSN variable $\pi_{\bar a \bar\alpha}$ (eq. 16.333), 15 
identities for the torsion-free coordinate-frame connections $\Gamma_{\alpha\beta\gamma}$, 4 Hamiltonian constraints, and 3 Gaussian 
constraints, a total of 37 equations altogether. The 3 equations lost compared to the 40 of the Hamiltonian 
system (16.320) and (16.321) are the 3 antisymmetric spatial Einstein equations. 
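
The equation counts quoted for the various formalisms can be tallied mechanically. The sketch below simply records the numbers stated in the text, grouped by status, and checks the totals and the differences from the 40-equation Hamiltonian system. The dictionary layout and the key names are illustrative choices.

```python
# Equation counts as stated in the text, grouped by their status.
systems = {
    "Hamiltonian": {
        "equations of motion": 12 + 12,
        "identities": 6,
        "constraints": 6 + 4,        # Gaussian + Hamiltonian
    },
    "WEBB": {
        "equations of motion": 12 + 12 + 6,   # gravomagnetic field prolonged
        "identities": 0,
        "constraints": 6 + 4 + 6,    # Gaussian + Hamiltonian + gravomagnetic
    },
    "ADM": {
        "equations of motion": 6 + 6,
        "identities": 18,
        "constraints": 4,
    },
    "BSSN": {
        "equations of motion": 6 + 6 + 3,
        "identities": 15,
        "constraints": 4 + 3,        # Hamiltonian + Gaussian
    },
}

totals = {name: sum(parts.values()) for name, parts in systems.items()}
for name, total in totals.items():
    print(f"{name}: {total}")

reference = totals["Hamiltonian"]
print(reference - totals["ADM"], reference - totals["BSSN"])   # 6 3
```

The differences of 6 and 3 are the antisymmetric Einstein equations that ADM and BSSN respectively discard.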


Exercise 16.12. Gravitational equations in arbitrary spacetime dimensions. In multivector forms
language in N spacetime dimensions:
1. What is the Hilbert gravitational Lagrangian? What is the gravitational super-Hamiltonian?
2. What is the variation of the gravitational Lagrangian? What are the gravitational equations of motion?
3. What is the space+time (N−1)+1 split of the gravitational equations of motion?
4. What is the alternative Hilbert gravitational Lagrangian?
5. What is the variation of the alternative gravitational Lagrangian?
6. What is the space+time (N−1)+1 split of the alternative gravitational equations of motion?
7. What is the space+time (N−1)+1 split of the gravitational equations of motion when the ADM gauge
condition (16.329) is imposed?
Solution. 




1. The Hilbert gravitational Lagrangian in N spacetime dimensions is the scalar N-form, generalizing
equation (16.234),

$$ L_g \equiv -\frac{I_N}{\kappa_N}\, e^{N-2} \wedge R , \tag{16.359} $$

where $I_N$ is the N-dimensional spacetime pseudoscalar, and $\kappa_N$ is Newton’s gravitational constant,
suitably normalized, in N spacetime dimensions. The Lagrangian (16.359) is in super-Hamiltonian form

$$ L_g = -\frac{I_N}{\kappa_N}\, e^{N-2} \wedge d\Gamma - H_g , \tag{16.360} $$

with super-Hamiltonian, generalizing equation (16.236),

$$ H_g \equiv \frac{I_N}{4\kappa_N}\, e^{N-2} \wedge [\Gamma, \Gamma] . \tag{16.361} $$
2. The variation of the gravitational action in N spacetime dimensions is, generalizing equation (16.247), 
I I 
a ee f eo Ar — Æ fer AS)A\6P +6eA(eX 3A R) . (16.362) 
KN KN 
The variation of the matter action is defined by equations (16.248) in any spacetime dimension. With
matter, the equations of motion generalizing equations (16.250) are

$$ \tfrac{1}{2} N^2 (N-1)\ \text{equations of motion:} \quad e^{N-3} \wedge S = \kappa_N \Sigma , \tag{16.363a} $$

$$ N^2\ \text{equations of motion:} \quad e^{N-3} \wedge R = \kappa_N T . \tag{16.363b} $$
The pseudobivector pseudo 1-form set of equations (16.363a) governing the torsion have the same number
of components as the vector 2-form torsion S defined by equation (16.209), so completely determine
the torsion in terms of the spin angular momentum $\Sigma$. Thus in N spacetime dimensions, as in 4
spacetime dimensions, torsion vanishes in empty space, and does not propagate. By contrast, the pseudovector
pseudo 1-form set of equations (16.363b) constitute $N^2$ equations governing the $(\tfrac{1}{2} N(N-1))^2$
components of the bivector 2-form Riemann curvature R defined by equation (16.205); the equations of
motion (16.363b) determine only the contracted components of the Riemann tensor. The remaining
$\tfrac{1}{4}(N+1)N^2(N-3)$ components of the Riemann tensor are governed by Bianchi identities, §16.17. In
N > 3 spacetime dimensions, as in 4 spacetime dimensions, Riemann curvature does not vanish in empty
space, but rather propagates as a wave.
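
The component counts quoted in this paragraph can be checked by direct arithmetic. The following is a minimal sketch in Python; the function name `riemann_component_counts` and the dictionary keys are illustrative choices, not notation from the text.

```python
def riemann_component_counts(n: int) -> dict:
    """Count components in n spacetime dimensions, following the paragraph above."""
    bivectors = n * (n - 1) // 2  # number of independent antisymmetric index pairs
    return {
        "torsion": n * bivectors,       # vector 2-form S: N^2 (N-1) / 2
        "riemann": bivectors ** 2,      # bivector 2-form R: (N(N-1)/2)^2
        "contracted": n ** 2,           # components fixed by equations (16.363b)
        "remaining": (n + 1) * n ** 2 * (n - 3) // 4,  # left to the Bianchi identities
    }


for n in (3, 4, 5):
    counts = riemann_component_counts(n)
    # The remaining components are the total minus the contracted ones.
    assert counts["riemann"] - counts["contracted"] == counts["remaining"]
    print(n, counts)
```

For N = 3 the remainder is zero, consistent with the statement that curvature propagates only for N > 3.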

3. Split into time and space parts, the spacetime equations of motion (16.363) split into equations of motion
that involve time derivatives $d_t$ of the $N(N-1)$ spatial momenta e and $\tfrac{1}{2} N(N-1)^2$ spatial coordinates
$\Gamma$, generalizing equations (16.309),

$$ \tfrac{1}{2} N(N-1)^2\ \text{equations of motion:} \quad (e^{N-3} \wedge S)_t = \kappa_N \Sigma_t , \tag{16.364a} $$
$$ N(N-1)\ \text{equations of motion:} \quad (e^{N-3} \wedge R)_t = \kappa_N T_t , \tag{16.364b} $$

and purely spatial constraint equations involving no time derivatives $d_t$, generalizing equations (16.310),

$$ \tfrac{1}{2} N(N-1)\ \text{Gaussian constraints:} \quad (e^{N-3} \wedge S)_\alpha = \kappa_N \Sigma_\alpha , \tag{16.365a} $$
$$ N\ \text{Hamiltonian constraints:} \quad (e^{N-3} \wedge R)_\alpha = \kappa_N T_\alpha . \tag{16.365b} $$


4. The alternative Hilbert gravitational Lagrangian in N spacetime dimensions is, generalizing
equation (16.257),


I 
i= (=)"1 mde — Hg , (16.366) 


where $\pi$ is the momentum pseudovector (N−2)-form, generalizing equation (16.253),
m = (—)N eN SAT, (16.367) 


and $H_g$ is the same super-Hamiltonian (16.361) as before.
5. The variation of the alternative action (16.366) is, generalizing equation (16.259),


I 
6S, = — mise+(- NX [ona 8 +ITASe, (16.368) 
KN KN 


where the curvature pseudovector (N−1)-form $\Pi$ is, generalizing equations (16.260),
Wee’? AR-—eX *ASAT=dr+3[P, a] +fe% MALT). (16.369) 


Note that the $e^{N-4}$ term in the middle expression vanishes for N = 3. The variation of the matter
action is defined by equations (16.261) in any spacetime dimension. With matter, the equations of
motion generalizing equations (16.266) are


+N?(N — 1) equations of motion: S=knd, (16.370a) 
N? equations of motion: I=«yT. (16.370b) 


6. Split into time and space parts, the alternative spacetime equations of motion (16.370) split into
equations of motion that involve time derivatives $d_t$ of the $N(N-1)$ spatial coordinates e and $N(N-1)$
spatial momenta $\pi$, generalizing equations (16.320),


N(N — 1) equations of motion: S$; = «yN z, (16.371a) 
N(N — 1) equations of motion: Ty = Ky T, (16.371b) 


and purely spatial equations involving no time derivatives $d_t$, generalizing equations (16.321),


3N(N — 1)(N — 3) gravomagnetic identities: Sia = KN Yaa (16.372a) 
3N(N — 1) Gaussian constraints: (e%~* A S)a = ny (e7? A Bia. (16.372b) 
N Hamiltonian constraints: Il; = ky Ñ. (16.372c) 


Prolonging the gravomagnetic field Fa replaces the identities (16.372a) by the same number each of 
equations of motion and constraints, generalizing equations (16.348) and (16.350), 
5N(N — 1)(N — 3) equations of motion: (d3[e, F])_, = £n (d¥)az , (16.373a) 


5N(N — 1)(N — 3) gravomagnetic constraints: (eNA S)a = ny (eA Ña . (16.373b) 


7. ADM imposes the N − 1 ADM gauge conditions $e_{0\alpha} = 0$, equation (16.329), reducing the number of
degrees of freedom of the spatial line-element e to $(N-1)^2$, and likewise the number of degrees of freedom
of the spatial area element $e^{N-2}$ to $(N-1)^2$. The momenta conjugate to the spatial area element are
$-\Gamma_{a0\alpha}$, again with $(N-1)^2$ degrees of freedom. There are $2(N-1)^2$ equations of motion for the spatial
area element and their conjugate momenta, generalizing equations (16.354),


(N — 1)? equations of motion: (e~? A $)q; = KN Er, (16.374a 
(N — 1)? equations of motion: (e~? A R)o; = KN To : (16.374b 


There are a further N—1 equations of motion that are discarded in the ADM formalism (incidentally 
demoting the Gaussian constraints (16.376c) from constraints to identities) but retained in the BSSN 
formalism, generalizing equation (16.355), 


N — 1 equations of motion: (e~? A R)at = KN Tar ; (16.375 


The remaining equations, containing no time derivatives $d_t$, comprise $\tfrac{1}{2}(N-1)^2(N-2)$ identities and
$\tfrac{1}{2} N(N+1)$ constraints, generalizing equations (16.357),


3(N — 1)?(N — 2) identities: (eù? A S)or = kn Yor , (16.376a 

3(N —1)(N — 2) Gaussian constraints: (e%~3 A 8)oq = KN Soa ; (16.376b 
N — 1 Gaussian constraints: (e7? A S)aa = KN Soa R (16.376c 

N — 1 momentum constraints: (e7? A R)o; = KN Toa f (16.376d 

1 Hamiltonian constraint: (e73 A R)aa = KN Taa ; (16.376e 
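
As a consistency check on the bookkeeping in this solution, the tallies of the Hamiltonian system (16.371) and (16.372) and of the ADM split (16.374) to (16.376) can be summed and compared. The sketch below assumes the counts as read from the labels of those equations; the function names are illustrative. Both tallies should equal the $\tfrac{1}{2} N^2(N-1) + N^2$ equations of (16.363), which is 40 for N = 4.

```python
def hamiltonian_system_counts(n: int) -> dict:
    """Tally of equations (16.371) and (16.372) in n spacetime dimensions."""
    return {
        "equations of motion for e and pi": 2 * n * (n - 1),
        "gravomagnetic identities": n * (n - 1) * (n - 3) // 2,
        "Gaussian constraints": n * (n - 1) // 2,
        "Hamiltonian constraints": n,
    }


def adm_counts(n: int) -> dict:
    """Tally of equations (16.374) to (16.376) after imposing the ADM gauge."""
    s = n - 1  # number of spatial dimensions
    return {
        "equations of motion (16.374)": 2 * s ** 2,
        "equations kept by BSSN (16.375)": s,
        "identities (16.376a)": s ** 2 * (n - 2) // 2,
        "Gaussian constraints (16.376b)": s * (n - 2) // 2,
        "Gaussian constraints (16.376c)": s,
        "momentum constraints (16.376d)": s,
        "Hamiltonian constraint (16.376e)": 1,
    }


for n in (3, 4, 5):
    total = n ** 2 * (n - 1) // 2 + n ** 2  # count of equations (16.363)
    assert sum(hamiltonian_system_counts(n).values()) == total
    assert sum(adm_counts(n).values()) == total
    print(n, total)
```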


Exercise 16.13. Volume of a ball and area of a sphere. What is the volume $V_N$ of a unit N-ball, and
the area $S_N$ of a unit N-dimensional sphere? A unit N-ball is the interior of a unit (N−1)-sphere, and an
N-sphere is the boundary of a unit (N+1)-ball.

Solution. The volume $V_N$ of an N-ball is the area $S_{N-1} R^{N-1}$ of an (N−1)-sphere of radius R integrated
over R from 0 to 1,

$$ V_N = S_{N-1} \int_0^1 R^{N-1}\, dR , \tag{16.377} $$

yielding

$$ V_N = \frac{S_{N-1}}{N} . \tag{16.378} $$

The volume of an N-ball is also the volume $V_{N-1} r^{N-1}$ of an (N−1)-ball of radius $r = \sin\theta$ integrated over
height $z = \cos\theta$ from −1 to 1,

$$ V_N = V_{N-1} \int_{-1}^{1} r^{N-1}\, dz = V_{N-1} \int_0^\pi \sin^N\theta \, d\theta . \tag{16.379} $$

The integral $\int \sin^N\theta \, d\theta$ can be expressed in terms of $\Gamma$ functions. Iterated twice, equation (16.379) gives
the recurrence relation

$$ V_N = \frac{2\pi V_{N-2}}{N} . \tag{16.380} $$

Equations (16.378) and (16.380) imply

$$ S_N = 2\pi V_{N-1} . \tag{16.381} $$

Initial values of the recurrence are $V_1 = 2$ and $V_2 = \pi$. General expressions for the volume and area are

$$ V_N = \frac{\pi^{N/2}}{\Gamma(N/2 + 1)} , \quad S_N = \frac{2\pi^{(N+1)/2}}{\Gamma\bigl((N+1)/2\bigr)} . \tag{16.382} $$
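
The recurrence and the closed forms can be compared numerically. The following is a small sketch using only Python's standard `math` module; the function names are illustrative.

```python
import math


def ball_volume(n: int) -> float:
    """Closed-form volume V_N of the unit N-ball, equation (16.382)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def sphere_area(n: int) -> float:
    """Closed-form area S_N of the unit N-sphere (boundary of the (N+1)-ball)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)


def ball_volume_recursive(n: int) -> float:
    """Volume from the recurrence (16.380) with V_1 = 2 and V_2 = pi."""
    if n == 1:
        return 2.0
    if n == 2:
        return math.pi
    return 2 * math.pi * ball_volume_recursive(n - 2) / n


for n in range(1, 8):
    # Recurrence and closed form agree; (16.378) and (16.381) relate V and S.
    assert math.isclose(ball_volume(n), ball_volume_recursive(n))
    assert math.isclose(ball_volume(n), sphere_area(n - 1) / n)
    assert math.isclose(sphere_area(n), 2 * math.pi * ball_volume(n - 1))
    print(n, ball_volume(n), sphere_area(n))
```

The familiar cases are recovered: the unit 3-ball has volume $4\pi/3$ and the unit 2-sphere has area $4\pi$.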
16.17 Bianchi identities in multivector forms notation


The Bianchi identities, equations (16.403) below, are differential identities satisfied by the torsion S and
Riemann R tensors. The Bianchi identities are identities in the sense that if the torsion and Riemann
tensors are expressed in terms of the line interval e and Lorentz connection $\Gamma$ in accordance with Cartan’s
equations (16.212) and (16.208), then the Bianchi identities are satisfied automatically. The contracted
Bianchi identities (16.406) enforce conservation laws on the total spin angular-momentum $\Sigma$ and matter
energy-momentum T.

In this section the number N of spacetime dimensions is arbitrary. The caret on various symbols in this 
section, such as Î, equation (16.386), signifies that they are operators; Î should not be confused with the 
restricted Lorentz connection În; considered in the next Chapter, equation (17.27). 


16.17.1 Covariant exterior derivative of a multivector form 


The exterior derivatives d of the multivector forms $\Gamma$ and e in equations (16.206) and (16.210) were applied
to the coordinate indices, but not to the tetrad indices. A covariant exterior derivative D, distinguished like
the coordinate exterior derivative d by latin font, can be defined that is covariant not only with respect
to coordinate transformations but also with respect to Lorentz transformations. In this context, covariance
means that D commutes with both coordinate and Lorentz transformations. There is a torsion-free covariant
exterior derivative $\mathring{D}$, and a torsion-full covariant exterior derivative D.

If a is a multivector p-form, then its torsion-free covariant exterior derivative $\mathring{D}a$ is a sum of the coordinate
exterior derivative plus a torsion-free Lorentz connection term, equation (15.4),

$$ \mathring{D} a \equiv d a + \hat{\mathring{\Gamma}} a , \tag{16.383} $$

where the torsion-free Lorentz connection operator $\hat{\mathring{\Gamma}}$ acting on the multivector form a is, equation (15.19),

$$ \hat{\mathring{\Gamma}} a \equiv \tfrac{1}{2} [\mathring{\Gamma}, a] , \tag{16.384} $$

with $\mathring{\Gamma} \equiv \mathring{\Gamma}_{kl\kappa}\, \gamma^k \wedge \gamma^l \, dx^\kappa$ the torsion-free bivector 1-form, the torsion-free version of equation (16.201a).
The torsion-full covariant exterior derivative Da is a sum of the coordinate exterior derivative plus a
torsion-full Lorentz connection term plus a torsion term,

$$ D a \equiv d a + \hat{\Gamma} a + \hat{S} a , \tag{16.385} $$

where the Lorentz connection operator $\hat{\Gamma}$ acting on the multivector form a is, equation (15.19),

$$ \hat{\Gamma} a \equiv \tfrac{1}{2} [\Gamma, a] . \tag{16.386} $$

The torsion-full Lorentz connection bivector 1-form $\Gamma$, equation (16.201a), is as usual the sum of the
torsion-free Lorentz connection $\mathring{\Gamma}$ and the contortion K, equations (11.55),

$$ \Gamma = \mathring{\Gamma} + K . \tag{16.387} $$
The coordinate exterior derivative d is a torsion-free covariant curl, so the torsion operator $\hat{S}$ in
equation (16.385) must be included when torsion does not vanish, as for example in equation (2.71). The torsion
operator $\hat{S}$ is essentially the antisymmetric part of the coordinate connection, which is the only part of the
coordinate connection that survives in a (covariant) exterior derivative. The torsion operator $\hat{S}$ acts only
on the coordinate indices of the form a, while the Lorentz connection operator $\hat{\Gamma}$ acts only on the tetrad
indices of the multivector a. If $a \equiv a_{\lambda\Pi}\, d^p x^{\lambda\Pi}$ is a multivector p-form, with implicit summation over distinct
antisymmetric sequences $\lambda\Pi$ of p indices, then the torsion term $\hat{S}a$ is the multivector (p+1)-form defined by


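$$ \hat{S} a \equiv \frac{p(p+1)}{2}\, S^\mu{}_{\kappa\lambda}\, a_{\mu\Pi}\, d^{p+1} x^{\kappa\lambda\Pi} , \tag{16.388} $$

implicitly summed over distinct antisymmetric sequences $\kappa\lambda\Pi$ of p + 1 indices. If a is a 0-form (a coordinate
scalar), then the torsion term vanishes, $\hat{S}a = 0$. In components, the covariant exterior derivative Da,
equation (16.385), of the multivector p-form a is the (p + 1)-form

$$ D a = (p+1)\, D_\kappa a_{\lambda\Pi}\, d^{p+1} x^{\kappa\lambda\Pi} = (p+1) \left( \partial_\kappa a_{\lambda\Pi} + \tfrac{1}{2} [\Gamma_\kappa, a_{\lambda\Pi}] + \tfrac{1}{2} p\, S^\mu{}_{\kappa\lambda}\, a_{\mu\Pi} \right) d^{p+1} x^{\kappa\lambda\Pi} \tag{16.389} $$

with the implicit summation over distinct antisymmetric sequences $\kappa\lambda\Pi$ of p + 1 indices.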

The covariant exterior derivative D (in both torsion-free and torsion-full versions) acting on the product 
of a multivector p-form a and a multivector q-form b satisfies the same Leibniz-like rule as the exterior 
derivative d, equation (15.71), 


D(ab) = (Da)b + (—)?a(Db) . (16.390) 


16.17.2 A third, Lorentz-covariant, exterior derivative 
A third exterior derivative that is Lorentz-covariant but not coordinate-covariant crops up often enough
to warrant a special notation. The Lorentz-covariant derivative $D_\Gamma$, subscripted $\Gamma$ as a reminder that it is
covariant only with respect to Lorentz indices, is

$$ D_\Gamma \equiv d + \hat{\Gamma} , \tag{16.391} $$

which is torsion-free acting on coordinate indices, and torsion-full acting on multivector indices. The
Lorentz-covariant derivative $D_\Gamma$ satisfies the same Leibniz-like rule (16.390) as the other exterior derivatives.

The derivative $D_\Gamma$ is not coordinate-covariant in the sense that it does not commute with the vierbein,
that is, acting on the line interval e, it yields the torsion S, equation (16.393),

$$ D_\Gamma e = S . \tag{16.392} $$

However, the derivative $D_\Gamma$ satisfies other conditions for being a covariant derivative: it yields a (coordinate
and tetrad) tensor when acting on a (coordinate and tetrad) tensor.
16.17.3 Torsion from the covariant exterior derivative
By construction, the covariant exterior derivative D (in either torsion-free or torsion-full versions) commutes
with both coordinate and Lorentz transformations. Thus the covariant exterior derivative of the line-element
e defined by equation (16.201b) vanishes, De = 0. Applied to the line interval e, equation (16.385) recovers
the definition (16.212) of the torsion vector 2-form S, Cartan’s first equation of structure,

$$ 0 = D e = d e + \tfrac{1}{2} [\Gamma, e] - S , \tag{16.393} $$

since $\hat{S}e$ is just minus the vector 2-form torsion S,
Se = SÉ emu Ca" = Smr Y” te" = S . (16.394) 
The torsion-free version of Cartan’s equation (16.393) is $de + \tfrac{1}{2}[\mathring{\Gamma}, e] = 0$. Subtracting this from the
torsion-full Cartan’s equation (16.393) yields the relation between the torsion S and the contortion K,

$$ S = \tfrac{1}{2} [K, e] . \tag{16.395} $$


Equation (16.395) can be inverted to yield K in terms of S. The relation between torsion and contortion 
was given previously in index notation as equations (11.56). 


16.17.4 Riemann curvature from the covariant exterior derivative 


Whereas the square of the coordinate exterior derivative vanishes because of the commutation of coordinate 
partial derivatives, dd = 0, the square of the covariant exterior derivative does not vanish. In components, 
the square DD is 


$$ D D = [D_\kappa, D_\lambda]\, d^2 x^{\kappa\lambda} , \tag{16.396} $$

implicitly summed over distinct antisymmetric pairs of indices $\kappa\lambda$. Acting on any multivector form a, the
square of the covariant exterior derivative gives (compare equation (15.21))

$$ D D a = \hat{R} a + \hat{S} D a . \tag{16.397} $$


If a = a,n dx" is a multivector p-form, then the Riemann operator R acting on a is the (p + 2)-form (the 
D,S term in the following equation was given previously in components by equation (11.69)) 


(p T Hp T 2) (A[Rxa, apl + ip Ria’ avn) et 2g trv (16.398) 
implicitly summed over distinct antisymmetric sequences KA II of p+ 2 indices. The components of the Rie- 
mann tensor are those of the Riemann bivector 2-form R = Rp) d?x**, equation (16.205). Equation (16.398) 
recovers the definition (16.208) of the Riemann curvature R in terms of the Lorentz connection I, Cartan’s 
second equation of structure. In equation (16.397), the scalar 2-form covariant derivative operator SD acting 
on the multivector p-form a = ay d?zx"! is, from equations (16.388) and (16.389), the (p + 2)-form 


se 1)?(p+2 
SDa = PAV OTA SH Disa Cer. (16.399) 


implicitly summed over distinct antisymmetric sequences KAII of p + 2 indices. 


Ra = (D,D, + D,S$)a = 
16.17.5 Bianchi identities
The Jacobi identity for the covariant exterior derivative is

$$ D(DD) - (DD)D = 0 . \tag{16.400} $$

Applied to an arbitrary multivector form a, the Jacobi identity (16.400) implies

$$ 0 = D(DD)a - (DD)Da = D(\hat{R} + \hat{S}D)a - (\hat{R} + \hat{S}D)Da = (D\hat{R} - \hat{S}\hat{R})a + (D\hat{S} - \hat{S}\hat{S} - \hat{R})Da . \tag{16.401} $$

Equation (16.401) holds for arbitrary a, so the coefficients of a and Da must vanish, implying the Bianchi
identities

$$ D S - \hat{S} S + \tfrac{1}{2} [e, R] = 0 , \tag{16.402a} $$

$$ D R - \hat{S} R = 0 , \tag{16.402b} $$

where S is the vector 2-form torsion (16.209) with $\tfrac{1}{2} N^2(N-1)$ components, and R is the bivector 2-form
Riemann curvature (16.205) with $(\tfrac{1}{2} N(N-1))^2$ components. These equations (16.402) were given
previously in component form by equations (11.68) and (11.90). Equation (16.402a) is a vector 3-form, with
$\tfrac{1}{6} N^2(N-1)(N-2)$ components, while equation (16.402b) is a bivector 3-form, with $\tfrac{1}{12} N^2(N-1)^2(N-2)$
components. Equivalently, in terms of the exterior derivative d instead of the covariant exterior derivative
D, the Bianchi identities (16.402) are

$$ d S + \tfrac{1}{2} [\Gamma, S] + e \cdot R = 0 , \tag{16.403a} $$
$$ d R + \tfrac{1}{2} [\Gamma, R] = 0 . \tag{16.403b} $$


16.17.6 Interpretation of the Bianchi identities 


The torsion Bianchi identity (16.403a) looks like a covariant conservation equation for torsion S, except that
there is a source term $e \cdot R$, a vector 3-form whose $\tfrac{1}{6} N^2(N-1)(N-2)$ components are

$$ e \cdot R \equiv \tfrac{1}{2} [e, R] = R_{[\kappa\lambda\mu] n}\, \gamma^n \, d^3 x^{\kappa\lambda\mu} . \tag{16.404} $$

Since torsion S is determined completely by its equation of motion (16.370a) in terms of the spin
angular-momentum $\Sigma$, the torsion Bianchi identity (16.403a) can be interpreted as determining $e \cdot R$ in terms of
the torsion and its derivatives. I thank Fred Hehl for pointing out (2017, private communication) that the
$e \cdot R$ term can be interpreted as the covariant exterior derivative of orbital angular momentum, §19(c) of
Corson (1953), so that the Bianchi identity (16.403a) can be interpreted as enforcing conservation of total
angular momentum, spin plus orbital. If torsion S vanishes, or more generally if it satisfies the covariant
conservation equation $dS + \tfrac{1}{2}[\Gamma, S] = 0$, then $e \cdot R = 0$. The remaining $(\tfrac{1}{2} N(N-1))^2 - \tfrac{1}{6} N^2(N-1)(N-2) =
\tfrac{1}{12}(N+1)N^2(N-1)$ components of the Riemann tensor constitute its torsion-free part $\mathring{R}$, equation (15.49).

The Riemann Bianchi identity (16.403b) looks like a covariant conservation equation for the Riemann 
tensor R. In contrast to the torsion S, the Riemann tensor R is not determined completely by its equation 
of motion in terms of the matter energy-momentum T. Rather, the equation of motion (16.363b) determines 
only the contracted part $e^{N-3} \wedge R$ of the Riemann tensor, the double dual of the vector 1-form Einstein
tensor. The Riemann Bianchi identity (16.403b) has $\tfrac{1}{12} N^2(N-1)^2(N-2)$ components. Of these components,
$\tfrac{1}{24} N^2(N-1)(N-2)(N-3)$ provide an equation for $d(e \cdot R)$, which are differential constraints on the
antisymmetric part $e \cdot R$ of the Riemann tensor, a further N components provide an equation for $d(e^{N-3} \wedge R)$, which
are constraints on the Einstein tensor, and the remaining $\tfrac{1}{24}(N+2)N(N-3)(N^2-N+4)$ components provide
Maxwell-like differential equations on the torsion-free part of the Riemann tensor, as discussed previously
in §3.2. These Maxwell-like equations govern the behaviour of gravitational waves, which are encoded in
the torsion-free part of the Riemann tensor that is not determined by the equations of motion, namely the
$\tfrac{1}{12}(N+2)(N+1)N(N-3)$-component torsion-free Weyl tensor. The torsion-free Weyl tensor is subject to a
$\tfrac{1}{48} N^2(N-1)^2(N-2)(N-3)$-component bivector 4-form conservation law,

$$ D_\Gamma (D_\Gamma R) = (D_\Gamma D_\Gamma) R = \tfrac{1}{2} [R, R] = 0 , \tag{16.405} $$

the last step of which follows from equation (16.198b). Equation (16.405) represents conservation of the Weyl
current, equation (3.13).
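
The component counts in this subsection follow from counting multivector-valued forms: a p-form in N dimensions has $\binom{N}{p}$ components, multiplied by N for a vector or by $\tfrac{1}{2}N(N-1)$ for a bivector. The sketch below tallies them under that assumption; the function name and dictionary keys are illustrative.

```python
from math import comb


def bianchi_counts(n: int) -> dict:
    """Component counts of the Bianchi identities in n spacetime dimensions."""
    bivectors = comb(n, 2)
    riemann_identity = bivectors * comb(n, 3)   # bivector 3-form (16.403b)
    antisymmetric = n * comb(n, 4)              # vector 4-form equation for d(e.R)
    einstein = n                                # equation for d(e^{N-3} ^ R)
    return {
        "torsion identity": n * comb(n, 3),     # vector 3-form (16.403a)
        "Riemann identity": riemann_identity,
        "antisymmetric part": antisymmetric,
        "Einstein part": einstein,
        "Maxwell-like": riemann_identity - antisymmetric - einstein,
        "Weyl tensor": (n + 2) * (n + 1) * n * (n - 3) // 12,
        "torsion-free Riemann": bivectors ** 2 - n * comb(n, 3),
    }


for n in (3, 4, 5):
    c = bianchi_counts(n)
    # Closed forms quoted in the text for the Maxwell-like and torsion-free parts.
    assert c["Maxwell-like"] == (n + 2) * n * (n - 3) * (n * n - n + 4) // 24
    assert c["torsion-free Riemann"] == (n + 1) * n ** 2 * (n - 1) // 12
    print(n, c)
```

In 4 dimensions this gives 16 Maxwell-like equations, a 20-component torsion-free Riemann tensor, and a 10-component Weyl tensor.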


16.17.7 Contracted Bianchi identities 


The equations of motion (16.363) for torsion and curvature are sourced by the spin angular-momentum and 
matter energy-momentum. The Bianchi identities (16.403) on the other hand are independent of matter 
sources. The Bianchi identities impose differential constraints on the equations of motion that must be 
satisfied regardless of the form of the spin angular-momentum and matter energy-momentum. 

The equations of motion (16.363) are equations for $e^{N-3} \wedge S$ and $e^{N-3} \wedge R$. Differential constraints on
these combinations are obtained by contracting the Bianchi identities (16.403) by pre-multiplying by $e^{N-3} \wedge$.
The contracted Bianchi identities for torsion and Riemann curvature constitute respectively a pseudobivector
N-form with $\tfrac{1}{2} N(N-1)$ components, and a pseudovector N-form with N components,


d(eN-3 S) ih LE, e? A S] + (=e x (eA R) = 0 š (16.406a) 
d(e%-$ A R) + 4IL, eA R] + (-) Yenta SAR =O. (16,4066) 


The final term in the contracted torsion Bianchi identity (16.406a) is a pseudobivector N-form whose
components constitute the antisymmetric part $R_{[\mu\nu]}$ of the Ricci tensor,


N-2 


e- (e~ ^ R) =e’ 3 A(e- R) = 5 


[eN =? R] = —enn---CrnRyy YP A...Ay dN eA , (16.407) 


implicitly summed as usual over distinct antisymmetric sequences k...l and «K...Auv of indices. 
Combining the contracted Bianchi identity (16.406b) with the torsion Bianchi identity (16.403a) yields 
the pseudovector N-form identity for the curvature II defined by equation (16.369), 


dI + 3(V, 1] + (-)* e 4 ((e- R)AT — ¿SAT r])=0. (16.408) 




16.17.8 Interpretation of the contracted Bianchi identities 


The contracted torsion Bianchi identity (16.406a) is the $\tfrac{1}{2} N(N-1)$-component conservation law associated
with invariance of the gravitational Lagrangian under Lorentz transformations. The contracted Riemann 
identity (16.406b), or equivalently (16.408), is the N-component conservation law associated with invariance 
of the gravitational Lagrangian under coordinate transformations. 

The contracted torsion Bianchi identity (16.406a) enforces continued satisfaction of the Gaussian con- 
straint (16.310a). The contracted Riemann Bianchi identity (16.406b) enforces continued satisfaction of the 
Hamiltonian constraint (16.310b). 
Conventional Hamiltonian (3+1) approach


In the previous Chapter, gravitational equations of motion were derived from the Hilbert Lagrangian in a fully 
covariant fashion, and the (super-)Hamiltonian form of the Hilbert Lagrangian was emphasized. The present 
Chapter explores a more traditional non-covariant 3+1 approach, in which the spacetime is foliated into 
hypersurfaces of constant time t, and the system of Einstein (and other) equations is evolved by integrating 
from one spacelike hypersurface of constant time to the next. 

The traditional 3+1 formalism is called the Arnowitt-Deser-Misner (ADM) formalism, introduced 
by Arnowitt, Deser & Misner (1959; 1963). The original purpose of ADM was to cast the gravitational 
equations of motion into conventional Hamiltonian form, to facilitate quantization. The goal of quantizing 
general relativity failed, but the ADM formalism revealed fundamental insights into the structure of the 
Einstein equations (see §16.16.1 of the previous Chapter). The ADM formalism provides the backbone for 
modern codes that implement numerical general relativity. 

The ADM formalism reveals that, for vanishing torsion, the 6 physical degrees of freedom of the gravitational
field can be regarded as being carried by the 6 spatial components $g_{\alpha\beta}$ of the coordinate metric. The 6
spatial Einstein equations constitute partial differential equations of motion of second order in time t for the 
6 physical degrees of freedom. The remaining 4 degrees of freedom of the coordinate metric can be treated as 
gauge degrees of freedom, which can be chosen arbitrarily. The 4 non-spatial Einstein equations are partial 
differential equations of first order in time t, and they are not equations of motion, but rather constraint 
equations, which must be arranged to be satisfied in the initial conditions (on the initial hypersurface of 
constant time t), but which are guaranteed thereafter by the contracted Bianchi identities, which enforce 
conservation of energy-momentum. 

The mere fact that the 6 spatial components $g_{\alpha\beta}$ of the coordinate metric can (if torsion vanishes) be
taken to be the 6 gravitational physical degrees of freedom, and that the remaining 4 degrees of freedom of
the coordinate metric can be treated as gauge degrees of freedom, does not mean that these choices must be
made. Gauge choices other than ADM can be made, and are often preferred. In cosmology for example, the
preferred gauge choice is conformal Newtonian (Copernican) gauge, §29.8, in which only 3 of the 6 physical
perturbations are part of the spatial coordinate metric $g_{\alpha\beta}$ (the scalar $\Phi$ and the 2 components of the tensor
$h_{ab}$), while the remaining 3 physical perturbations are part of the time components $g_{tt}$ and $g_{t\alpha}$ of the metric
(the scalar $\Psi$ and the 2 components of the vector $W_a$).
Numerical experiments during the 1990s established that the ADM equations are numerically unstable. The
community of numerical relativists engaged in an intensive effort to understand the cause of the instability, 
and to find a numerically stable formalism. The challenge problem was to compute reliably the evolution of 
the merger of a pair of black holes, and to calculate the general relativistic radiation produced as a result. The 
effort was rewarded in 2005-6 when a number of groups (Pretorius, 2005a; Pretorius, 2006; Baker et al., 2006b; 
Baker et al., 2006a; Campanelli et al., 2006; Campanelli, Lousto, and Zlochower, 2006; Diener et al., 2006; 
Sopuerta, Sperhake, and Laguna, 2006) reported successful evolution of a binary black hole (or black hole plus 
neutron star) merger. The most popular formalism for long-term evolution of spacetimes is the Baumgarte- 
Shapiro-Shibata-Nakamura (BSSN) formalism (Shibata and Nakamura, 1995; Baumgarte and Shapiro, 
1998). 

This Chapter starts with an exposition of the ADM formalism, §17.1. It goes on to apply the ADM 
formalism to Bianchi spacetimes, §17.4, which provide a fine example of the application of the formalism
in a non-trivial case. The gravitational collapse of Bianchi spacetimes reveals that collapse to a singularity 
can show a complicated oscillatory behaviour called Belinskii-Khalatnikov-Lifshitz (BKL) oscillations 
(Belinskii, Khalatnikov, and Lifshitz, 1970; Belinskii and Khalatnikov, 1971; Belinskii, Khalatnikov, and 
Lifshitz, 1972; Belinskii, Khalatnikov, and Lifshitz, 1982; Belinski, 2014), §17.6. The Chapter concludes with 
an exposition of the BSSN formalism, §17.8, and the elegant 4-dimensional version of it proposed by Pretorius 
(2005), §17.9. 

In this Chapter, torsion is assumed to vanish. 


17.1 ADM formalism 


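The ADM formalism splits the spacetime coordinates $x^\mu$ into a time coordinate t and spatial coordinates
$x^\alpha$, $\alpha = 1, 2, 3$,

$$ x^\mu \equiv \{t, x^\alpha\} . \tag{17.1} $$

At each point of spacetime, the spacelike hypersurface of constant time t has a unique future-pointing unit
normal $\boldsymbol{\gamma}_0$, defined to have unit length and to be orthogonal to the spatial tangent axes $\boldsymbol{e}_\alpha$,

$$ \boldsymbol{\gamma}_0 \cdot \boldsymbol{\gamma}_0 = -1 , \quad \boldsymbol{\gamma}_0 \cdot \boldsymbol{e}_\alpha = 0 \quad \alpha = 1, 2, 3 . \tag{17.2} $$

The central idea of the ADM approach is to work in a tetrad frame $\boldsymbol{\gamma}_m$ consisting of this time axis $\boldsymbol{\gamma}_0$,
together with three spatial tetrad axes $\boldsymbol{\gamma}_a$, also called the triad, that are orthogonal to the tetrad time axis
$\boldsymbol{\gamma}_0$, and therefore lie in the 3D spatial hypersurface of constant time,

$$ \boldsymbol{\gamma}_0 \cdot \boldsymbol{\gamma}_a = 0 \quad a = 1, 2, 3 . \tag{17.3} $$

The tetrad metric $\gamma_{mn}$ in the ADM formalism is thus

$$ \gamma_{mn} = \begin{pmatrix} -1 & 0 \\ 0 & \gamma_{ab} \end{pmatrix} , \tag{17.4} $$

and the inverse tetrad metric $\gamma^{mn}$ is correspondingly

$$ \gamma^{mn} = \begin{pmatrix} -1 & 0 \\ 0 & \gamma^{ab} \end{pmatrix} , \tag{17.5} $$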

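whose spatial part $\gamma^{ab}$ is the inverse of $\gamma_{ab}$. Given the conditions (17.2) and (17.3), the vierbein $e^m{}_\mu$ and
inverse vierbein $e_m{}^\mu$ take the form

$$ e^m{}_\mu = \begin{pmatrix} \alpha & 0 \\ -e^a{}_\alpha \beta^\alpha & e^a{}_\alpha \end{pmatrix} , \quad e_m{}^\mu = \begin{pmatrix} 1/\alpha & \beta^\alpha/\alpha \\ 0 & e_a{}^\alpha \end{pmatrix} , \tag{17.6} $$

where $\alpha$ and $\beta^\alpha$ are the lapse and shift (see next paragraph), and $e^a{}_\alpha$ and $e_a{}^\alpha$ represent the spatial vierbein
and inverse vierbein, which are inverse to each other, $e^a{}_\alpha e_b{}^\alpha = \delta^a_b$. As can be read off from equations (17.6),
the following off-diagonal time-space components of the vierbein and its inverse vanish, as a direct
consequence of the ADM gauge choices (17.2),

$$ e^0{}_\alpha = e_a{}^t = 0 . \tag{17.7} $$

The ADM line-element is

$$ ds^2 = -\alpha^2 dt^2 + g_{\alpha\beta} (dx^\alpha - \beta^\alpha dt)(dx^\beta - \beta^\beta dt) , \tag{17.8} $$

where $g_{\alpha\beta}$ is the spatial coordinate metric

$$ g_{\alpha\beta} = \gamma_{ab}\, e^a{}_\alpha e^b{}_\beta . \tag{17.9} $$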

Essentially all the tetrad formalism developed in Chapter 11 carries through, subject only to the condi- 
tions (17.2) and (17.3). As usual in the tetrad formalism, coordinate indices are lowered and raised with the 
coordinate metric, tetrad indices are lowered and raised with the tetrad metric, and coordinate and tetrad 
indices can be transformed to each other with the vierbein and its inverse. 
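
The relation between the vierbein (17.6) and the line-element (17.8) can be checked numerically. The sketch below is illustrative only: it assumes an orthonormal triad (so that the tetrad metric is Minkowski), picks arbitrary sample values for the lapse, shift and spatial vierbein, and uses hypothetical function names. It builds the coordinate metric as $g_{\mu\nu} = \gamma_{mn} e^m{}_\mu e^n{}_\nu$ and compares it with the metric read off from the ADM line-element.

```python
def adm_vierbein(lapse, shift, e_spatial):
    """Vierbein e^m_mu of equation (17.6); rows m = 0..n, columns mu = t, 1..n."""
    n = len(shift)
    rows = [[lapse] + [0.0] * n]
    for a in range(n):
        shift_a = sum(e_spatial[a][i] * shift[i] for i in range(n))  # beta^a
        rows.append([-shift_a] + list(e_spatial[a]))
    return rows


def metric_from_vierbein(vierbein, tetrad_metric):
    """Coordinate metric g_{mu nu} = gamma_{mn} e^m_mu e^n_nu."""
    dim = len(vierbein)
    return [[sum(tetrad_metric[m][k] * vierbein[m][mu] * vierbein[k][nu]
                 for m in range(dim) for k in range(dim))
             for nu in range(dim)] for mu in range(dim)]


def adm_metric(lapse, shift, g_spatial):
    """Coordinate metric read off from the ADM line-element (17.8)."""
    n = len(shift)
    shift_low = [sum(g_spatial[i][j] * shift[j] for j in range(n)) for i in range(n)]
    g = [[0.0] * (n + 1) for _ in range(n + 1)]
    g[0][0] = -lapse ** 2 + sum(shift_low[i] * shift[i] for i in range(n))
    for i in range(n):
        g[0][i + 1] = g[i + 1][0] = -shift_low[i]
        for j in range(n):
            g[i + 1][j + 1] = g_spatial[i][j]
    return g


# Hypothetical sample values; the triad is taken orthonormal, gamma_ab = delta_ab.
lapse, shift = 1.5, [0.1, -0.2, 0.3]
e_spatial = [[1.0, 0.2, 0.0], [0.0, 2.0, 0.5], [0.3, 0.0, 1.0]]
minkowski = [[-1.0 if m == k == 0 else float(m == k) for k in range(4)] for m in range(4)]
# Spatial metric g_{alpha beta} = delta_{ab} e^a_alpha e^b_beta, equation (17.9).
g_spatial = [[sum(e_spatial[a][i] * e_spatial[a][j] for a in range(3))
              for j in range(3)] for i in range(3)]

g_tetrad = metric_from_vierbein(adm_vierbein(lapse, shift, e_spatial), minkowski)
g_adm = adm_metric(lapse, shift, g_spatial)
max_diff = max(abs(g_tetrad[m][k] - g_adm[m][k]) for m in range(4) for k in range(4))
print(max_diff)  # agreement up to rounding error
```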

The vierbein coefficient $\alpha$ is called the lapse, while $\beta^\alpha$ is called the shift. Physically, the lapse $\alpha$ is the
rate at which the proper time $\tau$ of the tetrad rest frame elapses per unit coordinate time t, while the shift $\beta^\alpha$
is the velocity at which the tetrad rest frame moves through the spatial coordinates $x^\alpha$ per unit coordinate
time t,

$$ \alpha = \frac{d\tau}{dt} , \quad \beta^\alpha = \frac{dx^\alpha}{dt} . \tag{17.10} $$

These relations (17.10) follow from the fact that the 4-velocity in the tetrad rest frame is by definition
$u^m \equiv \{1, 0, 0, 0\}$, so the coordinate 4-velocity $u^\mu \equiv e_m{}^\mu u^m$ of the tetrad rest frame is

$$ \frac{dx^\mu}{d\tau} = u^\mu = e_0{}^\mu = \frac{1}{\alpha} \{1, \beta^\alpha\} . \tag{17.11} $$

The proper time derivative $d/d\tau$ in the tetrad rest frame is just equal to the directed derivative $\partial_0$ in the
time direction $\boldsymbol{\gamma}_0$,

$$ \frac{d}{d\tau} = u^\mu \frac{\partial}{\partial x^\mu} = \partial_0 . \tag{17.12} $$
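
As a quick numerical illustration of equations (17.10) and (17.11), the 4-velocity $u^\mu = \{1, \beta^\alpha\}/\alpha$ of the tetrad rest frame should have unit timelike norm with respect to the ADM line-element (17.8). The sketch below uses hypothetical function names and arbitrary sample values for the lapse, shift and spatial metric.

```python
def adm_interval(lapse, shift, g_spatial, dx):
    """ADM line-element (17.8) evaluated on a displacement dx = [dt, dx^1, ...]."""
    dt = dx[0]
    n = len(shift)
    # Spatial displacement measured relative to the shifting tetrad rest frame.
    moved = [dx[i + 1] - shift[i] * dt for i in range(n)]
    spatial = sum(g_spatial[i][j] * moved[i] * moved[j]
                  for i in range(n) for j in range(n))
    return -lapse ** 2 * dt ** 2 + spatial


def rest_frame_four_velocity(lapse, shift):
    """Coordinate 4-velocity u^mu = {1, beta^alpha} / alpha, equation (17.11)."""
    return [1.0 / lapse] + [b / lapse for b in shift]


# Arbitrary sample values (assumptions for illustration only).
lapse, shift = 0.8, [0.3, -0.1, 0.25]
g_spatial = [[2.0, 0.1, 0.0], [0.1, 1.5, 0.2], [0.0, 0.2, 1.0]]

u = rest_frame_four_velocity(lapse, shift)
norm = adm_interval(lapse, shift, g_spatial, u)
print(norm)  # close to -1, independent of the shift and spatial metric
```

The spatial term drops out because the rest frame moves with the shift, leaving $-\alpha^2 (dt/d\tau)^2 = -1$, which is the statement $\alpha = d\tau/dt$.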
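Coordinate and tetrad derivatives $\partial/\partial x^\mu$ and $\partial_m$ are related to each other as usual by the vierbein and
its inverse,

$$ \frac{\partial}{\partial t} = \alpha\, \partial_0 - \beta^a \partial_a , \quad \partial_0 = \frac{1}{\alpha} \left( \frac{\partial}{\partial t} + \beta^\alpha \frac{\partial}{\partial x^\alpha} \right) , \tag{17.13a} $$
$$ \frac{\partial}{\partial x^\alpha} = e^a{}_\alpha\, \partial_a , \quad \partial_a = e_a{}^\alpha \frac{\partial}{\partial x^\alpha} , \tag{17.13b} $$

where $\beta^a \equiv e^a{}_\alpha \beta^\alpha$. By construction, the only coordinate derivative involving the directed time derivative
$\partial_0$ is the coordinate time derivative $\partial/\partial t$, and conversely the only directed derivative involving a coordinate
time derivative $\partial/\partial t$ is the directed time derivative $\partial_0$.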


Concept question 17.1. Does Nature pick out a preferred foliation of time? In the ADM formal- 
ism, spacetime must be foliated into spacelike hypersurfaces of constant time, but the choice of foliation can 
be made arbitrarily. Does Nature pick out any particular foliation? Answer. Yes, apparently. The Cosmic 
Microwave Background defines a preferred frame of reference in cosmology. More precisely, the preferred 
cosmological frame is defined by conformal Newtonian (Copernican) gauge, §29.8, which is that gauge for 
which the retained gravitational perturbations are precisely the physical perturbations. What caused the 
preferred frame to be established is mysterious, but it must have happened during or before early infla- 
tion, when the different parts of what became our Universe were in causal contact. Interestingly, conformal 
Newtonian gauge does not conform to ADM gauge choices: in conformal Newtonian gauge, only 3 of the 6
physical perturbations ($\Phi$ and $h_{ab}$) are part of the spatial metric, while the remaining 3 physical
perturbations ($\Psi$ and $W_a$) are part of the lapse and shift. Conformal Newtonian gauge holds as long as gravitational
perturbations are weak, which is true even in highly non-linear collapsed systems such as galaxies and solar 
systems. Conformal Newtonian gauge breaks down in strongly gravitating systems such as black holes. 


17.1.1 Traditional ADM approach 


The traditional ADM approach sets the spatial tetrad axes $\boldsymbol{\gamma}_a$ equal to the spatial coordinate tangent axes
$\boldsymbol{e}_\alpha$,

$$ \boldsymbol{\gamma}_a \equiv \boldsymbol{e}_\alpha \quad \text{(traditional ADM)} , \tag{17.14} $$

equivalent to choosing the spatial vierbein to be the unit matrix, $e_a{}^\alpha = \delta_a^\alpha$. It is natural however to extend the
ADM approach into a full tetrad approach, allowing the spatial tetrad axes $\boldsymbol{\gamma}_a$ to be chosen more generally,
subject only to the condition (17.3) that they be orthogonal to the tetrad time axis, and therefore lie in the
hypersurface of constant time t. For example, the spatial tetrad $\boldsymbol{\gamma}_a$ can be chosen to form 3D orthonormal
axes, $\gamma_{ab} \equiv \boldsymbol{\gamma}_a \cdot \boldsymbol{\gamma}_b = \delta_{ab}$, so that the full 4D tetrad metric $\gamma_{mn}$ is Minkowski.

This Chapter follows the full tetrad approach to the ADM formalism, but all the results hold for the 
traditional case where the spatial tetrad axes are set equal to the coordinate spatial axes, equation (17.14). 

Bianchi spacetimes, discussed in §17.4, provide an illustrative example of the application of the ADM 
formalism to a case where it is advantageous to choose the tetrad to be neither orthonormal nor equal to
the coordinate tangent axes. 


17.1.2 Spatial vectors and tensors 


Since the tetrad time axis $\boldsymbol{\gamma}_0$ in the ADM formalism is defined uniquely by the choice of hypersurfaces of
constant time t, there is no freedom of tetrad transformations of the time axis distinct from temporal
coordinate transformations (no distinct freedom of Lorentz boosts). However, there is still freedom of coordinate
transformations of the spatial coordinate axes $\boldsymbol{e}_\alpha$, and tetrad transformations of the spatial tetrad axes $\boldsymbol{\gamma}_a$
(spatial rotations).

A covariant spatial coordinate vector $A_\alpha$ is defined to be a vector that transforms like the spatial
coordinate axes $\boldsymbol{e}_\alpha$. Likewise a covariant spatial tetrad vector $A_a$ is defined to be a vector that transforms
like the spatial tetrad axes $\boldsymbol{\gamma}_a$. The usual apparatus of vectors and tensors carries through. For spatial
tensors, coordinate and tetrad spatial indices are lowered and raised with respectively the spatial coordinate
and tetrad 3-metrics $g_{\alpha\beta}$ and $\gamma_{ab}$ and their inverses, and spatial indices are transformed between coordinate
and tetrad frames with the spatial vierbein $e_a{}^\alpha$ and its inverse.


17.1.3 ADM gravitational coordinates and momenta 


The ADM formalism follows the conventional Hamiltonian approach of regarding the velocities of the fields
as being their time derivatives $\partial/\partial t$ (as opposed to their 4-gradients $\partial/\partial x^\mu$), and the momenta as derivatives
of the Lagrangian with respect to these velocities.

If the Lorentz connections $\Gamma_{mn\lambda}$ are taken to be the coordinates of the gravitational field, then the
corresponding conjugate momenta are, equation (16.89) with the factor $8\pi$ replaced by $16\pi\alpha$ for convenience,

$$ 16\pi\alpha \, \frac{\delta L}{\delta(\partial\Gamma_{mn\lambda}/\partial t)} = \alpha \left( e^{mt} e^{n\lambda} - e^{m\lambda} e^{nt} \right) . \tag{17.15} $$

But ADM imposes $e^{at} = 0$, equations (17.6), so for the momentum to be non-vanishing, one of $m$ or $n$, say
$n$, must be the tetrad time index 0. Since the momentum is antisymmetric in $mn$, the other tetrad index $m$
must be a spatial tetrad index $a$. Moreover since the momentum is antisymmetric in $t\lambda$, the coordinate index
$\lambda$ must be a spatial coordinate index $\alpha$. Finally, with $e^{0t} = -1/\alpha$, the non-vanishing momenta conjugate to
the Lorentz connections are

$$ 16\pi\alpha \, \frac{\delta L}{\delta(\partial\Gamma_{a0\alpha}/\partial t)} = e^{a\alpha} . \tag{17.16} $$

This shows that the coordinates $\Gamma_{mn\lambda}$ with non-vanishing conjugate momenta are $\Gamma_{a0\alpha}$ with middle (or first)
index the tetrad time index 0 and the other two indices spatial, and that the momenta conjugate to these
coordinates are the spatial vierbein $e^{a\alpha}$.

If on the other hand the vierbein $e^{n\lambda}$ are taken to be the coordinates of the gravitational field, then the
corresponding canonically conjugate momenta are, with a factor of $8\pi\alpha$ thrown in for convenience,

$$ 8\pi\alpha \, \frac{\delta L'}{\delta(\partial e^{n\lambda}/\partial t)} = \alpha \, e^{mt} \pi_{nm\lambda} , \tag{17.17} $$

where $\pi_{nm\lambda}$ are related to the Lorentz connections $\Gamma_{nm\lambda}$ by equation (16.114). But again ADM imposes
$e^{at} = 0$, so the tetrad index $m$ must be the time tetrad index 0. Since $\pi_{nm\lambda}$ is antisymmetric in its first two
indices, the tetrad index $n$ must be a spatial tetrad index $a$. And then the non-vanishing of the coordinate
$e^{n\lambda} = e^{a\lambda}$ requires that $\lambda$ also be a spatial coordinate index $\beta$. Thus the non-vanishing momenta conjugate
to the vierbein coordinates are

$$ 8\pi\alpha \, \frac{\delta L'}{\delta(\partial e^{a\beta}/\partial t)} = - \pi_{a0\beta} , \tag{17.18} $$

in which the momenta $\pi_{a0\beta}$ are related to the Lorentz connections $\Gamma_{a0\beta}$ by, from equations (16.114) with
$e_{0\beta} = 0$,

$$ \pi_{a0\beta} = \Gamma_{a0\beta} - e_{a\beta} \Gamma^c{}_{0c} , \quad \Gamma_{a0\beta} = \pi_{a0\beta} - \tfrac{1}{2} e_{a\beta} \pi^c{}_{0c} . \tag{17.19} $$

This shows that the coordinates $e^{n\lambda}$ with non-vanishing conjugate momenta are the spatial vierbein $e^{a\beta}$,
and that the momenta conjugate to these coordinates are $\pi_{a0\beta}$ with middle (or first) index the tetrad time
index 0 and the other two indices spatial.

As remarked before equation (16.116), the same equations of motion are obtained whether the action is
varied with respect to either $\pi_{a0\beta}$ or $\Gamma_{a0\beta}$, so one can choose either $\pi_{a0\beta}$ or $\Gamma_{a0\beta}$ as the momentum variables
conjugate to the coordinates $e^{a\beta}$. The original choice of Arnowitt, Deser, and Misner (1963) was $\pi_{a0\beta}$, but
equations using $\Gamma_{a0\beta}$ were proposed by Smarr and York (1978) and York (1979).

A reminder: do not confuse the Lorentz connections $\Gamma_{mn\lambda}$ (of which there are 24) with the coordinate
connections $\Gamma_{\mu\nu\lambda}$ (of which there are 40, for vanishing torsion). The Lorentz connections $\Gamma_{mn\lambda}$ with final index
a coordinate index $\lambda$ are related to the Lorentz connections $\Gamma_{mnl}$ with all tetrad indices by, equation (15.20),

$$ \Gamma_{mn\lambda} = e^l{}_\lambda \, \Gamma_{mnl} . \tag{17.20} $$


17.1.4 ADM acceleration and extrinsic curvature 


In the previous subsection 17.1.3 it was found that, given the choice (17.6) of ADM vierbein, the momentum
variables that emerge naturally are the Lorentz connections $\Gamma_{a0b}$, whose middle (or first) index is the tetrad
time index 0, and whose other two indices $ab$ are both spatial indices. This set of Lorentz connections is called
the extrinsic curvature, commonly denoted $K_{ab}$. As will be shown momentarily, the extrinsic curvature
$K_{ab}$ is a spatial tetrad tensor. The other set of Lorentz connections that transform like a spatial tensor are
the connections $\Gamma_{a00}$, which are called the acceleration $K_a$. The combined set of connections with middle
index 0 is called the generalized extrinsic curvature $K_{m0l} \equiv \Gamma_{m0l}$. The non-vanishing components of
the generalized extrinsic curvature constitute the acceleration and the extrinsic curvature (the remaining
components vanish, $\Gamma_{000} = \Gamma_{00a} = 0$):

$$ K_a \equiv K_{a00} \equiv \Gamma_{a00} = \gamma_a \cdot \partial_0 \gamma_0 \quad \text{a spatial vector} , \tag{17.21a} $$

$$ K_{ab} \equiv K_{a0b} \equiv \Gamma_{a0b} = \gamma_a \cdot \partial_b \gamma_0 \quad \text{a spatial tensor} . \tag{17.21b} $$

The acceleration $K_a$ and the extrinsic curvature $K_{ab}$ form a spatial vector and tensor because the time
axis $\gamma_0$ is a spatial scalar, so its derivatives $\partial_0 \gamma_0$ and $\partial_b \gamma_0$ constitute respectively a spatial scalar and a
spatial vector. The vanishing of the ADM tetrad metric $\gamma_{0a}$ with one time index 0 and one spatial index $a$,
equation (17.4), implies that the generalized extrinsic curvature is antisymmetric in its first two indices,

$$ K_{0al} = - K_{a0l} , \tag{17.22} $$

which remains true even in the traditional ADM case, equation (17.14), where the spatial tetrad metric $\gamma_{ab}$
is not constant. The unique non-vanishing contraction of the generalized extrinsic curvature is

$$ K_n \equiv K^m{}_{nm} = \{ K^m{}_{0m} , K^m{}_{am} \} = \{ K_0 , K_a \} , \tag{17.23} $$

whose space part is the acceleration $K_a$, and whose time part is the trace $K$ of the extrinsic curvature $K_{ab}$,

$$ K_0 = K \equiv K^a{}_a . \tag{17.24} $$


The acceleration $K_a$ is justly named because the geodesic equation shows that its contravariant components
$K^a$ constitute the acceleration experienced in the tetrad rest frame, where $u^m = \{1, 0, 0, 0\}$,

$$ \frac{D u^a}{D\tau} = u^n \partial_n u^a + \Gamma^a{}_{mn} u^m u^n = K^a{}_{00} = K^a . \tag{17.25} $$

The extrinsic curvature $K_{ab}$ describes how the unit normal $\gamma_0$ to the 3-dimensional spatial hypersurface of
constant time changes over the hypersurface, and can therefore be regarded as embodying the curvature of
the 3-dimensional spatial hypersurface embedded in the 4-dimensional spacetime.

Momenta $\pi_{ab}$ analogous to those defined by equations (17.19) are related to the extrinsic curvatures $K_{ab}$
by

$$ \pi_{ab} \equiv K_{ab} - \gamma_{ab} K , \quad K_{ab} = \pi_{ab} - \tfrac{1}{2} \gamma_{ab} \pi , \tag{17.26} $$

where $\pi \equiv \pi^a{}_a = -2K$ is the trace of $\pi_{ab}$.
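The two relations in equation (17.26) are inverses of one another in 3 spatial dimensions, which is easy to check numerically. The following is a minimal sketch, assuming an orthonormal spatial tetrad ($\gamma_{ab} = \delta_{ab}$) and a randomly generated sample matrix standing in for $K_{ab}$; the function names are illustrative only and do not come from the text.

```python
import random

N = 3  # number of spatial dimensions


def trace(M):
    """Trace M^a_a, assuming an orthonormal spatial tetrad (gamma_ab = delta_ab)."""
    return sum(M[a][a] for a in range(N))


def pi_from_K(K):
    """pi_ab = K_ab - gamma_ab K, first relation of equation (17.26)."""
    trK = trace(K)
    return [[K[a][b] - (trK if a == b else 0.0) for b in range(N)] for a in range(N)]


def K_from_pi(pi):
    """K_ab = pi_ab - (1/2) gamma_ab pi, second relation of equation (17.26)."""
    trpi = trace(pi)
    return [[pi[a][b] - (0.5 * trpi if a == b else 0.0) for b in range(N)] for a in range(N)]


random.seed(0)
# Hypothetical sample data standing in for the extrinsic curvature K_ab.
K = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
pi = pi_from_K(K)
K_back = K_from_pi(pi)

# Round trip K -> pi -> K should reproduce K, and the traces should satisfy pi = -2K.
max_err = max(abs(K[a][b] - K_back[a][b]) for a in range(N) for b in range(N))
trace_err = abs(trace(pi) + 2 * trace(K))
print(max_err < 1e-12, trace_err < 1e-12)
```

The factor $\tfrac{1}{2}$ in the inverse relation is specific to 3 spatial dimensions, since the trace of $\gamma_{ab}$ enters through $\pi = K - 3K$.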


17.1.5 Decomposition of connections and curvatures 


As seen in the previous subsection 17.1.4, the Lorentz connections decompose into a part, the generalized
extrinsic curvature $K_{m0l} \equiv \Gamma_{m0l}$ with middle (or first) index the tetrad time index 0, that transforms like a
tensor under spatial tetrad transformations, and a remainder, the restricted connections $\hat\Gamma_{abl} \equiv \Gamma_{abl}$
with first two indices $ab$ spatial, that does not transform like a spatial tensor,

$$ \Gamma_{mnl} = \hat\Gamma_{mnl} + K_{mnl} . \tag{17.27} $$


Although the acceleration and extrinsic curvature arise in the first instance as Lorentz connections, for
which the tetrad metric $\gamma_{mn}$ is constant, it is useful to allow a more general situation in which the spatial
tetrad metric $\gamma_{ab}$ is arbitrary. Whereas $K_{mnl}$ is necessarily antisymmetric in its first two indices $mn$,
equation (17.22), the restricted connection $\hat\Gamma_{mnl}$ need not be (it is antisymmetric in its first two indices if the
spatial tetrad metric $\gamma_{ab}$ is constant, but not for example in the traditional ADM case (17.14) where the
spatial tetrad metric equals the spatial coordinate metric). The vanishing components of $\hat\Gamma_{mnl}$ and $K_{mnl}$ are

$$ \hat\Gamma_{a0l} = \hat\Gamma_{0al} = 0 , \quad K_{abl} = 0 . \tag{17.28} $$

As a result of the decomposition (17.27) of the connections, the Riemann curvature tensor $R_{klmn}$ decomposes
into a restricted part $\hat R_{klmn}$, and a part that depends on the generalized extrinsic curvature $K_{mnl}$.

Rather than specializing immediately to the ADM case, consider the more general situation in which,
under some restricted subgroup of tetrad transformations, the tetrad-frame connections $\Gamma_{mnl}$ decompose
as equation (17.27) into a non-tensorial part $\hat\Gamma_{mnl}$ and a tensorial part $K_{mnl}$. The resemblance of the
decomposition (17.27) to the split (11.55) between the torsion-free and contortion parts of the tetrad-frame
connection is deliberate: in both cases, the tetrad-frame connection $\Gamma_{mnl}$ is decomposed into non-tensorial
and tensorial parts. The resulting decomposition of the Riemann curvature tensor is consequently quite
similar in the two cases. However, here $K_{mnl}$ is not the contortion, but rather some part of the tetrad-frame
connections that is tensorial under the restricted group of tetrad transformations.

The unique non-vanishing contraction of the tensor $K_{mnl}$ is the vector

$$ K_n \equiv K^m{}_{nm} . \tag{17.29} $$

The placement of indices in equation (17.29) follows the usual convention for general relativistic connections,
that $K^k{}_{nl} \equiv \gamma^{km} K_{mnl}$.

The restricted tetrad-frame derivative $\hat D_k$ with restricted tetrad-frame connection $\hat\Gamma_{mnl}$ is a covariant
derivative with respect to the restricted group of tetrad transformations. Since the generalized extrinsic
curvature $K_{mnl}$ is a tensor with respect to the restricted group, its restricted covariant derivative is also
a restricted tensor. Among other things, this implies that the restricted covariant derivatives $\hat D_k$ of the
vanishing components (17.28) of $K_{mnl}$ vanish identically.

The tetrad metric $\gamma_{lm}$ commutes by construction with the total covariant derivative $D_k$, and it also
commutes (even when the tetrad metric is not constant) with the restricted covariant derivative $\hat D_k$, as
follows from

$$ 0 = D_k \gamma_{lm} = \hat D_k \gamma_{lm} - K^n{}_{lk} \gamma_{nm} - K^n{}_{mk} \gamma_{ln} = \hat D_k \gamma_{lm} - K_{mlk} - K_{lmk} = \hat D_k \gamma_{lm} , \tag{17.30} $$

the last step of which is a consequence of the antisymmetry of the extrinsic curvature in its first two indices.
Therefore tensors involving the restricted covariant derivative can be contracted in the usual way.

In ADM, the extrinsic curvature is tensorial not only with respect to spatial tetrad transformations, but
also with respect to spatial coordinate transformations. In this case, the restricted covariant derivative $\hat D_k$
commutes not only with the tetrad metric, equation (17.30), but also with the vierbein $e^m{}_\mu$ and its inverse
$e_m{}^\mu$,

$$ 0 = D_k e^m{}_\mu = \hat D_k e^m{}_\mu + K^m{}_{nk} e^n{}_\mu - K^\nu{}_{\mu k} e^m{}_\nu = \hat D_k e^m{}_\mu + K^m{}_{\mu k} - K^m{}_{\mu k} = \hat D_k e^m{}_\mu . \tag{17.31} $$

Therefore, provided that the extrinsic curvature is tensorial with respect to both coordinate and tetrad spatial
transformations, tensors involving the restricted covariant derivative can be flipped between coordinate and
tetrad indices in the usual way.

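The tetrad-frame Riemann tensor $R_{klmn}$ decomposes into a restricted part $\hat R_{klmn}$ and a remainder that
depends on the generalized extrinsic curvature $K_{mnl}$ and its restricted covariant derivatives. The derivation
of the decomposition of the Riemann tensor is most elegant in terms of the multivector Riemann tensor $\boldsymbol{R}_{\kappa\lambda}$
given by equation (15.25). The decomposition of the Riemann tensor into restricted and extrinsic curvature
parts is, analogously to the decomposition (15.49) of the Riemann tensor into torsion-free and contortion
parts,

$$ \begin{aligned} \boldsymbol{R}_{\kappa\lambda} &= \frac{\partial (\hat{\boldsymbol\Gamma}_\lambda + \boldsymbol{K}_\lambda)}{\partial x^\kappa} - \frac{\partial (\hat{\boldsymbol\Gamma}_\kappa + \boldsymbol{K}_\kappa)}{\partial x^\lambda} + \tfrac{1}{2} [ \hat{\boldsymbol\Gamma}_\kappa + \boldsymbol{K}_\kappa , \hat{\boldsymbol\Gamma}_\lambda + \boldsymbol{K}_\lambda ] \\ &= \hat{\boldsymbol{R}}_{\kappa\lambda} + \hat D_\kappa \boldsymbol{K}_\lambda - \hat D_\lambda \boldsymbol{K}_\kappa + \tfrac{1}{2} [ \boldsymbol{K}_\kappa , \boldsymbol{K}_\lambda ] , \end{aligned} \tag{17.32} $$

where $\boldsymbol{K}_\kappa \equiv \tfrac{1}{2} K_{mn\kappa} \, \gamma^m \wedge \gamma^n$ is the generalized extrinsic curvature vector of bivectors. The restricted
Riemann tensor $\hat{\boldsymbol{R}}_{\kappa\lambda}$ is

$$ \hat{\boldsymbol{R}}_{\kappa\lambda} \equiv \frac{\partial \hat{\boldsymbol\Gamma}_\lambda}{\partial x^\kappa} - \frac{\partial \hat{\boldsymbol\Gamma}_\kappa}{\partial x^\lambda} + \tfrac{1}{2} [ \hat{\boldsymbol\Gamma}_\kappa , \hat{\boldsymbol\Gamma}_\lambda ] . \tag{17.33} $$

In components, the tetrad-frame Riemann tensor decomposes as

$$ R_{klmn} = \hat R_{klmn} + \hat D_k K_{mnl} - \hat D_l K_{mnk} + K^p{}_{ml} K_{pnk} - K^p{}_{mk} K_{pnl} + ( K^p{}_{kl} - K^p{}_{lk} ) K_{mnp} . \tag{17.34} $$

The restricted Riemann tensor $\hat R_{klmn}$ is

$$ \hat R_{klmn} = \partial_k \hat\Gamma_{mnl} - \partial_l \hat\Gamma_{mnk} + \hat\Gamma^p{}_{ml} \hat\Gamma_{pnk} - \hat\Gamma^p{}_{mk} \hat\Gamma_{pnl} + ( \Gamma^p{}_{kl} - \Gamma^p{}_{lk} ) \hat\Gamma_{mnp} . \tag{17.35} $$

Equation (17.35) looks like the usual tetrad-frame formula (11.61), with connections replaced by restricted
connections, except that the final term on the right hand side involves the difference $\Gamma^p{}_{kl} - \Gamma^p{}_{lk}$ of the full
tetrad-frame connection, not just the restricted connection. The part of the Riemann tensor (17.34) that
depends on the generalized extrinsic curvature is manifestly antisymmetric in $kl$ and in $mn$, but it is not
necessarily symmetric under $kl \leftrightarrow mn$. Thus the restricted Riemann tensor $\hat R_{klmn}$ is antisymmetric in $kl$
and in $mn$, but not necessarily symmetric under $kl \leftrightarrow mn$.

Contracting the Riemann tensor (17.34) gives the Ricci tensor $R_{km}$,

$$ R_{km} = \hat R_{km} - \hat D_k K_m + \hat D_n K^n{}_{mk} - K^p{}_{kn} K^n{}_{mp} + K^p{}_{mk} K_p , \tag{17.36} $$

with $\hat R_{km} \equiv \gamma^{ln} \hat R_{klmn}$ the restricted Ricci tensor. Contracting the Ricci tensor (17.36) yields the Ricci
scalar $R$,

$$ R = \hat R - 2 \hat D_m K^m - K^{pmn} K_{nmp} - K^p K_p , \tag{17.37} $$

with $\hat R \equiv \gamma^{km} \hat R_{km}$ the restricted Ricci scalar.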
A restricted covariant divergence $\hat D_m A^m$ can be converted to a total covariant divergence $D_m A^m$ through

$$ D_m A^m = \hat D_m A^m + K^m{}_{pm} A^p = \hat D_m A^m + K_p A^p . \tag{17.38} $$

With the restricted covariant divergence converted to a total covariant divergence, the Ricci scalar (17.37) is

$$ R = \hat R - 2 D_m K^m - K^{pmn} K_{nmp} + K^p K_p . \tag{17.39} $$


17.1.6 ADM Riemann and Ricci tensors 


For ADM, the components of the Riemann curvature tensor $R_{klmn}$ are, from equations (17.34) with the
generalized extrinsic curvature $K_{mnl}$ replaced by the acceleration $K_a \equiv K_{a00}$ and the extrinsic curvature
$K_{ab} \equiv K_{a0b}$,

$$ R_{a0c0} = \hat D_a K_c - \hat D_0 K_{ca} + K_a K_c - K^b{}_a K_{cb} , \tag{17.40a} $$

$$ R_{abc0} = \hat D_a K_{cb} - \hat D_b K_{ca} - ( K_{ba} - K_{ab} ) K_c , \tag{17.40b} $$

$$ R_{c0ab} = \hat R_{c0ab} + K_b K_{ac} - K_a K_{bc} , \tag{17.40c} $$

$$ R_{abcd} = \hat R_{abcd} + K_{ca} K_{db} - K_{cb} K_{da} . \tag{17.40d} $$

Equations (17.40a), (17.40b), (17.40c), and (17.40d) are called respectively the Ricci, Codazzi-Mainardi,
BSSN, and Gauss equations. After equations of motion have been obtained, the extrinsic curvature $K_{ab}$
will prove to be symmetric (given the ADM gauge condition $e^0{}_\alpha = 0$, and assuming vanishing torsion),
and consequently the final term on the right hand side of equation (17.40b) vanishes. At this point however
no equations of motion have yet been obtained: equations are obtained later, §17.2, from variation of the
action. If torsion vanishes, then the Riemann tensor $R_{klmn}$ is symmetric in $kl \leftrightarrow mn$, Exercise 11.6. If
the tetrad connections are replaced by their usual torsion-free expressions in terms of derivatives of the
vierbein, then the symmetries of the Riemann tensor are satisfied identically, so that the right hand sides
of the expressions (17.40b) and (17.40c) for $R_{abc0}$ and $R_{c0ab}$ become identical, and one of them can be
discarded. In the ADM formalism, equation (17.40c) for $R_{c0ab}$ is discarded as redundant. However, in the
BSSN formalism, §17.8, equation (17.40c) is retained as a distinct equation, and some of the equations
relating the tetrad connections to derivatives of the vierbein are discarded instead.

The restricted Riemann tensor $\hat R_{klm0}$ with one of the final two indices the time index 0 vanishes since $\hat\Gamma_{m0l}$
vanishes, equations (17.28),

$$ \hat R_{klm0} = \partial_k \hat\Gamma_{m0l} - \partial_l \hat\Gamma_{m0k} + \hat\Gamma^p{}_{ml} \hat\Gamma_{p0k} - \hat\Gamma^p{}_{mk} \hat\Gamma_{p0l} + ( \Gamma^p{}_{kl} - \Gamma^p{}_{lk} ) \hat\Gamma_{m0p} = 0 . \tag{17.41} $$

The restricted Riemann tensor $\hat R_{klmn}$ with one time 0 index does not satisfy the $kl \leftrightarrow mn$ symmetry of the
full Riemann tensor $R_{klmn}$. The restricted Riemann tensor $\hat R_{c0ab}$ with one of the first two indices the time
index 0 and the last two indices spatial is

$$ \hat R_{c0ab} = \partial_c \hat\Gamma_{ab0} - \partial_0 \hat\Gamma_{abc} + \hat\Gamma^d{}_{a0} \hat\Gamma_{dbc} - \hat\Gamma^d{}_{ac} \hat\Gamma_{db0} + ( \Gamma^p{}_{c0} - \Gamma^p{}_{0c} ) \hat\Gamma_{abp} . \tag{17.42} $$

The restricted Riemann tensor $\hat R_{abcd}$ with all spatial indices is

$$ \hat R_{abcd} = \partial_a \hat\Gamma_{cdb} - \partial_b \hat\Gamma_{cda} + \hat\Gamma^e{}_{cb} \hat\Gamma_{eda} - \hat\Gamma^e{}_{ca} \hat\Gamma_{edb} + ( \hat\Gamma^e{}_{ab} - \hat\Gamma^e{}_{ba} ) \hat\Gamma_{cde} + ( K_{ab} - K_{ba} ) \hat\Gamma_{cd0} . \tag{17.43} $$

Again, after equations of motion have been obtained, the extrinsic curvature $K_{ab}$ will prove to be symmetric,
equation (17.50), so the final term on the right hand side of equation (17.43) vanishes. Consequently the
spatial restricted Riemann tensor $\hat R_{abcd}$ depends only on spatial components $\hat\Gamma_{abc}$ of the restricted connections
and their spatial derivatives (not on the restricted connections $\hat\Gamma_{ab0}$ with one time index). For ADM, the
restricted spatial connections coincide with the full spatial connections, $\hat\Gamma_{abc} = \Gamma_{abc}$, so for ADM the spatial
restricted Riemann curvature equals the Riemann curvature tensor restricted to the 3-dimensional spatial
hypersurface of constant time. The spatial restricted Riemann tensor $\hat R_{abcd}$ satisfies the usual $ab \leftrightarrow cd$
symmetry.
Contracting the Riemann tensor yields the Ricci tensor $R_{km}$,

$$ R_{00} = \hat D_m K^m - K^{ba} K_{ab} + K^a K_a , \tag{17.44a} $$

$$ R_{a0} = - \hat D_a K + \hat D^b K_{ba} - K^b ( K_{ab} - K_{ba} ) , \tag{17.44b} $$

$$ R_{0a} = \hat R_{0a} - K^b K_{ab} + K_a K , \tag{17.44c} $$

$$ R_{ab} = \hat R_{ab} - \hat D_a K_b + \hat D_0 K_{ba} + K_{ba} K - K_a K_b . \tag{17.44d} $$

If torsion vanishes, then the Ricci tensor $R_{km}$ is symmetric. Again, if the tetrad connections are replaced
by their torsion-free expressions in terms of vierbein derivatives, then the symmetry of the Ricci tensor is
satisfied identically, so that the two expressions (17.44b) and (17.44c) are identical, and one of them can be
discarded as redundant. In the ADM formalism, equation (17.44c) is discarded. In the BSSN formalism
however, §17.8, equation (17.44c) is retained, and some of the equations relating the tetrad connections to
derivatives of the vierbein are discarded instead. Like the restricted Riemann tensor, the restricted Ricci
tensor $\hat R_{km}$ with one time 0 index is not symmetric. While $\hat R_{a0}$ vanishes, $\hat R_{0a}$ does not. The purely spatial
Ricci tensor $\hat R_{ab}$ is on the other hand symmetric in $ab$. For ADM, the purely spatial Ricci tensor $\hat R_{ab}$ is the
Ricci tensor restricted to the 3-dimensional spatial hypersurface of constant time.

Contracting the Ricci tensor yields the Ricci scalar $R$,

$$ R = \hat R - 2 \hat D_m K^m + K^{ba} K_{ab} + K^2 - 2 K^a K_a . \tag{17.45} $$

For ADM, the restricted Ricci scalar $\hat R$ is the Ricci scalar restricted to the 3-dimensional spatial hypersurface
of constant time.
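The passage from the general quadratic terms $- K^{pmn} K_{nmp} - K^p K_p$ of equation (17.37) to the ADM terms $K^{ba} K_{ab} + K^2 - 2 K^a K_a$ of equation (17.45) is a matter of index bookkeeping, which can be checked by brute-force summation. The following is a minimal sketch, assuming an orthonormal tetrad with Minkowski metric of signature $-+++$ and randomly generated sample values for $K_a$ and $K_{ab}$; the variable names are illustrative only.

```python
import random

random.seed(1)
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal Minkowski tetrad metric, signature -+++
R4 = range(4)

# Hypothetical sample data: acceleration K_a and extrinsic curvature K_ab (a, b = 1..3).
Ka = [random.uniform(-1, 1) for _ in range(3)]
Kab = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]

# Generalized extrinsic curvature K_{mnl}: non-zero only for K_{a0l} = -K_{0al},
# with K_{a00} = K_a and K_{a0b} = K_ab, equations (17.21) and (17.22).
K = [[[0.0] * 4 for _ in R4] for _ in R4]
for a in range(3):
    for l in R4:
        value = Ka[a] if l == 0 else Kab[a][l - 1]
        K[a + 1][0][l] = value
        K[0][a + 1][l] = -value


def K_up(m, n, l):
    """Component K^{mnl} with all indices raised by the diagonal metric."""
    return ETA[m] * ETA[n] * ETA[l] * K[m][n][l]


# Contracted vector K_n = K^m_{nm}, equation (17.29).
Kvec = [sum(ETA[m] * K[m][n][m] for m in R4) for n in R4]

# Quadratic terms of equation (17.37): -K^{pmn} K_{nmp} - K^p K_p.
general = (-sum(K_up(p, m, n) * K[n][m][p] for p in R4 for m in R4 for n in R4)
           - sum(ETA[p] * Kvec[p] ** 2 for p in R4))

# Quadratic terms of equation (17.45): K^{ba} K_{ab} + K^2 - 2 K^a K_a.
trK = sum(Kab[a][a] for a in range(3))
adm = (sum(Kab[b][a] * Kab[a][b] for a in range(3) for b in range(3))
       + trK ** 2 - 2 * sum(x * x for x in Ka))

print(abs(general - adm) < 1e-12)
```

The sample $K_{ab}$ is deliberately not symmetric, since at this stage of the argument the symmetry (17.50) has not yet been derived.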

Converting the restricted covariant divergence $\hat D_m K^m$ to a total covariant divergence $D_m K^m$ using
equation (17.38) brings the Ricci scalar to

$$ R = \hat R - 2 D_m K^m + K^{ba} K_{ab} - K^2 . \tag{17.46} $$

At this point it is common to argue that the covariant divergence $D_m K^m$ has no effect on equations of
motion, so can be dropped from the Ricci scalar, yielding the so-called ADM Lagrangian

$$ L_{\rm ADM} = \frac{1}{16\pi} \left( \hat R + K^{ba} K_{ab} - K^2 \right) . \tag{17.47} $$

The ADM Lagrangian (17.47) is fine as a Lagrangian, but it is not in Hamiltonian form. Rather, the ADM
Lagrangian (17.47) is in a form analogous to the quadratic Lagrangian (16.159). As discussed in §16.12.1,
the quadratic Lagrangian is valid provided that the tetrad connections satisfy their equations of motion (in
particle physics jargon, the tetrad connections are “on shell”).

The original purpose of the ADM formalism was to bring the gravitational Lagrangian into (conventional)
Hamiltonian form. As has been seen, §16.7, the Hilbert Lagrangian is already in (super-)Hamiltonian form. In
dropping the covariant divergence $D_m K^m$ to arrive at the Lagrangian (17.47), one both implicitly assumes
that the equation of motion for $K^m$ is satisfied, and loses the ability to derive that equation of motion
from the Lagrangian. One could attempt to recover the Hamiltonian form of the Lagrangian from the ADM
Lagrangian (17.47), which would involve re-assuming the equation of motion for $K^m$, but such a procedure
(widely repeated in the physics literature) seems like shooting oneself in the foot. A sensible approach is to
stick with the Hilbert Lagrangian, which is already in (super-)Hamiltonian form. The (super-)Hamiltonian
approach has already identified the gravitational coordinates and momenta for ADM, §17.1.3, and it also
supplies the equations of motion for ADM, §17.2.


17.2 ADM gravitational equations of motion 


As shown in §17.1.3, the gravitational coordinates and momenta in the ADM formalism are the spatial
components $e^{a\beta}$ of the vierbein, and the extrinsic curvatures $K_{a\beta} \equiv \Gamma_{a0\beta}$, equation (17.21b) (or alternatively,
in place of $K_{a\beta}$, the trace-corrected extrinsic curvatures $\pi_{a\beta}$ defined by equations (17.26)).

Gravitational equations of motion in the ADM formalism follow from varying the Hilbert action. All 
the equations obtained from varying the Hilbert action in super-Hamiltonian form continue to hold in the 
ADM formalism, namely the 24 equations for the (torsion-free) Lorentz connections, and the 10 Einstein 
equations (the Einstein tensor is symmetric if torsion vanishes). The difference is that only some of the 
equations, namely those that come from varying the action with respect to the gravitational coordinates and 
momenta $e^{a\beta}$ and $K_{a\beta}$, are interpreted as equations of motion that determine the time evolution of those
coordinates and momenta. The remaining equations are interpreted either as identities (in the case of the 
Lorentz connections), or as constraints (in the case of the Einstein equations). A constraint equation is one 
that must be satisfied in the initial conditions, but is thereafter guaranteed to be satisfied by conservation 
laws, here conservation of energy-momentum, guaranteed by the contracted Bianchi identities. 

Because the tetrad in this Chapter is being allowed a general form, with not necessarily constant tetrad 
metric, the connections are not necessarily Lorentz connections, and the relation between the connections 
and derivatives of the vierbein and metric, equation (11.53), is more general than that derived from an action 
principle in Chapter 16. Suffice to say that the relation can be derived from an action principle, but that 
will not be done here. 


17.2.1 ADM connections 


Start by considering the equations of motion for the tetrad-frame connections, determined by varying the
Hilbert action with respect to the connections. The connections are given by the usual expressions (11.53) in
terms of the vierbein derivatives $d_{lmn}$ defined by equation (11.33) (equations (11.53) allow for a non-constant
spatial tetrad metric $\gamma_{ab}$, thus admitting the traditional ADM approach in which the spatial tetrad $\gamma_a$ are
set equal to the spatial coordinate tangent axes $e_\alpha$, equation (17.14)). The non-vanishing tetrad connections
are, from the general formula (11.53) with vanishing torsion (note that $d_{0an} = 0$ since $e^0{}_\alpha = 0$),

$$ \Gamma_{a00} = - \Gamma_{0a0} = K_a = - d_{00a} , \tag{17.48a} $$

$$ \Gamma_{a0b} = - \Gamma_{0ab} = K_{ab} = \tfrac{1}{2} \left( \partial_0 \gamma_{ab} + d_{ab0} + d_{ba0} - d_{a0b} - d_{b0a} \right) , \tag{17.48b} $$

$$ \Gamma_{ab0} = \hat\Gamma_{ab0} = K_{ab} - d_{ab0} + d_{a0b} , \tag{17.48c} $$

$$ \Gamma_{abc} = \hat\Gamma_{abc} = \text{same as eq. (11.53)} , \tag{17.48d} $$

where the relevant vierbein derivatives $d_{lmn}$ are

$$ d_{00a} = - \frac{1}{\alpha} \partial_a \alpha , \quad d_{a0b} = - \frac{1}{\alpha} e_{a\alpha} \partial_b \beta^\alpha , \quad d_{ab0} = \gamma_{ac} \, e_b{}^\beta \, \partial_0 e^c{}_\beta . \tag{17.49} $$

Equation (17.48b) shows that the extrinsic curvature is symmetric,

$$ K_{ab} = K_{ba} \tag{17.50} $$

(and consequently so also is the momentum $\pi_{ab}$, equations (17.26)). The symmetry of the extrinsic curvature
is a consequence of the ADM gauge choice $e^0{}_\alpha = 0$ along with the assumption of vanishing torsion. The
connections (17.48a) and (17.48b) form, as remarked after equations (17.21), a spatial tetrad vector, the
acceleration $K_a$, and a spatial tetrad tensor, the extrinsic curvature $K_{ab}$, but the remaining connections (17.48c)
and (17.48d) are not spatial tetrad tensors. Note that the purely spatial tetrad connections $\Gamma_{abc}$, like the
spatial tetrad axes $\gamma_a$, transform under temporal coordinate transformations despite the absence of temporal
indices. If the spatial tetrad metric $\gamma_{ab}$ is taken to be constant, which is true if for example the spatial
tetrad axes $\gamma_a$ are taken to be orthonormal, then the tetrad connections $\Gamma_{ab0}$ and $\Gamma_{abc}$, equations (17.48c)
and (17.48d), are antisymmetric in their first two indices. However, equations (17.48) are valid in general,
including in the traditional case where the spatial axes are taken equal to the spatial coordinate tangent
axes, equation (17.14), in which case $\Gamma_{ab0}$ and $\Gamma_{abc}$ are not antisymmetric in their first two indices.

In the ADM formalism, an equation of motion for the ADM spatial coordinate metric $g_{\alpha\beta}$ follows from
the vanishing of the restricted covariant time derivative of the spatial tetrad metric $\gamma_{ab}$, equation (17.30),

$$ \hat D_0 \gamma_{ab} = 0 . \tag{17.51} $$

With the expressions (17.48c) for the connections $\Gamma_{ab0}$, the covariant time derivative is

$$ \begin{aligned} \hat D_0 \gamma_{ab} &= \partial_0 \gamma_{ab} - \hat\Gamma_{ab0} - \hat\Gamma_{ba0} \\ &= \partial_0 \gamma_{ab} + d_{ab0} + d_{ba0} - d_{a0b} - d_{b0a} - 2 K_{ab} . \end{aligned} \tag{17.52} $$

The time derivatives in expression (17.52) are the directed time derivatives $\partial_0 \gamma_{ab}$ of the spatial tetrad
metric (the tetrad metric $\gamma_{ab}$ is not being assumed constant, so as to allow the traditional ADM approach,
equation (17.14)), and the directed time derivatives $d_{ab0} = \gamma_{ac} e_b{}^\beta \partial_0 e^c{}_\beta$ of the spatial vierbein. These time derivatives
appear in the expression (17.52) only in the combination

$$ \partial_0 \gamma_{ab} + d_{ab0} + d_{ba0} = e_a{}^\alpha e_b{}^\beta \, \partial_0 ( \gamma_{cd} \, e^c{}_\alpha e^d{}_\beta ) = e_a{}^\alpha e_b{}^\beta \, \partial_0 g_{\alpha\beta} . \tag{17.53} $$


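Thus the equation of motion (17.51) effectively governs the time evolution of not all 9 components of the
spatial vierbein $e_{a\beta}$, but rather only the 6 components $g_{\alpha\beta}$ of the spatial coordinate metric, equation (17.9).
Recast in the coordinate frame, the equation of motion (17.51) is

$$ \frac{\partial g_{\alpha\beta}}{\partial t} + \beta^\gamma \frac{\partial g_{\alpha\beta}}{\partial x^\gamma} + g_{\alpha\gamma} \frac{\partial \beta^\gamma}{\partial x^\beta} + g_{\beta\gamma} \frac{\partial \beta^\gamma}{\partial x^\alpha} = 2 \alpha K_{\alpha\beta} . \tag{17.54} $$

Equation (17.54) may also be written

$$ \mathcal{L}_u g_{\alpha\beta} = \frac{1}{\alpha} \left( \frac{\partial g_{\alpha\beta}}{\partial t} + \hat{\mathcal{L}}_\beta g_{\alpha\beta} \right) = 2 K_{\alpha\beta} , \tag{17.55} $$

where $\mathcal{L}_u$ denotes the Lie derivative (7.151) with respect to the 4-velocity $u^\mu = \{ 1/\alpha , \beta^\alpha/\alpha \}$, equation (17.11),
and $\hat{\mathcal{L}}_\beta$ denotes the Lie derivative (7.151) with respect to the shift $\beta^\alpha$, restricted to the hypersurface of
constant time (hence the restricted $\hat{\ }$ overscript),

$$ \hat{\mathcal{L}}_\beta g_{\alpha\beta} \equiv \beta^\gamma \frac{\partial g_{\alpha\beta}}{\partial x^\gamma} + g_{\alpha\gamma} \frac{\partial \beta^\gamma}{\partial x^\beta} + g_{\beta\gamma} \frac{\partial \beta^\gamma}{\partial x^\alpha} . \tag{17.56} $$

As is usual with a Lie derivative, equation (7.152), the coordinate derivatives $\partial/\partial x^\alpha$ in equation (17.56) can be
replaced, if desired, by the restricted covariant derivatives $\hat D_\alpha$. Since the restricted covariant derivative of the
spatial coordinate metric $g_{\alpha\beta}$ vanishes, the Lie derivative $\hat{\mathcal{L}}_\beta g_{\alpha\beta}$ can be written (compare equation (7.154)),

$$ \hat{\mathcal{L}}_\beta g_{\alpha\beta} = \hat D_\beta \beta_\alpha + \hat D_\alpha \beta_\beta . \tag{17.57} $$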


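The spatial trace of equation (17.52) provides an equation of motion for the determinant $\gamma \equiv |\gamma_{ab}|$ of the
spatial tetrad metric, since $\gamma^{ab} \partial_0 \gamma_{ab} = \partial_0 \ln \gamma$. With, from equations (17.49),

$$ d^a{}_{0a} = - \frac{1}{\alpha} e^a{}_\alpha \, \partial_a \beta^\alpha = - \frac{1}{\alpha} \frac{\partial \beta^\alpha}{\partial x^\alpha} , \quad d^a{}_{a0} = e_a{}^\beta \, \partial_0 e^a{}_\beta = \partial_0 \ln e , \tag{17.58} $$

where $e \equiv |e^a{}_\alpha|$ is the determinant of the spatial vierbein, the spatial trace of equation (17.52) provides the
equation of motion

$$ \partial_0 \ln ( \gamma e^2 ) + \frac{2}{\alpha} \frac{\partial \beta^\alpha}{\partial x^\alpha} = 2 K . \tag{17.59} $$

In the coordinate frame, the trace equation is (see equation (7.23) for the Lie derivative of a metric
determinant)

$$ \mathcal{L}_u \ln g = \frac{1}{\alpha} \left( \frac{\partial \ln g}{\partial t} + \beta^\alpha \frac{\partial \ln g}{\partial x^\alpha} + 2 \frac{\partial \beta^\alpha}{\partial x^\alpha} \right) = 2 K , \tag{17.60} $$

where $g \equiv |g_{\alpha\beta}| = \gamma e^2$ is the determinant of the coordinate-frame spatial metric.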
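The determinant relation $g = \gamma e^2$ follows from $g_{\alpha\beta} = \gamma_{ab} \, e^a{}_\alpha e^b{}_\beta$ and the multiplicative property of determinants. The following is a minimal numerical sketch, assuming randomly generated sample matrices for the spatial vierbein and the spatial tetrad metric; the helper `det3` and the sample data are illustrative only.

```python
import random


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


random.seed(2)
S = range(3)

# Hypothetical spatial vierbein, stored as e[a][alpha] = e^a_alpha.
e = [[random.uniform(-1, 1) for _ in S] for _ in S]

# Hypothetical symmetric spatial tetrad metric gamma_ab (not assumed constant or orthonormal).
A = [[random.uniform(-1, 1) for _ in S] for _ in S]
gamma = [[0.5 * (A[a][b] + A[b][a]) + (3.0 if a == b else 0.0) for b in S] for a in S]

# Spatial coordinate metric g_{alpha beta} = gamma_ab e^a_alpha e^b_beta.
g = [[sum(gamma[a][b] * e[a][al] * e[b][be] for a in S for b in S) for be in S] for al in S]

lhs = det3(g)
rhs = det3(gamma) * det3(e) ** 2
print(abs(lhs - rhs) <= 1e-9 * max(1.0, abs(rhs)))
```

In the traditional ADM case (17.14) the vierbein is the unit matrix, so $e = 1$ and $g = \gamma$; for an orthonormal spatial tetrad $\gamma = 1$ and $g = e^2$.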

The expression (17.48b) for the extrinsic curvature $K_{ab}$ has thus provided an equation of motion (17.55) for
the spatial ADM metric $g_{\alpha\beta}$. Of the remaining connections (17.48), the acceleration $K_a$, equation (17.48a),
and the purely spatial connections $\Gamma_{abc}$, equation (17.48d), involve only spatial derivatives of the vierbein,
not time derivatives. These connections are needed in the ADM equations, but are treated as identities rather
than equations of motion. That is, the equation of motion (17.55) determines the time evolution of the spatial
vierbein $e_{a\beta}$, or rather of the spatial coordinate metric $g_{\alpha\beta}$, which is the quadratic combination (17.9) of
the spatial vierbein. With the spatial vierbein on a hypersurface of constant time determined, their spatial
derivatives on the hypersurface follow. These spatial derivatives of the vierbein determine the acceleration
$K_a$ and purely spatial connections $\Gamma_{abc}$ through equations (17.48a) and (17.48d).

The final set of connections is $\Gamma_{ab0}$, equation (17.48c). These connections do depend on time derivatives,
and they do appear in the equations of motion (17.52) and (17.63), but they cease to appear explicitly when
the equations of motion are expressed as equations for the coordinate metric $g_{\alpha\beta}$ and the coordinate-frame
extrinsic curvature $K_{\alpha\beta}$, equations (17.55) and (17.68).


17.2.2 ADM Einstein equations 

The Einstein equations follow from varying the Hilbert action with respect to the vierbein $e^{n\lambda}$,
equation (16.104). In the ADM formalism, only the spatial Einstein equations, which come from varying with
respect to the spatial vierbein $e^{a\beta}$, are interpreted as equations of motion governing the time evolution of
the system. The remaining equations are interpreted as constraints, §17.2.3.


The spatial Einstein equations are

$$ G_{ab} = 8\pi T_{ab} , \tag{17.61} $$

which are symmetric for vanishing torsion. Equivalently, with the trace $R = -8\pi T$ transferred to the right
hand side,

$$ R_{ab} = 8\pi \left( T_{ab} - \tfrac{1}{2} \gamma_{ab} T \right) . \tag{17.62} $$

Substituting the spatial Ricci tensor from equation (17.44d) transforms the spatial Einstein equations (17.62)
into equations of motion for the extrinsic curvature $K_{ab}$,

$$ \hat D_0 K_{ab} = \hat D_a K_b - K_{ab} K + K_a K_b - \hat R_{ab} + 8\pi \left( T_{ab} - \tfrac{1}{2} \gamma_{ab} T \right) . \tag{17.63} $$


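The restricted covariant time derivative $\hat D_0 K_{ab}$ on the left hand side of equation (17.63) is, with
formula (17.48c) for the connections $\Gamma_{ab0}$,

$$ \begin{aligned} \hat D_0 K_{ab} &= \partial_0 K_{ab} - \hat\Gamma^c{}_{a0} K_{cb} - \hat\Gamma^c{}_{b0} K_{ac} \\ &= \partial_0 K_{ab} + d_{ca0} K^c{}_b + d_{cb0} K^c{}_a - d_{c0a} K^c{}_b - d_{c0b} K^c{}_a - 2 K^c{}_a K_{cb} . \end{aligned} \tag{17.64} $$

The time derivatives in equation (17.64) are the directed time derivatives $\partial_0 K_{ab}$ of the extrinsic curvature,
and the directed time derivatives $d_{ab0} = \gamma_{ac} e_b{}^\beta \partial_0 e^c{}_\beta$ of the spatial vierbein. These time derivatives appear in the
expression (17.64) only in a combination analogous to that in equation (17.53),

$$ \partial_0 K_{ab} + d_{cb0} K^c{}_a + d_{ca0} K^c{}_b = e_a{}^\alpha e_b{}^\beta \, \partial_0 K_{\alpha\beta} . \tag{17.65} $$

Just as equation (17.53) picked out the spatial coordinate metric $g_{\alpha\beta}$, so also equation (17.65) picks out the
coordinate-frame extrinsic curvature $K_{\alpha\beta}$ as the fundamental object whose time evolution is being governed.
Recast in the coordinate frame using equation (17.65), equation (17.64) is

$$ \hat D_0 K_{\alpha\beta} = \mathcal{L}_u K_{\alpha\beta} - 2 K^\gamma{}_\alpha K_{\gamma\beta} = \frac{1}{\alpha} \left( \frac{\partial K_{\alpha\beta}}{\partial t} + \hat{\mathcal{L}}_\beta K_{\alpha\beta} \right) - 2 K^\gamma{}_\alpha K_{\gamma\beta} , \tag{17.66} $$

where again $\mathcal{L}_u$ denotes the Lie derivative (7.151) with respect to the 4-velocity $u^\mu$, equation (17.11), and
$\hat{\mathcal{L}}_\beta$ denotes the Lie derivative (7.151) with respect to the shift $\beta^\alpha$, restricted to the hypersurface of constant
time,

$$ \hat{\mathcal{L}}_\beta K_{\alpha\beta} \equiv \beta^\gamma \frac{\partial K_{\alpha\beta}}{\partial x^\gamma} + K_{\alpha\gamma} \frac{\partial \beta^\gamma}{\partial x^\beta} + K_{\gamma\beta} \frac{\partial \beta^\gamma}{\partial x^\alpha} . \tag{17.67} $$

As usual with a Lie derivative, equation (7.151), the coordinate derivatives $\partial/\partial x^\alpha$ in equation (17.67) can
be replaced, if desired, by the restricted covariant derivatives $\hat D_\alpha$. Substituting equation (17.66) into
equation (17.63) brings the equation of motion for the coordinate-frame extrinsic curvature to

$$ \mathcal{L}_u K_{\alpha\beta} = \frac{1}{\alpha} \left( \frac{\partial K_{\alpha\beta}}{\partial t} + \hat{\mathcal{L}}_\beta K_{\alpha\beta} \right) = \hat D_\alpha K_\beta + 2 K^\gamma{}_\alpha K_{\gamma\beta} - K_{\alpha\beta} K + K_\alpha K_\beta - \hat R_{\alpha\beta} + 8\pi \left( T_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} T \right) . \tag{17.68} $$

All the terms in equation (17.68) are manifestly symmetric in $\alpha\beta$ except for $\hat D_\alpha K_\beta$, but this too is symmetric,
for vanishing torsion, as follows from

$$ \hat D_\alpha K_\beta = \frac{\partial^2 \ln \alpha}{\partial x^\alpha \partial x^\beta} - \hat\Gamma^\gamma{}_{\beta\alpha} K_\gamma = \hat D_\beta K_\alpha , \tag{17.69} $$

the coordinate connection $\hat\Gamma^\gamma{}_{\beta\alpha}$ being symmetric in its last two indices, for vanishing torsion. Equations (17.55)
and (17.68) constitute the two fundamental sets of equations of motion for the coordinates $g_{\alpha\beta}$ and momenta
$K_{\alpha\beta}$ in the ADM formalism.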

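The spatial trace of equation (17.63) (which is straightforward to take because the tetrad metric $\gamma_{ab}$
commutes with the restricted covariant derivative $\hat D_0$) is

$$ \partial_0 K = \hat D_a K^a - K^2 + K^a K_a - \hat R + 12\pi ( \rho - p ) , \tag{17.70} $$

where the spatial trace $T^a{}_a \equiv 3p$ defines the proper monopole pressure $p$, and the full spacetime trace is
$T = -\rho + 3p$, with $\rho$ the proper energy density. In the coordinate frame, equation (17.70) becomes

$$ \mathcal{L}_u K = \frac{1}{\alpha} \left( \frac{\partial K}{\partial t} + \beta^\alpha \frac{\partial K}{\partial x^\alpha} \right) = \hat D_\alpha K^\alpha - K^2 + K^\alpha K_\alpha - \hat R + 12\pi ( \rho - p ) . \tag{17.71} $$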
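The source term $12\pi(\rho - p)$ in equations (17.70) and (17.71) comes from the spatial trace of $8\pi ( T_{ab} - \tfrac{1}{2} \gamma_{ab} T )$, using $T^a{}_a = 3p$, $\gamma^a{}_a = 3$, and $T = -\rho + 3p$. The arithmetic can be checked exactly with rational numbers; the following is a minimal sketch in which the function name and the sample values of $\rho$ and $p$ are illustrative only.

```python
from fractions import Fraction


def source_trace(rho, p):
    """Spatial trace of T_ab - (1/2) gamma_ab T, with the overall factor 8 pi left out.

    Uses T^a_a = 3p, gamma^a_a = 3, and T = -rho + 3p, as stated in the text.
    """
    T = -rho + 3 * p
    return 3 * p - Fraction(3, 2) * T


# 8 pi * source_trace = 12 pi (rho - p) is equivalent to source_trace = (3/2)(rho - p).
samples = [
    (Fraction(1), Fraction(0)),      # pressureless matter
    (Fraction(1), Fraction(1, 3)),   # radiation, p = rho / 3
    (Fraction(5), Fraction(-5)),     # vacuum energy, p = -rho
]
checks = [source_trace(rho, p) == Fraction(3, 2) * (rho - p) for rho, p in samples]
print(all(checks))
```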


17.2.3 ADM constraint equations 


Unlike the spatial vierbein $e^{a\alpha}$, the vierbein $e^{0\mu}$ with a tetrad time index 0, whose components define the lapse and shift, equation (17.11), has vanishing canonically conjugate momenta, as shown in §17.1.3. Consequently, in the ADM formalism, the lapse and shift are not considered to be part of the system of coordinates and momenta that encode the physical gravitational degrees of freedom. Rather, the lapse $\alpha$ and shift $\beta^\alpha$ are interpreted as gauge variables that can be chosen arbitrarily. The 4 gauge degrees of freedom in the lapse and shift embody the 4 gauge degrees of freedom of coordinate transformations.

Nevertheless, varying the Hilbert action with respect to $e^{0\mu}$ does yield equations of motion, which are the 4 Einstein equations with one tetrad time index 0,

$$ G_{m0} = 8\pi T_{m0} . \tag{17.72} $$




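Combining equations (17.44a) and (17.45) yields an expression for the time-time Einstein component $G_{00} = R_{00} - \tfrac12 \gamma_{00} R = R_{00} + \tfrac12 R$, while equation (17.44b) gives the space-time Einstein component $G_{a0} = R_{a0}$,

$$ G_{00} = \tfrac12 ( \hat{R} - K^{ab} K_{ba} + K^2 ) = \tfrac12 ( \hat{R} - \pi^{ab} K_{ba} ) , \tag{17.73a} $$
$$ G_{a0} = \hat{D}^b K_{ba} - \hat{D}_a K = \hat{D}^b \pi_{ba} . \tag{17.73b} $$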


Whereas the spatial Einstein equations yielded time evolution equations (17.63) or (17.68) for the momenta, 
the expressions (17.73) for the time-time and space-time Einstein components involve only spatial derivatives 
of the coordinates and momenta, no time derivatives. Since the coordinates and momenta are determined 
fully by their equations of motion, equations (17.52) and (17.63), or (17.55) and (17.68), the Einstein equa- 
tions (17.72) with at least one time index cannot be independent equations. However, the equations (17.72) 
cannot be discarded completely. Rather, the Einstein equations (17.72) must be arranged to be satisfied 
in the initial conditions (on the initial hypersurface of constant time t), whereafter the Bianchi identities 
ensure that the constraints are satisfied automatically, as you will confirm in Exercise 17.2. This kind of 
equation, which must be satisfied on the initial hypersurface but is thereafter guaranteed by conservation 
laws, is called a constraint equation. In the ADM formalism, the time-time Einstein equation is called the 
energy constraint or Hamiltonian constraint, while the space-time Einstein equations are called the 
momentum constraints: 


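$$ \tfrac12 ( \hat{R} - K^{ab} K_{ba} + K^2 ) = 8\pi T_{00} \quad \text{Hamiltonian constraint} , \tag{17.74a} $$
$$ \hat{D}^b K_{ba} - \hat{D}_a K = 8\pi T_{a0} \quad \text{momentum constraints} . \tag{17.74b} $$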


Exercise 17.2. Energy and momentum constraints. Confirm the argument of this section. Suppose that the spatial Einstein equations are true, $G^{ab} = 8\pi T^{ab}$. Show that if the time-time and space-time Einstein equations $G^{m0} = 8\pi T^{m0}$ are initially true, then conservation of energy-momentum implies that these equations must necessarily remain true at all times. [Hint: Conservation of energy-momentum requires that $D_n T^{mn} = 0$, and the Bianchi identities require that the Einstein tensor satisfies $D_n G^{mn} = 0$, so

$$ D_n ( G^{mn} - 8\pi T^{mn} ) = 0 . \tag{17.75} $$

By expanding out these equations in full, or otherwise, show that the solution satisfying $G^{ab} - 8\pi T^{ab} = 0$ at all times, and $G^{m0} - 8\pi T^{m0} = 0$ initially, is $G^{m0} - 8\pi T^{m0} = 0$ at all times.]


17.2.4 ADM Raychaudhuri equation 


If the Hamiltonian constraint (17.74a) is used to eliminate the restricted Ricci scalar $\hat{R}$, then the trace equation (17.71) becomes

$$ \mathcal{L}_u K = \frac{1}{\alpha} \left( \frac{\partial K}{\partial t} + \beta^\alpha \frac{\partial K}{\partial x^\alpha} \right) = \hat{D}_\alpha K^\alpha - K^{\alpha\beta} K_{\beta\alpha} + K^\alpha K_\alpha - 4\pi ( \rho + 3p ) . \tag{17.76} $$

Equation (17.76) is the Raychaudhuri equation (18.22a) with vanishing vorticity and non-vanishing acceleration $K_\alpha$.
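
The elimination that leads to equation (17.76) is pure algebra, so it can be checked numerically. The following is a minimal sketch, assuming each scalar combination in equations (17.71) and (17.74a) is treated as an independent number and that $T_{00} = \rho$; the function and variable names are illustrative and do not come from the text.

```python
import math
import random


def trace_rhs(div_acc, K, acc2, R_hat, rho, p):
    """Right-hand side of the trace equation (17.71).

    div_acc stands for D_a K^a and acc2 for K^a K_a (illustrative names).
    """
    return div_acc - K**2 + acc2 - R_hat + 12 * math.pi * (rho - p)


def raychaudhuri_rhs(div_acc, KK, acc2, rho, p):
    """Right-hand side of equation (17.76); KK stands for K^{ab} K_{ba}."""
    return div_acc - KK + acc2 - 4 * math.pi * (rho + 3 * p)


def elimination_residual(seed=0):
    """Difference between (17.71) and (17.76) once R_hat obeys (17.74a)."""
    rng = random.Random(seed)
    div_acc, K, KK, acc2, rho, p = (rng.uniform(-1.0, 1.0) for _ in range(6))
    # Hamiltonian constraint (17.74a) with T_00 = rho, solved for R_hat.
    R_hat = 16 * math.pi * rho + KK - K**2
    return (trace_rhs(div_acc, K, acc2, R_hat, rho, p)
            - raychaudhuri_rhs(div_acc, KK, acc2, rho, p))
```

The residual vanishes to rounding error for any choice of the inputs, which confirms that the $K^2$ terms cancel and that $12\pi(\rho - p) - 16\pi\rho = -4\pi(\rho + 3p)$.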


17.3 Conformally scaled ADM 


A common modification of the ADM formalism is to separate out a spatial conformal factor a, which may 
be an arbitrary function of coordinates. 

It is neater to separate the conformal factor from the vierbein than from the tetrad metric, so that the tetrad metric $\gamma_{ab}$ can still be allowed to be constant, as in the case of an orthonormal tetrad. If the spatial vierbein $e^a{}_\alpha$ is factored as a product of the conformal factor $a$ and a conformal vierbein $\tilde{e}^a{}_\alpha$, then the vierbein and inverse vierbein become

$$ e^a{}_\alpha = a \, \tilde{e}^a{}_\alpha , \quad e_a{}^\alpha = \tilde{e}_a{}^\alpha / a . \tag{17.77} $$

The conformal vierbein and inverse conformal vierbein are inverse to each other, $\tilde{e}^a{}_\alpha \tilde{e}_b{}^\alpha = \delta^a_b$. The lapse $\alpha$ and shift $\beta^\alpha$ are unchanged by the conformal scaling. The spatial conformal coordinate metric defined by $\tilde{g}_{\alpha\beta} \equiv \gamma_{ab} \tilde{e}^a{}_\alpha \tilde{e}^b{}_\beta$ is related to the spatial coordinate metric $g_{\alpha\beta}$ by

$$ g_{\alpha\beta} = a^2 \tilde{g}_{\alpha\beta} . \tag{17.78} $$


Section 17.1.5 discussed the splitting of tetrad-frame connections into a generalized extrinsic curvature $K_{lmn}$ that behaves like a tensor under some restricted group of transformations, and a restricted connection $\hat\Gamma_{lmn}$ that does not transform like a tensor. In the case of ADM, the restricted group of transformations was spatial transformations of the tetrad $\boldsymbol\gamma_m$ (that is, transformations that leave the time axis $\boldsymbol\gamma_0$ unchanged). The conformal factor $a$ is a scalar with respect to the subgroup of spatial tetrad transformations that leave the conformal factor $a$ unchanged. Thus all of the discussion in §17.1.5 carries through with the restricted group of transformations taken to be spatial transformations that preserve the conformal factor.

The conformal decomposition of the spatial vierbein implies a corresponding conformal decomposition of the vierbein derivatives $d_{lmn}$ defined by equation (11.33). The vierbein derivatives $d_{lmn}$ with either of the first two indices $lm$ the time index 0 are unaffected, but the vierbein derivatives $d_{abn}$ with first two indices $ab$ spatial decompose as

$$ d_{abn} = \gamma_{ac} e_b{}^\beta \partial_n e^c{}_\beta = \gamma_{ac} e_b{}^\beta e^c{}_\beta \, \partial_n \ln a + \gamma_{ac} \tilde{e}_b{}^\beta \partial_n \tilde{e}^c{}_\beta = \gamma_{ab} \, \partial_n \ln a + \tilde{d}_{abn} , \tag{17.79} $$

which is a sum of a part $\gamma_{ab} \, \partial_n \ln a$ that depends on derivatives of the conformal factor $a$, and a conformal part $\tilde{d}_{abn}$ that depends on derivatives of the conformal vierbein $\tilde{e}^a{}_\alpha$. The part $\gamma_{ab} \, \partial_n \ln a$ is a spatial tensor under the restricted group of spatial transformations that leave the conformal factor $a$ unchanged. It then follows that the spatial tetrad-frame connections $\Gamma_{abc}$ split into a restricted part $\hat\Gamma_{abc}$ and a tensorial part $K_{abc}$,

$$ \Gamma_{abc} = \hat\Gamma_{abc} + K_{abc} , \tag{17.80} $$

where $K_{abc}$ is the spatial tensor

$$ K_{abc} = \gamma_{ac} \, \partial_b \ln a - \gamma_{bc} \, \partial_a \ln a . \tag{17.81} $$


The acceleration $K_{a00}$, extrinsic curvature $K_{ab} \equiv K_{a0b}$, and restricted connections $\hat\Gamma_{ab0}$ with final index the time index 0 are unchanged by the conformal decomposition. Thus the generalized extrinsic curvature $K_{lmn}$ now consists of the acceleration $K_{a00}$, the extrinsic curvature $K_{a0b}$, and the derivatives $K_{abc}$ of the conformal factor defined by equation (17.81). The generalized extrinsic curvature $K_{lmn}$ remains antisymmetric in its first two indices,

$$ K_{lmn} = - K_{mln} . \tag{17.82} $$

The unique non-vanishing contraction $K_m$ of the generalized extrinsic curvature is (this repeats equation (17.23))

$$ K_m \equiv K^n{}_{mn} = \{ K^n{}_{0n} , K^n{}_{an} \} = \{ K , K_a \} , \tag{17.83} $$

whose time part remains equal to the trace $K$ of the extrinsic curvature $K_{ab}$, but whose spatial part $K_a$ is modified to equal the sum of the acceleration $K_{a00}$ and a derivative of the conformal factor,

$$ K_a = K_{a00} + 2 \partial_a \ln a = \partial_a \ln ( \alpha a^2 ) . \tag{17.84} $$

Unlike in ADM, $K_a$ is not the same as the acceleration $K_{a00}$.
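
The factor of 2 in equation (17.84) comes from contracting the tensor (17.81) over its first and last indices. The sketch below checks that contraction numerically. It assumes, purely for brevity, a diagonal (but not orthonormal) spatial tetrad metric so that the inverse metric can be written by hand; the helper names and the sample numbers are illustrative only.

```python
def conformal_K(gamma, v):
    """K_abc = gamma_ac v_b - gamma_bc v_a, equation (17.81), with v_a = d_a ln a."""
    n = len(v)
    return [[[gamma[a][c] * v[b] - gamma[b][c] * v[a] for c in range(n)]
             for b in range(n)]
            for a in range(n)]


def contract_first_last(gamma_inv, K):
    """Spatial contraction K^b_{ab} = gamma^{bd} K_{dab}."""
    n = len(K)
    return [sum(gamma_inv[b][d] * K[d][a][b] for b in range(n) for d in range(n))
            for a in range(n)]


# Hypothetical sample data: a diagonal tetrad metric and a gradient of ln a.
gamma = [[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 3.0]]
gamma_inv = [[0.5, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0 / 3.0]]
grad_ln_a = [0.3, -1.2, 0.7]

K_abc = conformal_K(gamma, grad_ln_a)
spatial_part = contract_first_last(gamma_inv, K_abc)
```

Here `spatial_part` reproduces $2 \partial_a \ln a$ component by component: the first term of (17.81) contributes $3 \partial_a \ln a$ from the trace of the metric, and the second removes one copy.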

The restricted tetrad-frame derivative $\hat{D}_k$ with restricted tetrad-frame connections $\hat\Gamma_{lmn}$ is a covariant derivative with respect to the restricted group of spatial transformations that preserve the conformal factor $a$. The restricted covariant derivative $\hat{D}_k$ differs from ADM only in that the restricted connections now exclude the part depending on derivatives of the conformal factor, which has been absorbed into the spatial components $K_{abc}$ of the generalized extrinsic curvature. The vierbein $e^m{}_\mu$ and the tetrad metric $\gamma_{lm}$ continue to commute with the restricted covariant derivative $\hat{D}_k$, equations (17.30) and (17.31). All of the discussion and equations in §17.1.5 carry through unchanged.

The various expressions for the Riemann and Ricci tensors given in §17.1.6 are modified to include additional terms involving the spatial components $K_{abc}$ of the generalized extrinsic curvature. In particular, the expressions for the Ricci tensor $R_{km}$ are modified to, from the general equation (17.36),

$$ R_{00} = - \hat{D}_0 K + \hat{D}^a K_{a00} - K^{ab} K_{ba} + K^a{}_{00} K_a , \tag{17.85a} $$
$$ R_{a0} = - \hat{D}_a K + \hat{D}^b K_{ba} - K_{ab} K^b{}_{00} + K_{ba} K^b - K_{bac} K^{bc} \tag{17.85b} $$
$$ \phantom{R_{a0}} = R_{0a} = \hat{R}_{0a} - \hat{D}_0 K^b{}_{ab} - K^b{}_{00} K_{ab} + K_{a00} K - K^{bc} K_{cab} , \tag{17.85c} $$
$$ R_{ab} = \hat{R}_{ab} - \hat{D}_a K_b + \hat{D}_0 K_{ba} + \hat{D}^c K_{cba} + K_{ba} K - K_{a00} K_{b00} + K_{cba} K^c - K^c{}_{ad} K^d{}_{bc} . \tag{17.85d} $$

Like the time-space restricted Ricci tensor $\hat{R}_{0a}$, the spatial restricted Ricci tensor $\hat{R}_{ab}$ is not symmetric in $ab$.

The equations of motion (17.51) or (17.55) for the spatial metric $g_{\alpha\beta}$ remain unchanged by the conformal decomposition. The equation of motion (17.63) for the extrinsic curvature $K_{ab}$ is modified in accordance with the modified expression (17.85d) for the spatial Ricci tensor $R_{ab}$ to

$$ \hat{D}_0 K_{ab} = \hat{D}_a K_b - \hat{D}^c K_{cba} - K_{ab} K + K_{a00} K_{b00} - K_{cba} K^c + K^c{}_{ad} K^d{}_{bc} - \hat{R}_{ab} + 8\pi \left( T_{ab} - \tfrac12 \gamma_{ab} T \right) . \tag{17.86} $$

Equation (17.86) is essentially the same as the earlier equation of motion (17.63), but it redistributes terms involving derivatives of the conformal factor $a$ out of the spatial restricted Ricci tensor $\hat{R}_{ab}$ into terms involving $K_a$ and $K_{abc}$. The coordinate-frame version of the equation of motion (17.68) for the extrinsic curvature $K_{\alpha\beta}$ is modified similarly to

$$ \mathcal{L}_u K_{\alpha\beta} = \frac{1}{\alpha} \left( \frac{\partial K_{\alpha\beta}}{\partial t} + \mathcal{L}_\beta K_{\alpha\beta} \right) = \hat{D}_\alpha K_\beta - \hat{D}^\gamma K_{\gamma\beta\alpha} + 2 K_\alpha^\gamma K_{\gamma\beta} - K_{\alpha\beta} K + K_{\alpha 00} K_{\beta 00} - K_{\gamma\beta\alpha} K^\gamma + K^\gamma{}_{\alpha\delta} K^\delta{}_{\beta\gamma} - \hat{R}_{\alpha\beta} + 8\pi \left( T_{\alpha\beta} - \tfrac12 g_{\alpha\beta} T \right) . \tag{17.87} $$

Again, this equation of motion is essentially the same as the earlier equation of motion (17.68), with a redistribution of terms out of $\hat{R}_{\alpha\beta}$ into generalized extrinsic curvatures.


17.4 Bianchi spacetimes 


A 3-dimensional Lie group is called a Bianchi space (Bianchi, 1898). A Lie group is a group of symmetry 
transformations that is also a differentiable manifold. Lie groups are generated by infinitesimal transforma- 
tions called the generators of the group. A 3-dimensional Lie group has 3 linearly independent generators. 
The properties of a Lie group are determined by the commutators of its generators, or equivalently by its
structure coefficients $c^c_{ab}$, equation (17.88), which for a Lie group are taken to be constant. A Bianchi space
is consequently homogeneous. The assumption that a space is a Lie group is stronger than the assumption 
that the space is homogeneous, which requires merely that the tetrad-frame Riemann tensor be spatially 
constant. However, most homogeneous 3-dimensional spaces are Lie groups, hence Bianchi spaces, the no- 
table exception being the closed cylindrical geometry, equation (17.132). Bianchi spaces are homogeneous 
but not necessarily isotropic. 

Bianchi spacetimes, also known as Bianchi universes, are Bianchi spaces that evolve in time while preserving 
the posited Lie group structure. Bianchi spacetimes offer a framework for addressing possible large scale 
departures from isotropy in cosmology, and provide the prototype for the Belinskii-Khalatnikov-Lifshitz 
(BKL) (Belinskii, Khalatnikov, and Lifshitz, 1982; Belinski, 2014) model of anisotropic gravitational collapse, 
§17.6. Bianchi spacetimes present a fine application of both the ADM formalism and the tetrad formalism,
in a situation where the tetrad is neither orthonormal, nor aligned with the coordinates, nor is the tetrad 
metric constant (in time). 


17.4.1 Bianchi structure coefficients 


The assumption that a space is homogeneous requires that the space has a complete set of spacelike Killing vectors, thus 3 linearly independent spacelike Killing vectors in 3-dimensional space. The spatial components $\boldsymbol\gamma_a$ of the tetrad can be chosen to coincide with the 3 Killing vectors at each point. Equivalently, the 3 Killing vectors can be identified with the directed derivatives $\partial_a$ along the 3 spatial tetrad axes (see §7.32). The commutators of the directed derivatives define the structure coefficients $c^c_{ab}$,

$$ [ \partial_a , \partial_b ] = c^c_{ab} \, \partial_c , \tag{17.88} $$

which are necessarily antisymmetric in their last two indices $ab$. Homogeneity does not require that the structure coefficients be spatially constant; rather, homogeneity requires that the tetrad-frame Riemann tensor be spatially constant. However, Bianchi spaces are by assumption Lie groups, for which the structure coefficients are spatially constant. For Bianchi spaces, the Killing vectors $\partial_a$ are the generators of the Lie group, whose properties are determined by the structure coefficients $c^c_{ab}$. The vector space of real linear combinations of the generators $\partial_a$ defines a Lie algebra, with multiplication defined by equation (17.88).

The structure coefficients must be such that the Jacobi identity $[ \partial_{[a} , [ \partial_b , \partial_{c]} ] ] = 0$ is satisfied. If the structure coefficients are spatially constant, then the Jacobi identity requires

$$ c^d_{[ab} c^e_{c]d} = 0 . \tag{17.89} $$


17.4.2 Bianchi line-element 


A Bianchi spacetime is a Bianchi space that evolves in time while preserving the Lie group spatial structure. The spatial Bianchi line-element can be constructed out of 1-forms $e^a{}_\alpha \, dx^\alpha$ which, being aligned with the Killing vectors $\boldsymbol\gamma_a$, are by construction independent of the choice of spatial coordinates $x^\alpha$. The time coordinate $t$ is chosen so that spatial surfaces of constant time are homogeneous. To preserve spatial homogeneity, the tetrad metric $\gamma_{ab} \equiv \boldsymbol\gamma_a \cdot \boldsymbol\gamma_b$ must be independent of the spatial coordinates, but it may depend on time $t$. As usual in the ADM formalism, the tetrad time axis $\boldsymbol\gamma_0$ is chosen to be orthogonal to the spatial tetrad axes $\boldsymbol\gamma_a$, which lie in the surfaces of constant time. The line-element can thus be taken to be

$$ ds^2 = - dt^2 + g_{\alpha\beta} \, dx^\alpha dx^\beta = - dt^2 + \gamma_{ab}(t) \, e^a{}_\alpha e^b{}_\beta \, dx^\alpha dx^\beta , \tag{17.90} $$

which is in ADM form with unit lapse and zero shift. The vierbein and its inverse are

$$ e^m{}_\mu = \begin{pmatrix} 1 & 0 \\ 0 & e^a{}_\alpha \end{pmatrix} , \quad e_m{}^\mu = \begin{pmatrix} 1 & 0 \\ 0 & e_a{}^\alpha \end{pmatrix} . \tag{17.91} $$

The tetrad time derivative coincides with the coordinate time derivative, $\partial_0 = \partial/\partial t$. The condition that the homogeneous spatial structure be preserved in time means that the Killing vectors do not depend on time, $[ \partial_0 , \partial_a ] = 0$, so the vierbein, and the inverse vierbein, are independent of time. However, despite spatial homogeneity, the spatial vierbein coefficients $e^a{}_\alpha$ may (and generically do) depend on the spatial coordinates, as they do for example in FLRW spacetimes. Likewise homogeneity allows that the structure coefficients $c^c_{ab}$ defined by the commutators of the directed derivatives, equation (17.88), may be functions of the spatial coordinates. As emphasized above, Bianchi spaces are by assumption those for which the structure coefficients are spatially constant, but this is not required by homogeneity. Whether or not the structure coefficients are spatially constant, they satisfy $c^c_{ab} = 2 d^c{}_{[ab]}$, where $d^c{}_{ab}$ are the spatial components of the vierbein derivatives, equation (11.33).




Table 17.1: Classification of Bianchi spaces

Eigenvalues          Type
n_1   n_2   n_3      k = 0     k ≠ 0
 0     0     0       I         V
 0     0     +       II        IV
 0     −     +       VI_0      VI
 0     +     +       VII_0     VII
 −     +     +       VIII
 +     +     +       IX


17.4.3 Classification of Bianchi spaces 


Bianchi spaces are classified according to the invariant properties of their constant structure coefficients $c^c_{ab}$. Choose a point of the spacetime. The structure coefficients at that point can be written in terms of a $3 \times 3$ matrix $n^{dc}$, which can be decomposed into symmetric $n^{(dc)}$ and antisymmetric $n^{[dc]}$ parts,

$$ c^c_{ab} = \epsilon_{abd} \, n^{dc} = \epsilon_{abd} ( n^{(dc)} + n^{[dc]} ) . \tag{17.92} $$

By an orthogonal rotation of axes the symmetric matrix $n^{(dc)}$ can be brought to diagonal form with eigenvalues $n_e$, while the antisymmetric part can be written in terms of a vector $k_e$,

$$ c^c_{ab} = \epsilon_{abd} ( \delta^{dc} n_c - \epsilon^{dce} k_e ) \quad \text{(no sum over $c$)} . \tag{17.93} $$


The Jacobi identity (17.89) implies that $0 = \epsilon^{abc} c^d_{ab} c^e_{dc} = - 2 \epsilon_{cdf} n^{cd} n^{fe} = 4 n^{ef} k_f$, which equals $4 n_e k_e$ (no sum over $e$) in each direction $e$, thus

$$ n^{ef} k_f = n_e k_e = 0 \quad \text{(each direction $e$, no sum over $e$)} . \tag{17.94} $$

Thus in each direction, either $n_e$ or $k_e$ equals zero. If the vector $k_e$ is non-vanishing, then without loss of generality it can be chosen to lie along the 1-direction, $k_e = \{ k , 0 , 0 \}$. The real number $k$ can be non-zero only if $n_1 = 0$. The commutators (17.88) of the directed derivatives $\partial_a$ then reduce to (with at least one of $n_1$ and $k$ zero)

$$ [ \partial_3 , \partial_2 ] = n_1 \partial_1 , \quad [ \partial_1 , \partial_3 ] = n_2 \partial_2 - k \partial_3 , \quad [ \partial_2 , \partial_1 ] = n_3 \partial_3 + k \partial_2 . \tag{17.95} $$
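
The condition (17.94) can be tested directly. The sketch below builds the structure coefficients from the eigenvalues $n_e$ and the vector $k_e$ using equation (17.93), then evaluates the cyclic sum that the Jacobi identity requires to vanish. The array layout `c[c][a][b]` for $c^c_{ab}$ and the function names are choices made for this illustration, not notation from the text.

```python
from itertools import product


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices drawn from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def structure_coefficients(n, k):
    """c[c][a][b] = eps_{abd} (delta^{dc} n_c - eps^{dce} k_e), equation (17.93)."""
    c = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for cc, a, b in product(range(3), repeat=3):
        total = 0.0
        for d in range(3):
            symmetric = n[cc] if d == cc else 0.0
            antisymmetric = -sum(levi_civita(d, cc, e) * k[e] for e in range(3))
            total += levi_civita(a, b, d) * (symmetric + antisymmetric)
        c[cc][a][b] = total
    return c


def jacobi_violation(c):
    """Largest magnitude of the cyclic sum c^d_{ab} c^e_{cd} + (cyclic in abc)."""
    worst = 0.0
    for a, b, cc, e in product(range(3), repeat=4):
        s = sum(c[d][a][b] * c[e][cc][d]
                + c[d][b][cc] * c[e][a][d]
                + c[d][cc][a] * c[e][b][d] for d in range(3))
        worst = max(worst, abs(s))
    return worst
```

With $k$ along the 1-direction, the violation vanishes when $n_1 = 0$ and is non-zero otherwise, which is the statement that $k$ can be non-zero only if $n_1 = 0$.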


Under a rescaling of axes $\partial_e \propto 1/a_e$, the eigenvalues scale as $n_1 \propto a_1 / ( a_2 a_3 )$ and cyclically for $n_2$ and $n_3$. Thus by a rescaling of axes, each of the non-zero eigenvalues $n_e$ can be scaled to any other value of the same sign. Flipping any axis changes the signs of all the $n_e$, so the number of positive eigenvalues can always be chosen to be greater than or equal to the number of negative eigenvalues. Finally, the axes can be reordered arbitrarily. Thus the invariant properties of the eigenvalues $n_e$ are the numbers of negative, zero, and positive eigenvalues. If the parameter $k$ is non-zero, and if $n_2$ and $n_3$ are non-zero (Bianchi Types VI and VII), then $k$ cannot be rescaled independently, since $k \propto 1/a_1 \propto |n_2 n_3|^{1/2}$ is fixed by the scaling of $n_2$ and $n_3$. If on the other hand either of $n_2$ and $n_3$ is zero (Bianchi Types V and IV), then $k$ can be rescaled independently. The sign of $k$ changes under a flip of the 1-axis, so $k$ can be taken to be positive.

Table 17.1 lists the distinct possibilities for the 3 eigenvalues $n_e$, and gives the corresponding traditional Bianchi type. Missing from the Table is Type III, which is a special case of Type VI with $k = 1$, if $n_2$ and $n_3$ are scaled to $\pm 1$. Type III is distinguished by the fact that all three eigenvalues of the matrix $n^{ab}$ (the full matrix, including both symmetric and antisymmetric parts) degenerate to zero.
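
The classification in Table 17.1 amounts to counting signs, so it can be written as a short lookup. The following is a sketch under stated assumptions: $k$ is taken along the 1-direction (so a non-zero $k$ requires $n_1 = 0$), Type III is identified through the scale-invariant ratio $k^2 / |n_2 n_3| = 1$ implied by the scaling argument above, and the function name and string labels are illustrative.

```python
def bianchi_type(n, k=0.0, tol=1e-12):
    """Bianchi type label for eigenvalues n = (n1, n2, n3) and parameter k.

    Follows Table 17.1. The vector k_e is assumed to lie along the
    1-direction, so equation (17.94) forces n1 = 0 whenever k != 0.
    """
    signs = [0 if abs(x) < tol else (1 if x > 0 else -1) for x in n]
    has_k = abs(k) >= tol
    if has_k and signs[0] != 0:
        raise ValueError("equation (17.94) requires n1 = 0 when k is non-zero")
    pos, neg = signs.count(1), signs.count(-1)
    if neg > pos:
        # Flipping an axis reverses the sign of every eigenvalue.
        pos, neg = neg, pos
    zeros = 3 - pos - neg
    if zeros == 3:
        return "V" if has_k else "I"
    if zeros == 2:
        return "IV" if has_k else "II"
    if zeros == 1:
        if neg == 1:
            if not has_k:
                return "VI_0"
            # Type III: Type VI with k = 1 once n2 and n3 are scaled to +1, -1.
            ratio = k * k / abs(n[1] * n[2])
            return "III" if abs(ratio - 1.0) < 1e-9 else "VI"
        return "VII" if has_k else "VII_0"
    return "VIII" if neg == 1 else "IX"
```

For example, `bianchi_type((1, 1, 1))` gives Type IX and `bianchi_type((0, 1, -1), 0.5)` gives Type VI, while the same eigenvalues with `k = 1` fall into the Type III special case.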


17.4.4 Bianchi connections and curvatures 


The formulae in this section are valid for homogeneous spacetimes regardless of whether the structure constants $c^c_{ab}$ are spatially constant.
The non-vanishing tetrad-frame connections are, from equation (11.53),

$$ \Gamma_{ab0} = \Gamma_{a0b} = - \Gamma_{0ab} = \tfrac12 \dot\gamma_{ab} , \quad \Gamma_{abc} = \tfrac12 ( c_{cab} + c_{bac} - c_{abc} ) , \tag{17.96} $$

where the overdot represents the time derivative, $\dot\gamma_{ab} \equiv d\gamma_{ab}/dt$ (an ordinary derivative because $\gamma_{ab}$ varies only in time, not space), and $c_{cab} \equiv \gamma_{cd} c^d_{ab}$. The connections with one time 0 index are symmetric in their spatial indices $ab$, while the purely spatial connections $\Gamma_{abc}$ are antisymmetric in their first two indices $ab$. The tetrad frame is locally inertial (freely falling and non-rotating), as follows from the fact that the acceleration and precession both vanish, $\Gamma_{a00} = \Gamma_{[ab]0} = 0$. Altogether there are 6 + 9 = 15 distinct non-vanishing connections. If the structure coefficients $c^c_{ab}$ are spatially constant, then so are the spatial connections $\Gamma_{abc}$, but more generally the spatial connections can vary in space. For example, the spatial connections are spatially variable in all of the variants of the FLRW line-element given in Chapter 10 (although FLRW spacetimes can be realised as Bianchi spacetimes with constant structure coefficients; see §17.5). The spatial connections $\Gamma_{abc}$ also vary in time because, whereas $c^c_{ab}$ with one index raised is constant in time, the coefficients $c_{cab} = \gamma_{cd} c^d_{ab}$ with all indices lowered depend on time through the time-dependent metric $\gamma_{cd}$. Explicitly,

$$ \Gamma_{abc} = \tfrac12 ( \gamma_{cd} c^d_{ab} + \gamma_{bd} c^d_{ac} - \gamma_{ad} c^d_{bc} ) . \tag{17.97} $$

The unique non-vanishing contraction of the spatial connections $\Gamma_{abc}$ is

$$ \Gamma^a{}_{ba} = c^a_{ab} = \epsilon_{abc} n^{ca} = - 2 k_b , \tag{17.98} $$

which is constant in time.
The extrinsic curvature $K_{ab}$ is by definition

$$ K_{ab} \equiv \Gamma_{a0b} = \tfrac12 \dot\gamma_{ab} , \tag{17.99} $$

with trace

$$ K \equiv K^a_a = \tfrac12 \gamma^{ab} \dot\gamma_{ab} = \frac{d \ln \gamma^{1/2}}{dt} , \tag{17.100} $$

where $\gamma \equiv |\gamma_{ab}|$ is the determinant of the spatial tetrad metric. The last step of equations (17.100) is an application of equation (2.77). The proper spatial volume element is proportional to $\gamma^{1/2}$, so the trace $K$ measures the logarithmic rate of change of a comoving volume element. Positive $K$ means that the comoving volume element is expanding, while negative $K$ means that the comoving volume element is contracting.
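
The last equality in equation (17.100) holds for any symmetric, invertible tetrad metric, not only a diagonal one. The sketch below checks it by finite differences. The metric `gamma(t)` is a made-up example chosen only to be positive definite near $t = 1$, and the helper names are illustrative.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inverse3(m):
    """Inverse of a 3x3 matrix via cofactors (cyclic index form)."""
    d = det3(m)
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r1, r2 = (j + 1) % 3, (j + 2) % 3
            c1, c2 = (i + 1) % 3, (i + 2) % 3
            inv[i][j] = (m[r1][c1] * m[r2][c2] - m[r1][c2] * m[r2][c1]) / d
    return inv


def gamma(t):
    """Hypothetical time-dependent symmetric tetrad metric (illustration only)."""
    return [[t**2 + 1.0, 0.3 * t, 0.0],
            [0.3 * t, 2.0 + t, 0.1 * t**2],
            [0.0, 0.1 * t**2, 1.0 + 0.5 * t]]


def trace_K(t, h=1e-5):
    """K = (1/2) gamma^{ab} d(gamma_ab)/dt, equations (17.99) and (17.100)."""
    g_plus, g_minus, g_inv = gamma(t + h), gamma(t - h), inverse3(gamma(t))
    return 0.5 * sum(g_inv[a][b] * (g_plus[a][b] - g_minus[a][b]) / (2 * h)
                     for a in range(3) for b in range(3))


def dlog_sqrt_det(t, h=1e-5):
    """d ln(gamma^{1/2}) / dt by central differences."""
    return 0.5 * (math.log(det3(gamma(t + h))) - math.log(det3(gamma(t - h)))) / (2 * h)
```

The two functions agree to within the finite-difference error, which is the content of the identity $\gamma^{ab} \dot\gamma_{ab} = d \ln \gamma / dt$.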

The tetrad-frame Riemann curvature tensor $R_{klmn}$, which homogeneity requires to be spatially constant regardless of whether the structure coefficients are spatially constant, is, from equations (17.40),

$$ R_{a0b0} = - \dot{K}_{ab} + K_a^c K_{bc} , \tag{17.101a} $$
$$ R_{abc0} = \Gamma^d_{cb} K_{da} - \Gamma^d_{ca} K_{db} + ( \Gamma^d_{ab} - \Gamma^d_{ba} ) K_{cd} , \tag{17.101b} $$
$$ R_{abcd} = \hat{R}_{abcd} - K_{cb} K_{da} + K_{ca} K_{db} , \tag{17.101c} $$

where $\hat{R}_{abcd}$ is the restricted Riemann tensor,

$$ \hat{R}_{abcd} = \partial_a \Gamma_{cdb} - \partial_b \Gamma_{cda} + \Gamma^e_{cb} \Gamma_{eda} - \Gamma^e_{ca} \Gamma_{edb} + ( \Gamma^e_{ab} - \Gamma^e_{ba} ) \Gamma_{cde} . \tag{17.102} $$

If the structure coefficients are spatially constant, then the two derivative terms on the right hand side of equation (17.102) can be dropped. For spatially constant structure coefficients, equations (17.101) and (17.102) along with equations (17.96) give the Riemann tensor in terms of the structure coefficients $c^c_{ab}$ and the tetrad metric $\gamma_{ab}$, without the need for an explicit form for the vierbein $e^a{}_\alpha$. If the structure coefficients were derived from an explicit vierbein, then the usual symmetries of the Riemann tensor (with vanishing torsion) would be guaranteed. But the symmetries are ensured in any case, since for constant structure coefficients the Jacobi identity (17.94) implies that the restricted Riemann tensor satisfies the cyclic symmetry $\epsilon^{bcd} \hat{R}_{abcd} = 4 \gamma_{ae} n^{ef} k_f = 0$, which in turn ensures that the restricted Riemann tensor $\hat{R}_{abcd}$ is symmetric in $ab \leftrightarrow cd$, Exercise 11.6.
Contracting the Riemann tensor yields the Ricci tensor $R_{km}$,

$$ R_{00} = - \dot{K} - K^{ab} K_{ab} , \tag{17.103a} $$
$$ R_{a0} = \Gamma^b_{cb} K^c_a - \Gamma^c_{ab} K^b_c , \tag{17.103b} $$
$$ R_{ab} = \hat{R}_{ab} + \dot{K}_{ab} - 2 K_a^c K_{bc} + K_{ab} K , \tag{17.103c} $$

where $\hat{R}_{ab}$ is the restricted Ricci tensor,

$$ \hat{R}_{ab} \equiv \gamma^{cd} \hat{R}_{acbd} = \partial^c \Gamma_{cba} - \partial_a \Gamma^c_{bc} + \Gamma^e_{ba} \Gamma^c_{ec} + \Gamma^e{}_b{}^c \Gamma_{eca} + ( \Gamma^e{}_a{}^c - \Gamma^{ec}{}_a ) \Gamma_{bce} . \tag{17.104} $$

Again, if the structure coefficients are spatially constant, then the two derivative terms on the right hand side of equation (17.104) can be dropped. And again, for spatially constant structure coefficients, the Jacobi identity (17.94) ensures that the restricted Ricci tensor is symmetric, $\hat{R}_{[ab]} = - 2 \epsilon_{abc} n^{cd} k_d = 0$. Contracting the Ricci tensor yields the Ricci scalar $R$,

$$ R = \hat{R} + 2 \dot{K} + K^{ab} K_{ab} + K^2 , \tag{17.105} $$

where $\hat{R}$ is the restricted Ricci scalar.


17.4.5 Gravitational equations of motion for Bianchi spacetimes


The assumption of spatial homogeneity implies that the energy-momentum tensor of a Bianchi spacetime can vary in time but must be spatially constant. The components of the energy-momentum tensor $T_{mn}$ are the energy density $\rho(t)$, the energy flux $f_a(t)$, and the pressure $p_{ab}(t)$,

$$ T_{00} = \rho , \quad T_{a0} = - f_a , \quad T_{ab} = p_{ab} . \tag{17.106} $$

The trace of the energy-momentum tensor is $T = -\rho + 3p$ where $p \equiv \tfrac13 p^a_a$. In the special case of a perfect fluid at rest in the tetrad frame (which is not being assumed here), the energy flux $f_a$ would vanish, and the pressure tensor would be proportional to the spatial metric tensor, $p_{ab} = p \gamma_{ab}$. The ADM equations of motion for a Bianchi spacetime are, equations (17.52) and (17.63),

$$ \frac{d\gamma_{ab}}{dt} = 2 K_{ab} , \tag{17.107a} $$
$$ \frac{dK_{ab}}{dt} - 2 K_a^c K_{bc} + K_{ab} K + \hat{R}_{ab} = 4\pi \left[ 2 p_{ab} + \gamma_{ab} ( \rho - 3p ) \right] . \tag{17.107b} $$

The Hamiltonian constraint and the momentum constraints are

$$ \tfrac12 ( - K^{ab} K_{ab} + K^2 + \hat{R} ) = 8\pi\rho , \tag{17.108a} $$
$$ \Gamma^b_{cb} K^c_a - \Gamma^c_{ab} K^b_c = - 8\pi f_a . \tag{17.108b} $$

Equations (17.107) combine to yield a second order ordinary differential equation for the spatial tetrad metric $\gamma_{ab}(t)$. The spatial tetrad metric can be thought of as an ellipsoid, described by the lengths of its 3 axes, and 3 rotation angles. The general solution to equations (17.107) is a tetrad ellipsoid that evolves in both size and rotation. Equation (17.103a) gives an equation for the evolution of the expansion rate $K$ of the comoving volume element,

$$ \dot{K} = - K^{ab} K_{ab} - 4\pi ( \rho + 3p ) , \tag{17.109} $$

which is the same as the trace of the equation of motion (17.107b) minus twice the Hamiltonian constraint (17.108a). Equation (17.109) is the Raychaudhuri equation (17.76) in a Bianchi spacetime. Since the spatial metric $\gamma_{ab}$ is positive definite (all positive eigenvalues), $K^{ab} K_{ab}$ is positive.


Exercise 17.3. Geodesics in Bianchi spacetimes. Solve for the geodesics of particles in a Bianchi spacetime.
Solution. The effective Lagrangian of a particle can be taken to be

$$ L = \tfrac12 \gamma_{mn} p^m p^n , \tag{17.110} $$

where $p^m \equiv e^m{}_\mu \, dx^\mu / d\lambda$ is the tetrad-frame 4-momentum of the particle (not to be confused with pressure $p$). There are 3 integrals of motion $p_a$ associated with the 3 Killing vectors $\boldsymbol\gamma_a$, plus 1 integral of motion associated with conservation of rest mass $m$,

$$ p_a = \text{constant} \ ( a = 1, 2, 3 ) , \quad p^m p_m = - m^2 . \tag{17.111} $$

The rest mass equation implies that the time component of the tetrad-frame 4-momentum is

$$ p^0 = \sqrt{ \gamma^{ab} p_a p_b + m^2 } . \tag{17.112} $$

The time component of the momentum may equivalently be written

$$ p^t = \sqrt{ g^{\alpha\beta} p_\alpha p_\beta + m^2 } , \tag{17.113} $$

where $p_\alpha = e^a{}_\alpha p_a$. The coordinate 4-momentum is

$$ \frac{dx^\mu}{d\lambda} \equiv p^\mu = \{ p^t , p^\alpha \} = \{ p^0 , g^{\alpha\beta} p_\beta \} = \{ p^0 , \gamma^{ab} e_a{}^\alpha p_b \} . \tag{17.114} $$


17.5 Friedmann-Lemaître-Robertson-Walker spacetimes 


Friedmann-Lemaître-Robertson-Walker spacetimes are isotropic in addition to being homogeneous. FLRW spacetimes form a subclass of Bianchi spacetimes for which the 3 scale factors $a_a$ are all equal. Applying the vierbein from Table 17.2 with all three scale factors equal reveals that Type IX includes a strictly closed FLRW universe, while Types V and VII include an open FLRW universe. The special case $k = 0$, corresponding to Types I and VII_0, yields a flat FLRW universe.

Bianchi spaces have spatially constant structure coefficients by assumption, but none of the various versions of the FLRW line-element given in Chapter 10 have constant structure coefficients. The non-constancy of the structure coefficients poses no obstacle to casting the Friedmann equations into ADM form. For example, the isotropic (Poincaré) form (10.26) of the FLRW line-element is

$$ ds^2 = - dt^2 + \frac{4 a^2}{[ 1 + \kappa ( x^2 + y^2 + z^2 ) ]^2} ( dx^2 + dy^2 + dz^2 ) , \tag{17.115} $$

which is in ADM form with unit lapse and zero shift. The line-element (17.115) takes ADM form (17.90) with spatial tetrad metric

$$ \gamma_{ab} = a^2 \delta_{ab} , \tag{17.116} $$

and spatial vierbein

$$ e^a{}_\alpha = \frac{2 \delta^a_\alpha}{1 + \kappa ( x^2 + y^2 + z^2 )} . \tag{17.117} $$

The structure coefficients, equation (17.93), have zero symmetric part, and non-constant antisymmetric part given by

$$ k_e = \kappa x^e . \tag{17.118} $$

The extrinsic curvature $K_{ab}$ is

$$ K_{ab} = a \dot{a} \, \delta_{ab} , \tag{17.119} $$

and its trace $K$ is 3 times the Hubble parameter,

$$ K = \frac{3 \dot{a}}{a} . \tag{17.120} $$

The restricted Ricci tensor $\hat{R}_{ab}$ is

$$ \hat{R}_{ab} = 2 \kappa \delta_{ab} , \tag{17.121} $$

and the restricted Ricci scalar $\hat{R}$ is

$$ \hat{R} = \frac{6 \kappa}{a^2} . \tag{17.122} $$

The Hamiltonian constraint (17.131) is

$$ \frac{3}{a^2} ( \dot{a}^2 + \kappa ) = 8\pi\rho , \tag{17.123} $$

which reproduces the first of the Friedmann equations (10.30). The equations of motion reduce to

$$ \frac{\ddot{a}}{a} + 2 \frac{\dot{a}^2}{a^2} + 2 \frac{\kappa}{a^2} = 4\pi ( \rho - p ) . \tag{17.124} $$

With two thirds of the Hamiltonian constraint (17.123) subtracted, the equation of motion (17.124) becomes

$$ \frac{\ddot{a}}{a} = - \frac{4\pi}{3} ( \rho + 3p ) , \tag{17.125} $$

which reproduces the second of the Friedmann equations (10.30). Equation (17.125) is the Raychaudhuri equation (17.76) for an FLRW spacetime.
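
Equations (17.123) to (17.125) can be checked against familiar power-law solutions. The sketch below assumes a scale factor $a = t^q$ and an equation of state $p = w \rho$ (both assumptions introduced here for illustration), fixes $\rho$ from the Hamiltonian constraint (17.123), and returns the residuals of the equation of motion (17.124) and of the Raychaudhuri form (17.125).

```python
import math


def friedmann_residuals(q, w, t, kappa=0.0):
    """Residuals of equations (17.124) and (17.125) for a = t**q and p = w * rho.

    The density rho is taken from the Hamiltonian constraint (17.123).
    """
    a = t**q
    adot = q * t**(q - 1)
    addot = q * (q - 1) * t**(q - 2)
    rho = 3.0 * (adot**2 + kappa) / (8.0 * math.pi * a**2)
    p = w * rho
    res_motion = (addot / a + 2 * adot**2 / a**2 + 2 * kappa / a**2
                  - 4 * math.pi * (rho - p))
    res_raychaudhuri = addot / a + (4 * math.pi / 3) * (rho + 3 * p)
    return res_motion, res_raychaudhuri
```

For a flat matter-dominated universe ($q = 2/3$, $w = 0$) and a flat radiation-dominated universe ($q = 1/2$, $w = 1/3$) both residuals vanish to rounding error, as expected of solutions of the Friedmann equations.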


17.6 BKL oscillatory collapse 


An application of Bianchi spacetimes that is of particular relevance to black holes is the collapse of a 
Type VIII or IX Bianchi spacetime to a singularity, which shows a complicated oscillatory behaviour called 
Belinskii-Khalatnikov-Lifshitz (BKL) oscillations (Belinskii, Khalatnikov, and Lifshitz, 1970; Belinskii and 
Khalatnikov, 1971; Belinskii, Khalatnikov, and Lifshitz, 1972; Belinskii, Khalatnikov, and Lifshitz, 1982; 
Belinski, 2014). BKL oscillations are also called mixmaster oscillations. The prototypical BKL model is 
a Bianchi spacetime, which is spatially homogeneous, but Belinskii, Khalatnikov, and Lifshitz (1982) argue 
that oscillatory behaviour is generic for collapse to a singularity in general inhomogeneous spacetimes. See 
Berger (2002) and Belinski (2014) for reviews. 

In BKL collapse, the comoving volume element decreases monotonically to zero in a finite proper time, but one spatial axis always expands while the other two collapse. When one of the collapsing axes becomes sufficiently small, it “bounces” and starts expanding, while the previously expanding axis turns around and starts collapsing. Although the behaviour is deterministic, the sensitivity to initial conditions makes it look chaotic. Bounces occur irregularly in logarithmic time, so that there is an infinite number of bounces during the finite proper time that it takes to reach the singularity. Of course, this ignores quantum gravity, which presumably does something once either the density or the curvature reaches the Planck scale. Between BKL bounces, the three spatial axes expand or contract approximately as power laws $a_a \propto t^{q_a}$ in time $t$ with different exponents $q_a$, following a behaviour discovered by Kasner (1921), Exercise 17.4. BKL call the phases between bounces Kasner epochs.
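
The power-law behaviour can be made concrete with the standard one-parameter form of the Kasner exponents. This parametrization and the two sum rules $\sum q_a = 1$ and $\sum q_a^2 = 1$ are not derived in this section (they are the subject of Exercise 17.4), so the function below is a sketch that takes them as an assumption; the parameter name `u` follows common usage rather than this text.

```python
def kasner_exponents(u):
    """Kasner exponents (q1, q2, q3) in the standard one-parameter form.

    Assumed parametrization (not derived in this section), valid for u >= 0:
    q1 = -u / d, q2 = (1 + u) / d, q3 = u (1 + u) / d, with d = 1 + u + u^2.
    """
    d = 1.0 + u + u * u
    return (-u / d, (1.0 + u) / d, u * (1.0 + u) / d)


def kasner_sums(u):
    """Return (sum of q_a, sum of q_a squared); both should equal 1."""
    q = kasner_exponents(u)
    return sum(q), sum(x * x for x in q)
```

For any $u > 0$ exactly one exponent is negative, so as $t \to 0$ one axis grows while the other two shrink, and the comoving volume $a_1 a_2 a_3 \propto t$ decreases monotonically, matching the qualitative description above.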

The simplest BKL model is where the axes of the tetrad ellipsoid $\gamma_{ab}(t)$ of the Bianchi spacetime change in size, but they do not rotate. This is the model pursued in this section, and that you will explore in Exercise 17.7. Belinskii, Khalatnikov, and Lifshitz (1982) show that when rotation is included, each BKL bounce changes not only the expansion or contraction of each axis, but also changes its orientation. The behaviour between bounces remains Kasner.

The equations of motion (17.107) for a Bianchi spacetime show that non-rotating solutions for the tetrad metric $\gamma_{ab}$ exist if the restricted Ricci tensor $\hat{R}_{ab}$ and the pressure tensor $p_{ab}$ are diagonal in the frame where the tetrad metric is diagonal. For such solutions, the extrinsic curvature $K_{ab}$ is diagonal in the frame where the tetrad metric is diagonal, and the momentum constraints (17.108b) then imply that the energy flux $f_a$ vanishes. All Bianchi Types except IV include solutions for which the restricted Ricci tensor is diagonal.

The tetrad metric $\gamma_{ab}$ in the non-rotating diagonal frame is conveniently written in terms of scale factors $a_a$ along each of the three diagonal directions,

$$ \gamma_{ab}(t) = a_a^2 \, \delta_{ab} . \tag{17.126} $$

The corresponding diagonal extrinsic curvature $K_{ab}$ is then, from equation (17.107a),

$$ K_{ab} = a_a \dot{a}_a \, \delta_{ab} . \tag{17.127} $$

The pressure is diagonal by assumption, with pressure $p_a$ in the $a$'th direction,

$$ p_{ab} = p_a \gamma_{ab} \quad \text{(no sum over $a$)} . \tag{17.128} $$

The equation of motion (17.107b) for the extrinsic curvature $K_{ab}$ involves the restricted Ricci tensor $\hat{R}_{ab}$. A feature of Bianchi spacetimes (with spatially constant structure coefficients) is that the restricted Ricci tensor $\hat{R}_{ab}$, equation (17.104), is given in terms of the structure coefficients $c^c_{ab}$ and the tetrad metric $\gamma_{ab}$ without the need for an explicit expression for the vierbein. In most (Type VI with $k \neq 0$ is an exception) of the solutions for which $\hat{R}_{ab}$ is diagonal in the frame where the metric is diagonal, including the BKL solutions, the symmetric part $n^{(cd)}$ of the structure coefficients is diagonal in the same frame. In this case, the components of the restricted Ricci tensor $\hat{R}_{ab}$ (17.104) are, in terms of the scale factors $a_a$ and the parameters $n_e$ and $k_e$ of the structure coefficients, equation (17.92),

$$ \hat{R}_{11} = a_1^2 \left( \frac{n_2 n_3}{a_1^2} - \frac{2 k_1^2}{a_1^2} - \frac{2 k_2^2}{a_2^2} - \frac{2 k_3^2}{a_3^2} + \frac{n_1^2 a_1^2}{2 a_2^2 a_3^2} - \frac{n_2^2 a_2^2}{2 a_3^2 a_1^2} - \frac{n_3^2 a_3^2}{2 a_1^2 a_2^2} \right) , \tag{17.129a} $$
$$ \hat{R}_{23} = k_1 \left( \frac{n_2 a_2^2 - n_3 a_3^2}{a_1^2} \right) , \tag{17.129b} $$

and similarly with permuted indices for the other components. The off-diagonal components $\hat{R}_{23}$ and company must vanish for the restricted Ricci tensor to be diagonal. Equation (17.129b) shows that one possibility, which covers the majority of cases (Type VII with $k = k_1 \neq 0$ and $\sqrt{n_2} \, a_2 = \sqrt{n_3} \, a_3$ is an exception), is




Table 17.2: Bianchi spatial vierbein yielding a diagonal restricted Ricci tensor 


Type eta 
1 0 0 1 a 0 
I 010 010 
00 1 0 0 1 
1 0 0 1 
V 0 et? 0 0 “ ; 
0 0 e 0 en 
1 0 0 1 0 0 
II 0 1 0 12 
0 -z 1 0 0 1 
1 0 0 0 
Vio 0 cosha -—sinhz ; M x sinhg 
0 —sinhxz coshg 0 sinhz coshg 
1 0 0 1 0 0 
MI 0 ia te?e 0 e” e” 
0 -3 4 0 -1 1 
1 0 0 1 0 0 
VI 0 de (kt ))e 1 e` (k+1)z 0 (k+1)x (k+1)x 
0 oe ee Lee 0 e(k-1) (k-1)x 
1 0 0 1 0 0 
VIIo 0 cosx sing 0 cosx sing 
0 —sing cosg 0 —sing cosg 
1 0 0 1 0 0 
VII (with az = a3) 0 e-*cosx e** sing 0 e*cosx e** sing 
0 -e-**sing e— "cosa 0 ke sing e** cosg 
1 0 sinh y 1 0 0 
VIII 0 cosx sing cosh y —singtanhy cosx sing sech y 
0 —sinx cosg cosh y —cosxtanhy —sing cosg sech y 
1 0 sin y 1 0 0 
IX 0 cosx singcosy singtany cosx  sinzsecy 
0 —sing cosxcosy cosxtany —sing cosxsecy 
that the antisymmetric part $k_a$ of the structure coefficients vanishes identically (that is, $k = k_1 = 0$). This
is the solution pursued here, since it is the one that leads to the BKL solutions. In this case, where the
symmetric part of the structure coefficients is diagonal in the frame where the metric is diagonal, and where
the antisymmetric part vanishes identically, the ADM equations of motion (17.107) imply that the equation
of motion for the scale factor $a_1$ is


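$\frac{\ddot{a}_1}{a_1} + \frac{\dot{a}_1 \dot{a}_2}{a_1 a_2} + \frac{\dot{a}_1 \dot{a}_3}{a_1 a_3} + \frac{n_2 n_3}{a_1^2} + \frac{n_1^2 a_1^2}{2 a_2^2 a_3^2} - \frac{n_2^2 a_2^2}{2 a_3^2 a_1^2} - \frac{n_3^2 a_3^2}{2 a_1^2 a_2^2} = \frac{R_{11}}{a_1^2} = 4\pi \left( 2 p_1 + \rho - 3 p \right)$ , (17.130)

and like equations with permuted indices for $a_2$ and $a_3$. The Hamiltonian constraint is

$\frac{\dot{a}_2 \dot{a}_3}{a_2 a_3} + \frac{\dot{a}_3 \dot{a}_1}{a_3 a_1} + \frac{\dot{a}_1 \dot{a}_2}{a_1 a_2} + \frac{n_2 n_3}{2 a_1^2} + \frac{n_3 n_1}{2 a_2^2} + \frac{n_1 n_2}{2 a_3^2} - \frac{n_1^2 a_1^2}{4 a_2^2 a_3^2} - \frac{n_2^2 a_2^2}{4 a_3^2 a_1^2} - \frac{n_3^2 a_3^2}{4 a_1^2 a_2^2} = 8\pi\rho$ . (17.131)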
You will explore how these equations lead to BKL oscillatory collapse to a singularity in Exercise 17.7. 

A central part of the Belinskii, Khalatnikov, and Lifshitz (1982) argument that BKL oscillations are generic
in gravitational collapse to a singularity, as opposed to an artefact of the assumption of spatial homogeneity,
involves the dependence on time of the terms in the equations of motion (17.130) (which are really just the
Einstein equations). The terms involving scale factors $a_a$ but not their time derivatives act as “potentials”
that are responsible for BKL bounces when one of the collapsing scale factors becomes sufficiently small. The
potentials arise from the products of spatial connections in the restricted Ricci tensor $\hat{R}_{ab}$, equation (17.104).
The form of the dependence of the restricted Ricci tensor on the scale factors follows from the fact that the
restricted Ricci tensor (17.104) is proportional to two powers of the contravariant metric $\gamma^{cd}$, and two powers
of the covariant metric $\gamma_{cd}$, and that one of the indices on one of the powers of the covariant metric must
be one of the indices $a$ or $b$ of the Ricci component $\hat{R}_{ab}$. This form of the dependency of the Ricci tensor on
the metric is generic.

Even though they are not needed in order to write down the Einstein equations, Table 17.2 lists explicit
expressions for the spatial vierbein yielding a diagonal restricted Ricci tensor, which exist for all Bianchi
Types except IV. The coordinates are scaled so that the eigenvalues of the structure coefficients are all $n_a = 0$
or $\pm 1$. For the tabulated Types with $k = 0$, the time-space components $R_{0a}$ of the Ricci tensor also vanish
identically. For the tabulated Types with $k \neq 0$, the time-space components $R_{0a}$ of the Ricci tensor do not
all vanish identically, and their vanishing must be imposed as constraints on the initial conditions.

The notable exception mentioned at the beginning of §17.4 of a homogeneous space that cannot be realised
as a Bianchi space (the vierbein cannot be chosen such that structure coefficients $c^c{}_{ab}$ are spatially constant),
at least as long as the structure coefficients are taken to be real, is the closed ($\kappa > 0$) cylindrical space
realised by the spatial vierbein

$e^a{}_\alpha = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos(\sqrt{\kappa}\, x) & 0 \\ 0 & 0 & 1 \end{pmatrix}$ , $e_a{}^\alpha = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \sec(\sqrt{\kappa}\, x) & 0 \\ 0 & 0 & 1 \end{pmatrix}$ . (17.132)

An open ($\kappa < 0$) cylindrical space on the other hand can be realised as a Bianchi space of Type III, with
the spatial vierbein given in Table 17.2, with $\kappa = -k/2 = -1/2$.
Exercise 17.4. Kasner spacetime. The Kasner (1921) line-element is

$ds^2 = - dt^2 + a_x^2 \, dx^2 + a_y^2 \, dy^2 + a_z^2 \, dz^2$ , (17.133)

where $a_a(t)$ are functions only of time $t$. What Bianchi type is the Kasner line-element (17.133)? Show that
the Kasner line-element (17.133) solves the vacuum Einstein equations if

$a_a = |t|^{q_a}$ (17.134)

with

$q_x + q_y + q_z = 1$ , $q_x^2 + q_y^2 + q_z^2 = 1$ . (17.135)

Show that a parametric solution of equations (17.135) is

$q_x = \frac{-u}{1 + u + u^2}$ , $q_y = \frac{1 + u}{1 + u + u^2}$ , $q_z = \frac{u (1 + u)}{1 + u + u^2}$ . (17.136)

Plot the $q_a$ versus $u$. Show that, if $q_a$ are ordered such that $q_1 \le q_2 \le q_3$, then

$-\frac{1}{3} \le q_1 \le 0 \le q_2 \le \frac{2}{3} \le q_3 \le 1$ . (17.137)


Solution. Type I. 
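
As a quick sanity check on the parametrization (17.136), the following sketch evaluates the Kasner exponents with exact rational arithmetic and confirms, for a few sample values of u, that their sum and the sum of their squares both equal one, and that the sorted exponents obey the ordering (17.137). The function names and the sample values of u are illustrative choices, not part of the exercise.

```python
from fractions import Fraction


def kasner_exponents(u):
    """Return (q_x, q_y, q_z) from the parametrization (17.136)."""
    d = 1 + u + u * u
    return (-u / d, (1 + u) / d, u * (1 + u) / d)


def check(u):
    """Return (sum of q, sum of q^2, sorted q) for a rational parameter u."""
    q = kasner_exponents(Fraction(u))
    return sum(q), sum(x * x for x in q), sorted(q)


# Sample parameter values (arbitrary, chosen only for illustration).
for u in (0, 1, 2, Fraction(7, 3)):
    total, total_sq, ordered = check(u)
    print(u, total, total_sq, [str(x) for x in ordered])
```

Exact fractions are used so that the two conditions (17.135) can be tested with equality rather than within a floating-point tolerance.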


Exercise 17.5. Schwarzschild interior as a Bianchi spacetime. Inside the horizon of the Schwarzschild
geometry, where the horizon function $\Delta$ is negative, the Killing vector associated with time translation
symmetry becomes spacelike, so the spacetime has three spacelike Killing vectors, and is therefore spatially
homogeneous. The line-element inside the horizon is

$ds^2 = - dR^2 + |\Delta| \, dt^2 + r^2 (d\theta^2 + \sin^2\!\theta \, d\phi^2)$ , (17.138)

where $dR = dr / \sqrt{|\Delta|}$. The line-element (17.138) is in the form (17.90) with time coordinate $R$, spatial
coordinates $t$, $\theta$, $\phi$, spatial tetrad metric

$\gamma_{ab} = \mathrm{diag}(|\Delta|, r^2, r^2)$ , (17.139)

and spatial vierbein and inverse vierbein

$e^a{}_\alpha = \mathrm{diag}(1, 1, \sin\theta)$ , $e_a{}^\alpha = \mathrm{diag}(1, 1, 1/\sin\theta)$ . (17.140)

What Bianchi type is the Schwarzschild line-element (17.138)? Show that the Schwarzschild interior looks
like a Kasner geometry near the singularity.
Solution. Type V. The interior near the singularity is Kasner (17.133) with $t \propto r^{3/2}$, and $q_1 = -\frac{1}{3}$,
$q_2 = q_3 = \frac{2}{3}$.
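
The Kasner behaviour near the singularity can be checked numerically. The sketch below assumes the standard Schwarzschild horizon function $\Delta = 1 - 2M/r$ (not written out in this exercise), integrates $dR = dr/\sqrt{|\Delta|}$ in closed form from the singularity, and estimates the logarithmic slopes $d\ln a/d\ln R$ of the two scale factors $\sqrt{|\Delta|}$ and $r$ by finite differences at small $r$. The function names, the value M = 1, and the step size are illustrative assumptions.

```python
import math


def proper_time(r, M=1.0):
    """Proper time R from the singularity to radius r < 2M, from dR = dr / sqrt(|Delta|).

    With r = 2M sin^2(theta), the integral is R = 2M (theta - sin(theta) cos(theta)).
    """
    theta = math.asin(math.sqrt(r / (2.0 * M)))
    return 2.0 * M * (theta - math.sin(theta) * math.cos(theta))


def log_slopes(r, M=1.0, step=1.01):
    """Finite-difference estimates of d ln a / d ln R for a = sqrt(|Delta|) and a = r."""
    r2 = r * step
    dlnR = math.log(proper_time(r2, M) / proper_time(r, M))

    def a_t(x):
        # Scale factor along the t direction, sqrt(|Delta|) with Delta = 1 - 2M/x.
        return math.sqrt(2.0 * M / x - 1.0)

    q_t = math.log(a_t(r2) / a_t(r)) / dlnR
    q_angular = math.log(r2 / r) / dlnR
    return q_t, q_angular


# Slopes evaluated close to the singularity, r << 2M.
print(log_slopes(1e-6))
```

The two slopes should approach the Kasner exponents quoted in the solution as $r \to 0$, with corrections of relative order $r/2M$.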
Exercise 17.6. Kasner spacetime for a perfect fluid. A generalization of the Kasner line-element (17.133)
is

$ds^2 = - dt^2 + \sum_a a_a^2 \, (dx^a)^2$ , (17.141)

with scale factors

$a_a = a \, T^{q_a - 1/3}$ , (17.142)

where $a(t)$ and $T(t)$ are functions of time $t$, and the constants $q_a$ are Kasner coefficients satisfying
equation (17.135). The overall scale factor $a$ satisfies

$a = (a_1 a_2 a_3)^{1/3}$ . (17.143)

1. Show that the Einstein tensor corresponding to the Kasner line-element (17.141) is diagonal.
2. Show that the energy-momentum is that of a perfect fluid (i.e. the pressure is isotropic, with tetrad-frame
pressures $p_a \equiv T_{aa} = p$ all equal) provided that $a$ and $T$ are related by

$a = \left( 3K \frac{dt}{d\ln T} \right)^{1/3}$ , (17.144)

where $K$ is a real constant. Notice that the Kasner spacetime is not isotropic even though the energy-
momentum is isotropic.


3. Show that in this case of a perfect fluid the tetrad-frame Einstein equations are

$G_{00} = 3 \left( \frac{\dot{a}^2}{a^2} - \frac{K^2}{a^6} \right) = 8\pi\rho$ , (17.145a)

$G_{aa} = - \frac{2\ddot{a}}{a} - \frac{\dot{a}^2}{a^2} - \frac{3K^2}{a^6} = 8\pi p$ . (17.145b)

The Einstein equations (17.145) resemble those (10.29) of the FLRW geometry except that the curvature
terms $\kappa/a^2$ in FLRW are replaced by terms proportional to $-K^2/a^6$.

4. The Hubble parameter is defined by $H \equiv \dot{a}/a$ as in FLRW. Conclude that the evolution of the scale
factor $a(t)$ with time $t$ is determined by the same equation (10.70) as for FLRW,

$t = \int \frac{da}{a H}$ . (17.146)

5. Show that the Einstein equations (17.145) enforce that the energy-momentum of the perfect fluid satisfies
the first law of thermodynamics, similarly to FLRW, §10.9.2,

$\frac{d(\rho a^3)}{dt} + p \, \frac{d a^3}{dt} = 0$ . (17.147)

6. From the first law of thermodynamics, show that for a perfect fluid with equation of state $p/\rho = w =$
constant, the density $\rho$ is related to scale factor $a$ by, as in FLRW,

$\rho \propto a^{-3(1+w)}$ . (17.148)


7. More generally, as in FLRW, the energy-momentum may comprise multiple perfect fluid components $x$
satisfying the first law (17.147). The critical density $\rho_{\rm crit}$ is defined in terms of the Hubble parameter $H$
in the usual way by equation (10.46). Argue that the Kasner Einstein equation (17.145a) implies that

$\rho_K + \sum_{{\rm species}\ x} \rho_x = \rho_{\rm crit} \equiv \frac{3H^2}{8\pi}$ , (17.149)

which differs from FLRW, equation (10.72), in that the FLRW curvature density $\rho_k \propto a^{-2}$, equation (10.48),
is replaced by the Kasner curvature density $\rho_K \propto a^{-6}$,

$\rho_K \equiv \frac{3K^2}{8\pi a^6}$ . (17.150)

The Kasner curvature density $\rho_K$ behaves like a perfect fluid with positive energy and an ultra-hard
equation of state, $w = 1$.

8. Define $a_K$ and $H_K$ to be the cosmic scale factor and Hubble parameter at density-curvature equality,
where $\rho = \rho_K = \frac{1}{2} \rho_{\rm crit}$. Show that

$K = \frac{a_K^3 H_K}{\sqrt{2}}$ . (17.151)

9. From equation (17.144) conclude that $T$ equals an integral over scale factor $a$,

$\ln T = 3K \int \frac{da}{a^4 H}$ . (17.152)

Conclude that for a single perfect fluid with $p/\rho = w =$ constant,

$T \propto \frac{(a/a_K)^3}{\left[ 1 + \sqrt{1 + (a/a_K)^{3(1-w)}} \right]^{2/(1-w)}}$ . (17.153)
Conclude that the small and large a limits of T are, for w < 1, 
3 
aJaK a KaK, 
T/Tk > HR (17.154) 
a>axK. 


Hence conclude that the perfect fluid Kasner solution goes over to vacuum Kasner for small a and to 
FLRW for large a. The solution approximates vacuum Kasner at small a not because physical den- 
sities are going to zero, but rather because the density becomes dominated by the Kasner curvature 
density (17.150). 

10. For the particular case of a cosmological constant, w = —1, show that K = /A/3, and that 


$a/a_K = \sinh^{1/3}(\sqrt{3\Lambda}\, t)$ , $T/T_K = \tanh(\sqrt{3\Lambda}\, t/2)$ . (17.155)
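
Two of the steps in this exercise lend themselves to a numerical spot check. The sketch below assumes that the directional Hubble rates are $H_a = H + (q_a - \frac{1}{3})\, 3K/a^3$, which follows from $a_a = a\, T^{q_a - 1/3}$ together with $d\ln T/dt = 3K/a^3$, and that $G_{00}$ for a diagonal line-element is the sum of pairwise products of those rates. It then compares that sum with $3(H^2 - K^2/a^6)$, and checks that the closed form $x^3 / [1 + \sqrt{1 + x^{3(1-w)}}]^{2/(1-w)}$ with $x = a/a_K$ reduces to the hyperbolic tangent for $w = -1$. All numerical values and function names are arbitrary illustrative choices.

```python
import math


def directional_hubble(H, K, a, q):
    """Directional rates H_a = H + (q_a - 1/3) * 3K / a^3 (assumed form, see lead-in)."""
    return [H + (qa - 1.0 / 3.0) * 3.0 * K / a**3 for qa in q]


def pairwise_sum(h):
    """H_1 H_2 + H_2 H_3 + H_3 H_1, the G_00 of a diagonal homogeneous line-element."""
    return h[0] * h[1] + h[1] * h[2] + h[2] * h[0]


def t_ratio(x, w):
    """Closed form x^3 / [1 + sqrt(1 + x^(3(1-w)))]^(2/(1-w)), with x = a/a_K and w < 1."""
    return x**3 / (1.0 + math.sqrt(1.0 + x ** (3.0 * (1.0 - w)))) ** (2.0 / (1.0 - w))


# Kasner exponents from (17.136) with u = 2; H, K, a are arbitrary test values.
q = (-2.0 / 7.0, 3.0 / 7.0, 6.0 / 7.0)
H, K, a = 0.8, 0.3, 1.7
lhs = pairwise_sum(directional_hubble(H, K, a, q))
rhs = 3.0 * (H**2 - K**2 / a**6)
print(lhs, rhs)

# Cosmological constant case, w = -1, with s standing for sqrt(3 Lambda) t.
s = 1.3
x = math.sinh(s) ** (1.0 / 3.0)
print(t_ratio(x, -1.0), math.tanh(s / 2.0))
```

The agreement of the first pair relies only on the two Kasner conditions (17.135), so any other exponents satisfying them could be substituted for the sample triple.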


Exercise 17.7. Oscillatory Belinskii-Khalatnikov-Lifshitz (BKL) instability. The contracting phase
of a Type VIII or IX Bianchi spacetime provides a model of collapse to a singularity that illustrates how
complicated such a collapse can be (Belinskii, Khalatnikov, and Lifshitz, 1982). Type VIII and IX Bianchi
spacetimes have all three eigenvalues $n_a$ non-zero, and $k_a$ therefore necessarily all zero.

1. Define $q_a$ by

$q_a \equiv \frac{d\ln a_a}{d\ln|t|}$ , (17.156)

and let $q$ be their sum,

$q \equiv q_1 + q_2 + q_3 = \frac{d\ln(a_1 a_2 a_3)}{d\ln|t|}$ . (17.157)

Note that in a collapsing spacetime, $t$ is negative and tending to zero, and $\ln|t| \to -\infty$ as $|t| \to 0$, so $q_a$
is positive for a collapsing scale factor $a_a$. Define further

$A_a \equiv n_a a_a^2$ . (17.158)

Show that, for vanishing energy-momentum, the equations of motion (17.130) are

$\frac{dq_1}{d\ln|t|} + q_1 (q - 1) = \frac{1}{2} \left( \frac{t}{a_1 a_2 a_3} \right)^2 \left[ (A_2 - A_3)^2 - A_1^2 \right]$ , (17.159a)

$\frac{dq_2}{d\ln|t|} + q_2 (q - 1) = \frac{1}{2} \left( \frac{t}{a_1 a_2 a_3} \right)^2 \left[ (A_1 - A_3)^2 - A_2^2 \right]$ , (17.159b)

$\frac{dq_3}{d\ln|t|} + q_3 (q - 1) = \frac{1}{2} \left( \frac{t}{a_1 a_2 a_3} \right)^2 \left[ (A_1 - A_2)^2 - A_3^2 \right]$ , (17.159c)

and that the Hamiltonian constraint (17.131) is

$q_2 q_3 + q_3 q_1 + q_1 q_2 = \frac{1}{4} \left( \frac{t}{a_1 a_2 a_3} \right)^2 \left[ 2 (A_1^2 + A_2^2 + A_3^2) - (A_1 + A_2 + A_3)^2 \right]$ . (17.160)


2. In gravitational collapse, the scale factors $a_a$ might be expected to become small. Argue that if the right
hand sides of equations (17.159) and (17.160) are neglected, then the solution is the Kasner solution,
with $q_a$ constant, satisfying equation (17.135).

3. In the Kasner solution, the $q_a$ satisfy the inequalities (17.137). Argue that if $q_a$ are ordered $q_1 <
q_2 < q_3$, then Kasner evolution tends to drive the $A_a$ so that $|A_1| \gg |A_2| \gg |A_3|$. Then argue from
equations (17.159) that the effect of the right hand sides is to drive smaller $q_a$ to increase, and larger
$q_a$ to decrease.

4. Explore the evolution of the scale factors $a_a$ numerically. Choose either Type VIII or Type IX: they are
equally fun. You will find better numerical behaviour by transforming to a time variable $\tau$ defined by

$\frac{d}{d\tau} \equiv a_1 a_2 a_3 \frac{d}{dt} = \frac{a_1 a_2 a_3}{t} \frac{d}{d\ln|t|}$ , (17.161)

which increases as $t$ increases and $\ln|t|$ decreases. Define

$Q_a \equiv - \frac{1}{2} \frac{d\ln|A_a|}{d\tau} = \frac{a_1 a_2 a_3}{-t} \, q_a$ , (17.162)
Figure 17.1 Left panel: Cosmic scale factors $a_a$ in BKL collapse of a Bianchi Type IX spacetime (with eigenvalues
normalized to $n_a = 1$). The thick (black) line is the geometric average $(a_1 a_2 a_3)^{1/3}$ of the scale factors, which is
proportional to the cube root of the comoving volume element. Right panel: Logarithmic derivatives $q_a$ of the scale
factors, equation (17.156). The thick (black) line is the sum $q = q_1 + q_2 + q_3$ of the logarithmic derivatives, which
asymptotes to 1 as collapse proceeds. The initial conditions were $a_1 = a_2 = a_3 = 1$ and such that the comoving volume
element was initially barely collapsing, Q1 = $, Q2 = 0, Q3 = z, whence 5° Qa = 5 In the initial conditions, 
the Hamiltonian constraint (17.164) determines the third Qa in terms of the other two. Integration established a 
posteriori that the initial time was to = —1.6859987. By the end of the plotted era, where r = 10°, the comoving 
volume element had shrunk to ajazga3 ~ 107229. 


which has the same sign as $q_a$. Show that the equation of motion for $A_1$ is

$\frac{dQ_1}{d\tau} = \frac{1}{2} \left[ A_1^2 - (A_2 - A_3)^2 \right]$ , (17.163)

and similarly for $A_2$ and $A_3$. Show that the Hamiltonian constraint is

$Q_2 Q_3 + Q_3 Q_1 + Q_1 Q_2 = \frac{1}{2} (A_1^2 + A_2^2 + A_3^2) - \frac{1}{4} (A_1 + A_2 + A_3)^2$ . (17.164)

The equation of motion for $t/(a_1 a_2 a_3)$ tends to become unstable when $a_1 a_2 a_3$ is small. These circumstances
are precisely those where $q = 1$ to good accuracy. Thus when instability arises for small $a_1 a_2 a_3$,
it can be worked around by enforcing $q = 1$.


5. Show that for energy-momentum with equation of state $p = w\rho$, the proper energy density $\rho$ varies as

$\rho \propto (a_1 a_2 a_3)^{-(1+w)}$ . (17.165)

Show that including energy-momentum in the equations of motion amounts to adding terms proportional
to $(a_1 a_2 a_3)^2 \rho$ on the right hand sides of equations (17.163). By comparing these terms to the largest
$A_a$ terms on the right hand side, conclude that the influence of energy-momentum is sub-dominant as
$|t| \to 0$.
6. Following Belinskii, Khalatnikov, and Lifshitz (1970), show that as $|t| \to 0$ the collapse may be described
as a sequence of Kasner epochs punctuated by bounces. The Kasner exponents $q_a$ before a bounce are
given by equation (17.136) for some $u > 1$. After the bounce, the exponents $q_a$ satisfy the same equation
with $u$ flipped,

$u \to -u$ . (17.166)

For $u > 2$ the flip reorders the smaller pair of $q_a$ while the largest $q_a$ remains the largest. For $1 < u < 2$
the flip takes the smallest $q_a$ to the largest, leaving the other pair in original order. To prepare for the
next bounce, reset $u > 1$ by transforming

$u \to \begin{cases} (u - 1)^{-1} & 1 < u \le 2 , \\ u - 1 & u \ge 2 . \end{cases}$ (17.167)
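
The bounce rule can be exercised directly. The sketch below applies the flip $u \to -u$ to the parametrization (17.136), then resets $u$ using $u - 1$ for $u \ge 2$ and $1/(u - 1)$ for $1 < u < 2$, and checks that the exponents computed from the reset $u$ are a permutation of the exponents obtained from the flip. The starting value of $u$ and the function names are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def kasner_exponents(u):
    """Return (q_x, q_y, q_z) from the parametrization (17.136)."""
    d = 1 + u + u * u
    return (-u / d, (1 + u) / d, u * (1 + u) / d)


def bounce(u):
    """Apply one bounce to u > 1.

    Returns the exponents after the flip u -> -u, and the reset parameter
    (u - 1 for u >= 2, otherwise 1/(u - 1)) to use for the next epoch.
    """
    after = kasner_exponents(-u)
    new_u = u - 1 if u >= 2 else 1 / (u - 1)
    return after, new_u


u = Fraction(17, 5)  # hypothetical starting value
history = []
for _ in range(4):
    after, new_u = bounce(u)
    history.append((u, after, new_u))
    print(u, [str(x) for x in after], new_u)
    u = new_u
```

A rational starting value eventually reaches $u = 1$, where the reset is undefined, so the loop is kept short; an irrational (floating-point) starting value avoids that termination in principle.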


Solution. Figure 17.1 illustrates an example computation. To avoid premature overflow, the computation
used logarithmic quantities $\ln a_a$ and $\ln[|t|/(a_1 a_2 a_3)]$ as variables.
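
A minimal numerical sketch of the vacuum system is given below. It is not the computation behind Figure 17.1: it assumes the evolution equations $dQ_a/d\tau = \frac{1}{2}[A_a^2 - (A_b - A_c)^2]$ and $d\ln|A_a|/d\tau = -2 Q_a$, uses a plain fixed-step fourth-order Runge-Kutta integrator, takes all $n_a = 1$ (Type IX), and starts from a hypothetical initial condition chosen to satisfy the Hamiltonian constraint in the form $Q_2 Q_3 + Q_3 Q_1 + Q_1 Q_2 = \frac{1}{2} \sum A_a^2 - \frac{1}{4} (\sum A_a)^2$. The constraint residual is printed as a check on the integration.

```python
import math

# Eigenvalues n_a of the structure coefficients; all equal to 1 is assumed here (Type IX).
N = (1.0, 1.0, 1.0)


def amplitudes(state):
    """A_a = n_a exp(ln|A_a|) from the first three entries of the state."""
    return [n * math.exp(l) for n, l in zip(N, state[:3])]


def rhs(state):
    """Right hand side for state = [ln|A_1|, ln|A_2|, ln|A_3|, Q_1, Q_2, Q_3]."""
    A = amplitudes(state)
    Q = state[3:]
    dlnA = [-2.0 * q for q in Q]  # d ln|A_a| / d tau = -2 Q_a
    dQ = [0.5 * (A[i] ** 2 - (A[(i + 1) % 3] - A[(i + 2) % 3]) ** 2) for i in range(3)]
    return dlnA + dQ


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, c):
        return [x + c * y for x, y in zip(state, k)]

    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2.0))
    k3 = rhs(shifted(k2, h / 2.0))
    k4 = rhs(shifted(k3, h))
    return [x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]


def constraint(state):
    """Residual of the Hamiltonian constraint; zero when the constraint holds."""
    A = amplitudes(state)
    Q = state[3:]
    lhs = Q[1] * Q[2] + Q[2] * Q[0] + Q[0] * Q[1]
    return lhs - 0.5 * sum(a * a for a in A) + 0.25 * sum(A) ** 2


# Hypothetical initial condition: A_a = 1, Q = (1, 0, -3/4), which satisfies the constraint.
initial = [0.0, 0.0, 0.0, 1.0, 0.0, -0.75]
state = list(initial)
for _ in range(2000):
    state = rk4_step(state, 1e-3)
print(constraint(state))
```

Evolving $\ln|A_a|$ rather than $A_a$ follows the suggestion in the solution to work with logarithmic variables; following the collapse over many bounces would call for an adaptive step size, which this sketch does not attempt.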


17.7 Numerical considerations 


Numerical experiments during the 1990s established that the ADM equations, whether in the original form
with momenta $\pi_{\alpha\beta}$, or in the York-modified form with momenta $K_{\alpha\beta}$, are numerically unstable.

The most popular formalism for long-term evolution of spacetimes is the Baumgarte-Shapiro-Shibata-
Nakamura (BSSN) formalism (Shibata and Nakamura, 1995; Baumgarte and Shapiro, 1998), and variants
thereof (Shinkai, 2009; Baumgarte and Shapiro, 2010; Brown et al., 2012). The BSSN formalism differs from
ADM in that it adjoins equations of motion (17.181) for a vector set of 3 BSSN momentum variables $\hat{H}_\alpha$,
and treats the definition (17.180) of $\hat{H}_\alpha$ in terms of derivatives of the metric as a constraint equation. The
BSSN equation was discussed in the language of multivector-valued differential forms in §16.16.2.

The superior numerical stability of the BSSN over the ADM formalism can be attributed to the fact that 
BSSN is strongly hyperbolic, §17.7.1, whereas ADM is only weakly hyperbolic (Kreiss and Ortiz, 2002; 
Nagy, Ortiz, and Reula, 2004). 


17.7.1 Strong hyperbolicity 


For numerical work, it is not sufficient to have an integrable set of equations. Integrability does not guarantee
good numerical behavior, if small errors in the initial conditions blow up exponentially. A condition that
guarantees good numerical behavior is that the system be strongly hyperbolic (Kreiss and Ortiz, 2002; Nagy,
Ortiz, and Reula, 2004; Hilditch, 2013). Loosely speaking, strong hyperbolicity requires that perturbations
to initial conditions propagate as waves $\sim e^{i\omega t}$ rather than growing exponentially $\sim e^{\omega t}$.

Strong hyperbolicity for a first-order system of partial differential equations is defined as follows (Hilditch,
2013). Let $u_i$ denote a set of variables satisfying the first-order system

$\frac{\partial u_i}{\partial t} = A^\alpha_{ij} \frac{\partial u_j}{\partial x^\alpha} + \cdots$ , (17.168)

where $\cdots$ does not involve derivatives of the variables. The matrix $A^\alpha_{ij}$ for each spatial coordinate direction
$\alpha$ is called the principal symbol of the system. The system is called weakly hyperbolic if, for every direction
$\alpha$, all the eigenvalues of the principal symbol are real. The system is called strongly hyperbolic if in addition,
for every $\alpha$, the eigenvectors of the principal symbol form a complete set, and the eigenvector matrix and
its inverse are uniformly bounded.
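
The distinction between weak and strong hyperbolicity can be illustrated with a toy case. The sketch below classifies a single constant real 2×2 matrix, standing in for the principal symbol in one direction, by whether its eigenvalues are real and whether it has a complete set of eigenvectors. It is only an illustration under those assumptions: it does not address the uniform boundedness condition across directions, and the function name and tolerance are arbitrary.

```python
def classify_2x2(a, b, c, d, tol=1e-12):
    """Classify the constant 2x2 matrix [[a, b], [c, d]] as a principal symbol.

    Eigenvalues are roots of x^2 - trace x + det = 0, so they are real exactly
    when the discriminant trace^2 - 4 det is non-negative.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < -tol:
        return "not hyperbolic"       # complex eigenvalues
    if disc > tol:
        return "strongly hyperbolic"  # distinct real eigenvalues: complete eigenvectors
    # Repeated real eigenvalue: a complete set of eigenvectors exists only if
    # the matrix is a multiple of the identity.
    if abs(b) <= tol and abs(c) <= tol and abs(a - d) <= tol:
        return "strongly hyperbolic"
    return "weakly hyperbolic"


# First-order form of the 1D wave equation: eigenvalues +1 and -1.
print(classify_2x2(0.0, 1.0, 1.0, 0.0))
# Jordan block: repeated real eigenvalue with a single eigenvector.
print(classify_2x2(1.0, 1.0, 0.0, 1.0))
# Rotation generator: purely imaginary eigenvalues.
print(classify_2x2(0.0, 1.0, -1.0, 0.0))
```

The Jordan-block case is the one that matters for the ADM versus BSSN comparison in spirit: real eigenvalues alone do not prevent perturbations from growing, because the missing eigenvector allows growth that is polynomial in time and unbounded in wavenumber.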


17.8 BSSN formalism 


BSSN reorganizes the second derivative structure of the spatial Einstein equations so that their behaviour
as wave equations for the spatial metric $g_{\alpha\beta}$ is manifest, equation (17.183). Only the 5 trace-free spatial
Einstein equations are genuine wave equations. The spatial trace of the Einstein equations is a non-wave
equation, the Raychaudhuri equation (17.76).

The Hamiltonian structure of the BSSN formalism was explored previously, in the language of multivector- 
valued differential forms, in §16.16.2. 


17.8.1 BSSN momentum equation 


In the BSSN formalism, the momentum equation is treated as an equation of motion for the evolution with
time $t$ of a momentum variable $\hat{H}_\alpha$. To identify what this momentum variable $\hat{H}_\alpha$ is, it is most straightforward
to start not with equation (17.40b) for the Riemann components $R_{abc0}$, as does ADM, but rather with
equation (17.40c) for $R_{c0ab}$. The $ab \leftrightarrow c0$ symmetry of the Riemann tensor $R_{abc0}$ means that the two
expressions are identical when expanded in terms of vierbein derivatives, but the two expressions package
the connections and their derivatives in different ways. The restricted contribution $\hat{R}_{c0ab}$ to the Riemann
tensor, equation (17.42), involves $\partial_0 \hat\Gamma_{abc}$, which is a time derivative of an expression $\hat\Gamma_{abc}$ involving spatial
derivatives of the vierbein, which looks promising as a precursor of an object whose time evolution might
be governed by a momentum equation. However, the other derivative $\partial_c \hat\Gamma_{ab0}$ in $\hat{R}_{c0ab}$ also includes mixed
time-space second derivatives of the vierbein.

As with the earlier ADM equations of motion (17.55) for $g_{\alpha\beta}$ and (17.68) for $K_{\alpha\beta}$, the identity of the object
whose time evolution is being governed becomes manifest in the coordinate frame, where the spatial tetrad
is set equal to the spatial coordinate tangent axes, equation (17.14). The desired equation for the
coordinate-frame Riemann components $R_{\gamma t \alpha\beta}$ can be derived from a combination of equations (17.40c) and (17.40d),
but is obtained more directly from the general equation (17.34), with the restricted Riemann components
from equation (17.35),


Rytap = Rytop + Keiko, = KatKoy > (17.169a) 
. Ol née. afa _ oe ke 
Rytap = oe — ae + Peal’, a T (17.169b) 


where the greek indices are a reminder that this is a coordinate-frame expression, and where the final terms 
in equations (17.34) and (17.35) vanish because of the symmetry T4, = IY of coordinate connections 
(Christoffel symbols), for vanishing torsion. As shown below, equation (17.174), the index on the restricted 
coordinate connections in equation (17.169b) is raised with the spatial coordinate metric, not with the full 
metric, 


PS, = 9° Papu » (17.170) 


Thanks to the ADM gauge condition e°,, = 0, the non-vanishing components of the coordinate-frame gen- 
eralized extrinsic curvature Kyy = e!ye™ ue" Kimn are, similarly to the tetrad-frame generalized extrinsic 
curvature Kımn, those whose first two indices are one spatial a and one time t index, 


Kot = aKa ; (17.171) 


which like the tetrad-frame generalized extrinsic curvature is antisymmetric in its first two indices at. The 
extrinsic curvature is as usual Kag = Kaog = eae? g Kaob, which is symmetric in a8, while the acceleration 
is as usual Ka = Kaoo = e'a Kaoo. The tensor Kat in equation (17.169a) is 


Kot = Kaot = e”, Kaom = aKa — B Kas - (17.172) 


The decomposition Papo = ji + K),,, equation (17.27), holds for coordinate connections, but the coor- 
dinate connections differ from the tetrad connections by a vierbein derivative, equation (11.44). Thus the 
restricted coordinate-frame connections I'),,, are related to the restricted tetrad-frame connections limn by 


Pig = ele” ne oa Îimn) - (17.173) 


The vierbein derivative doan with first index 0 and second index a spatial vanishes because of the ADM 
gauge condition e°a = 0. For convenience, define the restricted coordinate connection with first index a tetrad 
index k by Cee = eo Tyne Since doan = 0 it follows that the coordinate-frame connection io vanishes 
like its tetrad-frame counterpart. Consequently the product of coordinate connections Pral contracted 


with the full coordinate metric g7? equals the product Tal, contracted with the spatial metric 9°‘, 
Pratl g = Toal ha = Paat Sx = al ’ (17.174) 


which justifies equation (17.170). 
The coordinate connection Ijas antisymmetrized over its spatial indices a is an antisymmetric spatial 
tensor, which can be denoted Fag, 
The tensorial nature of $F_{\alpha\beta}$ follows from the fact that the coordinate-frame curl of a vector is a tensor,
Exercise 2.6. Expression (17.169b) for the restricted Riemann tensor can thus be written


a a or aß a ee igs a a 
Êntag = Ô,Fag — ei E ees ee he (17.176) 


in which the only term containing mixed time-space second derivatives is OV ias] /3t. In equation (17.176), 
the coordinate connection Piaj symmetrized over its spatial indices is 


$\hat\Gamma_{(\alpha\beta)t} = \frac{1}{2} \frac{\partial g_{\alpha\beta}}{\partial t}$ . (17.177)


Contracting the Riemann tensor Rytag yields the time-space components Ria of the Ricci tensor, 


Ria = Ria — KP Kap + Kak , (17.178a) 

a `, Iag a g i g 
Ria = DP Foa + Pea g +ECO ae, (17.178b) 
in which the only term containing mixed time-space derivatives is OT taal” /Ot. In terms of derivatives of the 
metric, Dias] Bis 


j = 1 ey (Bs 2a) _ 1 Jay O(99"") 


2 Ox® — Ox 2g Ox? CRG) 


Dias 
where g = |ga| is the determinant of the spatial metric. Equations (17.178) show that the variable Dial’ 
appears to be the desired BSSN momentum variable. However, it is common to use a variant BSSN mo- 
mentum variable H, in which the spatial metric is scaled by some power of the spatial metric determinant 
g, 


a a polng x (1+ p)Olng Ja ð (gt-P)/? g87) 
A, = [agf 4 = 210417 = y 17.180 
P T 2 Axe ap ze g0) eB i ( ) 
with p an adjustable constant. For example, the choice p = —1 recovers (twice) the original momentum 


variable Tio’, the choice p = 0 yields a spatial Ricci tensor (17.182) whose only explicit second spatial 
derivatives are a Laplacian of the spatial metric, and the choice p = 1/3 gives an H., that depends only on 
1/3q... with unit determinant (and its inverse g!/g°7). In the BSSN formalism, 
the evolution of the momentum variable Fa is governed by the momentum equation 


the scaled spatial metric g7 


LOH, _ (+p) Ing 
2 ðt 4 otdx 


+ Êsa a — PO Pasg + Do Fag + KP Kop — KatK + 80Tia|. (17.181) 


In the BSSN formalism, equation (17.180) is a constraint equation, which must be imposed in the initial 
conditions, but which is satisfied automatically thereafter. 
17.8.2 BSSN spatial Ricci tensor


In the BSSN formalism, the spatial components $\hat{R}_{\alpha\beta}$ of the restricted Ricci tensor are recast in terms of the
BSSN variable $\hat{H}_\alpha$,


^ p lng g® gag 10H, 10H, as A ag A sik disp a8 
Ras = £8 i571 Ea as Eang tL al ys 

£ 2 0x°dxP 2 OxVOx° ai 2 Ox j 2 Ox? apr by t Ba ti Bh and + 188 
(17.182) 


The only explicit second spatial derivatives in the expression (17.182) for $\hat{R}_{\alpha\beta}$ are a double gradient of
the spatial metric determinant $g$, and a spatial Laplacian of the spatial metric $g_{\alpha\beta}$, the remaining second
derivatives having been absorbed into first spatial derivatives of the BSSN momentum variable $\hat{H}_\alpha$.

When the restricted spatial Ricci tensor (17.182) is inserted into the equation of motion (17.68) for the
extrinsic curvature $K_{\alpha\beta}$, the spatial Laplacian combines with a second time derivative coming from $\partial K_{\alpha\beta}/\partial t$
to form a 4-dimensional wave equation for the spatial metric $g_{\alpha\beta}$. Thus the character of the spatial Einstein
equations as wave equations for the spatial metric $g_{\alpha\beta}$ is manifest in the BSSN formalism. Explicitly, the
spatial Einstein equations, which are just the equations of motion (17.68) for the spatial extrinsic curvature
$K_{\alpha\beta}$, are
a ae o? A In(ag?/? 1 
( ) g’ | at lag (Tap = Zoest) l (17.183) 


2 | \aðt 0x1 Ox Ox Ox? 


where ... signifies terms involving no higher than first time or space derivatives of the lapse $\alpha$, the shift $\beta^\alpha$,
the spatial coordinate metric $g_{\alpha\beta}$, the extrinsic curvatures $K_{\alpha\beta}$, or the BSSN variable $\hat{H}_\alpha$.

Commonly, only the 5 trace-free equations of motion for $K_{\alpha\beta}$ are used in the BSSN formalism, the trace
equation being replaced by the Raychaudhuri equation (17.76).


17.8.3 BSSN summary 


To summarize, the dynamical variables in the BSSN formalism are the spatial metric $g_{\alpha\beta}$, the spatial extrinsic
curvature $K_{\alpha\beta}$, and the spatial BSSN variable $\hat{H}_\alpha$. The equations of motion for the dynamical variables
are:

1. the 6 equations (17.55) for the spatial metric $g_{\alpha\beta}$;

2. the 5 equations constituting the trace-free part of the 6 equations (17.68) for the spatial extrinsic
curvature $K_{\alpha\beta}$;

3. the 1 Raychaudhuri equation (17.76) for the trace $K$ of the extrinsic curvature;

4. the 3 equations (17.181) for the BSSN variable $\hat{H}_\alpha$.

The constraint equations, which must be arranged to be satisfied on the initial hypersurface, but which are
thereafter satisfied automatically, are:

1. the 1 Hamiltonian constraint (17.74a);

2. the 3 momentum constraints (17.74b);

3. the 3 constraints (17.180) on the BSSN variable $\hat{H}_\alpha$.

The Hamiltonian and momentum constraints are differential constraints, elliptic partial differential equations
of second order in the spatial coordinates, which are in general non-trivial to set up. The constraints on $\hat{H}_\alpha$ on
the other hand are algebraic constraints, which are straightforward to impose once the differential constraints
are solved.


17.9 Pretorius formalism 


Pretorius (2005) proposed an elegant 4-dimensional version of the BSSN formalism. A natural 4-dimensional
generalization of the BSSN momentum variable $\hat{H}_\alpha$ defined by equation (17.180) is (with $p = 0$)


dn yg Grp O(g 9) 
H, = Taa = cia + Ər" = = aah j (17.184) 


in which $g \equiv |g_{\kappa\lambda}|$ is the determinant of the full 4-dimensional metric. If the coordinates $x^\kappa$ are treated as
four scalars (they are not; and neither do they form a 4-vector), then the contravariant components $H^\kappa$ can
be written as minus the (torsion-free) d’Alembertian $\Box \equiv D^\lambda D_\lambda$ of the coordinates,
; 1 O6/-g9**) 1 ð yn Ox" . 
Bie v _ [~g gò = —On" 17.185 
Vg Ox yg Ox I9 Aah E ( ) 


which motivates calling $H^\kappa$ the harmonic function. The coordinates $x^\kappa$ are not scalars, and neither is the
harmonic function $H^\kappa$ a tensor. In the Pretorius formalism, the Ricci tensor takes the form


Li gka LOR, One as N 7 a 
Rea = 59 ae l 3 par | 2 Oar Pee HI kT am I epu t I Tuwa (17-186) 


in which the only explicit second derivatives are those in the $g^{\mu\nu} \partial^2 g_{\kappa\lambda} / \partial x^\mu \partial x^\nu$ term. This second derivative
term has the form of a 4-dimensional coordinate wave operator acting on the 4-dimensional coordinate metric
$g_{\kappa\lambda}$. The Einstein equations are as usual

$R_{\kappa\lambda} = 8\pi \left( T_{\kappa\lambda} - \frac{1}{2} g_{\kappa\lambda} T \right)$ . (17.187)


Despite the covariant 4-dimensional character of the Pretorius formalism, it is still possible to make ADM
gauge choices, §17.1, that is, to foliate the spacetime into hypersurfaces of constant time $t$, and to work in
an ADM tetrad whose time axis $\gamma_0$ is the future-pointing unit normal to hypersurfaces of constant time $t$. In
the ADM tetrad, the tetrad-frame harmonic function $H_k \equiv e_k{}^\kappa H_\kappa$ with $H_\kappa$ defined by equation (17.184) is,
in terms of the vierbein derivatives $d_{klm}$ defined by equation (11.33), the tetrad-frame restricted connections
$\hat\Gamma_{klm}$, and the generalized extrinsic curvature $K_{lmn}$,


Hk = en Ay, = diem” + Vem” = drm” + Ties + Keim” = dko? T A, = Ky ; (17.188) 


where Hy = dpa? + Vat = {0, Ha} = {0, €a° Ha}, and H, is the BSSN momentum variable defined by 
equation (17.180) with $p = 0$. The tetrad-frame components $H_k$ of the harmonic function are


1 
Ho = — a -K 5 (17.189a) 
Q 


1 x 
H, = —€aa 8? + He — Ka l. (17.189b) 
Q 


Pretorius (2005) points out that the arbitrariness of the choice of coordinates $x^\kappa$ translates into an
arbitrariness in the choice of the 4 components $H_\kappa$ of the harmonic function. Thus instead of treating the
lapse and shift as arbitrarily adjustable functions, the harmonic functions $H_\kappa$ can be adjusted arbitrarily.
For example, the harmonic function can be chosen to vanish identically, $H_\kappa = 0$, a coordinate condition first
proposed by Fock (1957). Equations (17.189) can then be interpreted as evolution equations for the lapse
$\alpha$ and the shift $\beta^\alpha$. In this case the 4 Einstein equations with at least one temporal index are not used as
evolution equations.

However, it is also possible (Bona et al., 2003) to follow the BSSN strategy of choosing the lapse and
shift arbitrarily, in which case the 4 Einstein equations (17.186) with at least one temporal index provide
evolution equations for the harmonic function $H_\kappa$, and equations (17.189) are constraint equations that must
be imposed on the initial hypersurface, but which are guaranteed thereafter.

As in ADM and BSSN, the Hamiltonian and momentum constraints, along with the conditions (17.185),
must be arranged to be satisfied on the initial hypersurface.


17.10 M+N split


In situations where fields are highly relativistic, such as inside black holes, or when following gravitational
waves, it can be natural to work in a frame where some of the tetrad axes are null. A null direction $\gamma_v$ is
orthogonal to itself, $\gamma_v \cdot \gamma_v = 0$, so it is not possible to carry out a 3+1 split of spacetime into a 1-dimensional
space aligned with $\gamma_v$ and a 3-dimensional space orthogonal to it. It is however possible, as in the
Newman-Penrose formalism, to carry out a 2+2 split of spacetime into a 2-dimensional space spanned by two null
directions $\gamma_v$ and $\gamma_u$, and a 2-dimensional space orthogonal to the null directions.

This section 17.10 considers the general case of an M+N split of an M+N-dimensional spacetime.


17.10.1 M+N tetrad and extrinsic curvature


In an M+N split of spacetime, the tetrad-frame axes $\gamma_m$ at each point are split into two orthogonal sets,
of dimensions respectively N and M. Label the N tetrad axes $\gamma_z$ of the first set with late letters $z$, and the
M tetrad axes $\gamma_a$ of the second set with early letters $a$, and let mid letters $k l \ldots$ run over all indices. The
orthogonality of the tetrad axes from opposite sets is expressed by the MN conditions

$\gamma_a \cdot \gamma_z = 0$ . (17.190)
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In the M+WN split, the two orthogonal subspaces at each point are fixed a priori, which amounts to making 
a specific choice of gauge of the tetrad. The gauge-fixing fixes the two subspaces, but allows tetrad transfor- 
mations within each subspace. Under this restricted group of tetrad transformations, the tetrad connections 
Tazm With first two indices az from opposite subspaces form a tensor, the generalized extrinsic curvature 
Kazm; 


Kazm = Tazm = Ya" OmYz : (17.191) 


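These connections form a tensor under the restricted group because the only potentially non-tensorial
contribution to $\boldsymbol{\gamma}_a \cdot \partial_m \boldsymbol{\gamma}_z$ under a restricted tetrad transformation $\boldsymbol{\gamma}_z \rightarrow L_z{}^y \boldsymbol{\gamma}_y$ is

\[ \boldsymbol{\gamma}_a \cdot \boldsymbol{\gamma}_y \, \partial_m L_z{}^y = 0 , \tag{17.192} \]

which vanishes because $\boldsymbol{\gamma}_a$ and $\boldsymbol{\gamma}_y$ are orthogonal. There are MN(M + N) non-vanishing components of
the extrinsic curvature $K_{azm}$ (hence 12 if M = 3 and N = 1, or 16 if M = N = 2). The remaining tetrad
connections $\Gamma_{mnl}$, namely those with first two indices mn from the same subspace, constitute the restricted
connections $\hat\Gamma_{mnl}$,

\[ \hat\Gamma_{mnl} \equiv \Gamma_{mnl} \quad \text{for } mn = yz \text{ or } mn = ab . \tag{17.193} \]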
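
The component counts quoted above can be checked by direct enumeration. The following is a minimal sketch, not part of the original derivation: the function name `connection_counts` is hypothetical, and the only assumption is that the tetrad connections are antisymmetric in their first two indices, as holds when the tetrad metric is constant.

```python
from itertools import combinations

def connection_counts(M, N):
    """Count independent tetrad connections Gamma_{kmn} in an M+N split.

    Assumes Gamma_{kmn} is antisymmetric in its first two indices, so each
    unordered pair of distinct axes contributes one component per value of
    the third index.
    """
    dim = M + N
    # Label each axis by the subspace it belongs to: 'a' (M axes) or 'z' (N axes).
    axes = ['a'] * M + ['z'] * N
    extrinsic = 0   # first two indices from opposite subspaces: K_{azm}
    restricted = 0  # first two indices from the same subspace
    for i, j in combinations(range(dim), 2):
        if axes[i] != axes[j]:
            extrinsic += dim
        else:
            restricted += dim
    return extrinsic, restricted

for M, N in [(3, 1), (2, 2)]:
    k, g = connection_counts(M, N)
    print(f"{M}+{N} split: {k} extrinsic curvature components, {g} restricted connections")
```

For the 3+1 and 2+2 splits the enumeration gives 12 and 16 extrinsic curvature components respectively, matching MN(M + N), with 12 and 8 restricted connections making up the total of 24 connections in 4 dimensions.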


The vanishing of the mixed components $\gamma_{az}$ of the tetrad metric implies that the generalized extrinsic
curvature is antisymmetric in its first two indices,

\[ K_{zal} = - K_{azl} . \tag{17.194} \]

The vanishing components of $K_{mnl}$ and $\hat\Gamma_{mnl}$ are

\[ K_{abl} = K_{yzl} = 0 , \quad \hat\Gamma_{azl} = 0 . \tag{17.195} \]


17.10.2 M+N Riemann and Ricci tensors


The extrinsic curvature $K_{mnl}$ is a tensor under the restricted group of tetrad transformations. The restricted
Riemann curvature tensor $\hat R_{klaz}$ with its last two indices from opposite subspaces vanishes since $\hat\Gamma_{azl}$ vanishes,


piaz = kazi — Ol ye + ÊP Pee Tg + (TR OTe ei 0 (17.196) 


If torsion vanishes, then the full Riemann curvature tensor $R_{klmn}$ is symmetric in $kl \leftrightarrow mn$, but the restricted
Riemann tensor $\hat R_{klmn}$ is not symmetric. Thus the components $\hat R_{azkl}$ of the restricted Riemann curvature
do not vanish even though the components $\hat R_{klaz}$ do vanish.

In the M+N split, the expression (17.34) for the Riemann curvature tensor becomes


Fae = Êwsyz + Kp Kazu — BG Kanes (17.197a 
Reyaz = DrKazy Dy Knee + (Kgy KG) Bake (17.197b 
= Razay = Rigg + K, Keya — Ke, Keyz , (17.197c 
Riyaz = DoKazy Dy Kazo + Kg Kaze Baas (17.197d 
Rican = Dp Kase — DeKas + (Kẹ, — KS) Kaza (17.197e 
= Razve = Razoe + K£ Koca — Kg, Kscz , (17.197£ 
Rated = Ravea + K2,Kzaa — KZ, Kzab - (17.197¢ 


If the tetrad connections are replaced by their torsion-free expressions in terms of derivatives of the vierbein, 
then the various alternative expressions for the Riemann tensor become identities. The Ricci tensor Rkm is 


Ryz = Ry + (Da + Ka) K$, — D,K: — K? KS, , (17.198a 
Rza = Reva? + (Dy + Ky) KY, — Dz Ka — K! K}, (17.198b 
= Raz = Raye” + (Dy + Ky) Ko, — Da K- — KY,K?, (17.198¢ 
= D, KY, — D,K, + DK’, — Da K, — 2K%,K?, + K3, K? + KEKE, (17.198d 

Ras = Ray + (Dz + Kz) Ki, — Da Ko — KZ Ky . (17.1986 


Contracting the Ricci tensor yields the Ricci scalar R, 


R = R — 2D, K? — 2D, hea Kazo — he EE. (17.199 


17.11 2+2 split 


For the particular case of a 2+2 split, equations (17.198) for the Ricci tensor $R_{km}$ become


Rou = Rou — D, Ku + (Da a (17.200a 
Row = — D,K, + (Da + Ka) KS, HK, (17.200b 
Roy = Rop- + DoKopu — Du Kupu + Ky KY, — KY,K?, (17.200c 
= Ray =— Revou — De Kyo- + D_K 4,4 + Ky Kb, — KK, (17.200d 
= Dy Kopu — DuKoso — Dy Kyy— + D_Kyu4 — 2K! K}, + KY, KE, + KEKE, (17.2000 

R44 = (Dz + K.) Ki, — Dy Ky — Ki, KY, , (17.200f 
R4- = R4- + (Dz + K,)K2, — Dy K- — Ki, K", . (17.200g)


Singularity theorems


Singularity theorems prove that, given a number of plausible assumptions, general relativity commits suicide 
inside black holes. The conclusion that there are places, called singularities, inside black holes where the 
general relativistic description of spacetime fails is profound. It means that new physics, presumably quantum 
gravity in some form, must replace general relativity at singularities. Any viable theory of quantum gravity 
must be able to resolve the problem of singularities. 

The first singularity theorem was proved by Penrose (1965). The classic book by Hawking and Ellis (1973) 
lays out a variety of singularity theorems. As reviewed by Senovilla (1998), singularity theorems state that 
given: 

1. a trapped surface condition, 

2. a positive energy condition, 

3. a causality condition, 
then there exist geodesics that are incomplete, in the sense that the geodesics reach a point beyond which 
they cannot be continued. The power of singularity theorems is that they show that general relativity fails 
inside black holes. The weakness of singularity theorems is that they are quite unspecific about the nature 
or location of a “singularity.” 

This Chapter focuses on the principal ingredients of the singularity theorems, namely the Raychaudhuri 
equations, §18.2, and the construction of hypersurface-orthogonal congruences of geodesics, §§18.6 and 18.7. 
The Chapter concludes, §18.9, with a brief exposition of the original singularity theorem discovered by 
Penrose (1965). 


18.1 Congruences 


The Raychaudhuri equations govern the evolution of the extrinsic curvature along systems of paths called 
congruences, which fill, and do not cross or overlap in, at least some connected region of spacetime. 
Congruences may be timelike or null, and they may be geodesic or otherwise. Congruences are often defined 
with the restriction that the paths do not cross or overlap anywhere in spacetime, but in this book the more 
relaxed condition is imposed, that paths do not cross or overlap over some connected region.

A path is specified by its coordinates $x^\mu(\lambda)$ as a function of some parameter $\lambda$ along the path. The
derivative of the path defines the 4-velocity $u^\mu$ along the path,

\[ u^\mu \equiv \frac{dx^\mu}{d\lambda} . \tag{18.1} \]

If the congruence of paths is timelike, then the parameter $\lambda$ may be taken equal to the proper time $\tau$ along
the path. The 4-velocity $u^\mu = dx^\mu/d\tau$ then satisfies the normalization condition $u_\mu u^\mu = -1$. The 4-velocity
vector $\boldsymbol{u} = \boldsymbol{e}_\mu u^\mu$ defines the tetrad time vector $\boldsymbol{\gamma}_0$,

\[ \boldsymbol{\gamma}_0 \equiv \boldsymbol{u} . \tag{18.2} \]

The tetrad time vector $\boldsymbol{\gamma}_0$ is the unique future-pointing vector that is tangent to the timelike path and
normalized to $\boldsymbol{\gamma}_0 \cdot \boldsymbol{\gamma}_0 = -1$.

If the congruence of paths is null, then $\lambda$ may be any arbitrary parameter, not necessarily an affine
parameter. If the parameter $\lambda$ is an affine parameter, then the path is said to be affinely parameterized. The
4-velocity $u^\mu = dx^\mu/d\lambda$ satisfies the normalization condition $u_\mu u^\mu = 0$ regardless of whether the parameter
$\lambda$ is affine. The 4-velocity vector $\boldsymbol{u} = \boldsymbol{e}_\mu u^\mu$ defines the tetrad null vector $\boldsymbol{\gamma}_v$ (say),

\[ \boldsymbol{\gamma}_v \equiv \boldsymbol{u} . \tag{18.3} \]

Unlike the timelike case, the normalization condition $\boldsymbol{\gamma}_v \cdot \boldsymbol{\gamma}_v = 0$ does not determine uniquely the null vector
$\boldsymbol{\gamma}_v$.

For either a timelike or a null path, the 4-velocity $\boldsymbol{u} = u^m \boldsymbol{\gamma}_m$ has tetrad-frame components

\[ u^m = \{1, 0, 0, 0\} , \tag{18.4} \]

whose only non-vanishing component is $u^z = 1$, with index z = 0 for a timelike path, z = v for a null path.
The covariant derivative of the 4-velocity along the path is

\[ D_n u_m = \partial_n u_m - \Gamma^k{}_{mn} u_k = \Gamma_{mzn} . \tag{18.5} \]

The components for spatial m = a constitute by definition the generalized extrinsic curvature $K_{azn}$,
equation (17.191),

\[ D_n u_a = K_{azn} . \tag{18.6} \]

The 4-velocity along the path evolves as

\[ \frac{D u^m}{D\lambda} = u^n \partial_n u^m + \Gamma^m{}_{zz} = \Gamma^m{}_{zz} , \tag{18.7} \]

whose spatial components constitute the acceleration $K^a{}_{zz}$,

\[ \frac{D u^a}{D\lambda} = K^a{}_{zz} . \tag{18.8} \]

For a timelike geodesic (z = 0), the time component of the acceleration vanishes automatically,
$Du^0/D\lambda = \Gamma^0{}_{00} = 0$. For a null geodesic (z = v), the v-component of the acceleration
$Du^v/D\lambda = \Gamma^v{}_{vv} = -\Gamma_{uvv}$ vanishes if the path is affinely parameterized, but not in general. If the null
path is affinely parameterized, then the 4-velocity $u^m$ coincides (up to a constant factor) with the momentum $p^m$
along the path. Choosing the path to be affinely parameterized amounts to choosing the null vector $\boldsymbol{\gamma}_v$ such that
the momentum $p^v$ is constant along the null geodesic, that is, a light ray is neither redshifted nor blueshifted
as it propagates along the affinely parameterized path.

The covariant divergence of the 4-velocity is

\[ D_m u^m = \Gamma^z{}_{zz} + K_z . \tag{18.9} \]

For a timelike congruence, the covariant divergence is just the trace $K \equiv K_0 \equiv K^a{}_{0a}$ of the extrinsic curvature.
For a null congruence, the covariant divergence is the acceleration $\Gamma^v{}_{vv}$ plus the trace $K_v \equiv K^a{}_{va}$ of the
extrinsic curvature. If the null path is affinely parameterized, then the covariant divergence is just the trace $K_v$.


18.2 Raychaudhuri equations 


The Raychaudhuri equations, which in their most general form are equations (18.10), govern the evolution 
of the extrinsic curvature along arbitrary timelike or null congruences. Actually, the equation traditionally 
named after Raychaudhuri (1955) is the equation for the evolution of the trace of the extrinsic curvature. 
Here however the full suite of equations for the components of the extrinsic curvature are called Raychaudhuri 
equations. 

The Raychaudhuri equations come in various flavours, depending on whether the congruence is timelike or 
null, whether the congruence is geodesic, and what additional gauge conditions are imposed on the tetrad. 
If the congruence is timelike, it is convenient to take the tetrad to be orthonormal, with the time axis $\boldsymbol{\gamma}_0$
tangent to the timelike paths, equation (18.2). If the congruence is null, it is convenient to take the tetrad
to be Newman-Penrose, that is, a double-null tetrad, with the null axis $\boldsymbol{\gamma}_v$ tangent to the null paths. To
cover both timelike and null cases at the same time, denote the tangent axis by $\boldsymbol{\gamma}_z$, with index z = 0 in the
timelike case, and z = v in the null case.

The Raychaudhuri equations are just a subset of the equations (17.197) for the Riemann tensor in an
M+N split of spacetime, §17.10. In 4 spacetime dimensions, the split is 3+1 for a timelike congruence, and
2+2 for a null congruence. In an M+N split of spacetime, the Raychaudhuri equations are the equations for
the components $R_{bzaz}$ of the Riemann tensor, equation (17.197d),


\[ D_z K_{azb} - D_b K_{azz} - K^y{}_{bz} K_{azy} + K^c{}_{zb} K_{azc} = - R_{bzaz} \quad \text{(no sum over } z\text{)} , \tag{18.10} \]


with z = 0 for a timelike congruence, or z = v for a null congruence. Equation (18.10) is to be interpreted
as an equation governing the evolution of the extrinsic curvature $K_{azb}$ along any path of the congruence,
that is, along the z-direction. The evolution depends on the Riemann curvature $R_{bzaz}$ encountered along the
path.

The left hand side of equation (18.10) also depends on a derivative of the spatial acceleration $K_{azz}$. A
necessary and sufficient condition for the congruence to be geodesic is that the spatial acceleration vanishes,

\[ K_{azz} = 0 . \tag{18.11} \]

For a geodesic congruence (not necessarily affinely parameterized), the Raychaudhuri equation (18.10)
becomes


\[ D_z K_{azb} - K^y{}_{bz} K_{azy} + K^c{}_{zb} K_{azc} = - R_{bzaz} \quad \text{(no sum over } z\text{)} . \tag{18.12} \]


If the congruence is geodesic, then the tetrad can be chosen to be parallel-transported along each path
of the congruence. In this case all the components of the tetrad-frame connection with final index z vanish,

\[ \Gamma_{klz} = 0 . \tag{18.13} \]

The conditions (18.13) exhaust all the 6 degrees of freedom of Lorentz transformations of the tetrad. In this
case the restricted covariant derivative $D_z$ in the Raychaudhuri equation (18.12) reduces to the directed
derivative $\partial_z$, and the equation becomes


\[ \partial_z K_{azb} - K^y{}_{bz} K_{azy} + K^c{}_{zb} K_{azc} = - R_{bzaz} \quad \text{(no sum over } z\text{)} . \tag{18.14} \]


18.3 Raychaudhuri equations for a timelike geodesic congruence 


For a congruence of timelike paths, the extrinsic curvature is the spatial tensor $K_{ab} \equiv K_{a0b} \equiv \Gamma_{a0b}$. If the
timelike paths are geodesic, then the acceleration $K_a \equiv K_{a00}$ vanishes. Along a timelike geodesic congruence,
the Raychaudhuri equations (18.12) become

\[ D_0 K_{ab} + K^c{}_b K_{ac} = - R_{b0a0} . \tag{18.15} \]


In 4-dimensional spacetime, the 9 components of the extrinsic curvature $K_{ab}$ are commonly resolved into
an expansion scalar $\vartheta$, a 3-component antisymmetric vorticity tensor $\varpi_{ab}$, and a 5-component traceless
symmetric shear tensor $\sigma_{ab}$,

\[ K_{ab} = \delta_{ab} \vartheta + \varpi_{ab} + \sigma_{ab} . \tag{18.16} \]

Like the extrinsic curvature, the expansion, vorticity, and shear are restricted tensors, that is, tensors with
respect to the restricted group of spatial Lorentz transformations. The trace of the extrinsic curvature is three
times the expansion, $K \equiv K^a{}_a = 3\vartheta$. The vorticity is sometimes referred to alternatively as the rotation, or
the twist. If desired, the vorticity can be written $\varpi_{ab} = \varepsilon_{abc} \varpi^c$.
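
The decomposition (18.16) is an ordinary split of a 3×3 array into its trace, antisymmetric, and traceless symmetric parts. The following is a minimal numerical sketch, assuming an orthonormal spatial tetrad (so that the spatial tetrad metric is the unit matrix); the function name `decompose_extrinsic_curvature` and the sample array are hypothetical, chosen only for illustration.

```python
import numpy as np

def decompose_extrinsic_curvature(K):
    """Split a 3x3 array K_ab into expansion, vorticity, and shear as in (18.16)."""
    K = np.asarray(K, dtype=float)
    expansion = np.trace(K) / 3.0                     # trace K = 3 * expansion
    vorticity = 0.5 * (K - K.T)                       # antisymmetric part, 3 components
    shear = 0.5 * (K + K.T) - expansion * np.eye(3)   # traceless symmetric part, 5 components
    return expansion, vorticity, shear

# Arbitrary sample extrinsic curvature (illustrative numbers only).
K = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 4.0],
              [5.0, 0.0, 2.0]])
theta, varpi, sigma = decompose_extrinsic_curvature(K)

# The three pieces recombine into the original array.
assert np.allclose(theta * np.eye(3) + varpi + sigma, K)
print("expansion:", theta)
```

The 1 + 3 + 5 independent components of the three pieces account for all 9 components of the extrinsic curvature.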

If one imagines comoving coordinates attached to the congruence of paths, then the extrinsic curvature
describes the rate at which the comoving volume element distorts, equation (18.5). The expansion $\vartheta$ equals
one third the logarithmic rate of change of the volume of the comoving volume element, the vorticity is
the rate at which the comoving volume element rotates (see §18.6), and the shear is the rate at which the
comoving volume element distorts tidally.

To see that the expansion measures the logarithmic rate of change of the volume, choose comoving
coordinates consisting of the proper time $\tau$ along with 3 spatial coordinates $x^\alpha$ that remain constant along the
geodesics of the congruence. The comoving coordinate 4-velocity along geodesics is $u^\mu = \{1, 0, 0, 0\}$. The
inverse vierbein satisfies $e_0{}^\mu = u^\mu = \{1, 0, 0, 0\}$, so the determinant e of the full vierbein reduces to the
determinant of its spatial part, $e \equiv |e^m{}_\mu| = |e^a{}_\alpha|$. The trace $K \equiv K^a{}_{0a} = K_0$ equals the covariant divergence
$D_m u^m$, equation (18.9). The expansion $\vartheta$ thus satisfies

\[ \vartheta = \frac{1}{3} K = \frac{1}{3} D_m u^m = \frac{1}{3 \sqrt{g}} \frac{\partial ( \sqrt{g} \, u^\mu )}{\partial x^\mu} = \frac{1}{3 \sqrt{g}} \frac{\partial \sqrt{g}}{\partial \tau} = \frac{1}{3} \frac{d \ln \sqrt{g}}{d\tau} , \tag{18.17} \]

where $\sqrt{g} = e$ is the square root of the determinant of the spatial metric of the comoving line-element, which
is the same as the determinant e of the vierbein.
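
As an illustration of the statement that the expansion is one third the logarithmic rate of change of the comoving volume, consider an example assumed here and not taken from the text above: the comoving congruence of a spatially flat Friedmann-Lemaître-Robertson-Walker line-element with cosmic scale factor $a(\tau)$,

\[ ds^2 = - d\tau^2 + a(\tau)^2 \, \delta_{\alpha\beta} \, dx^\alpha dx^\beta . \]

The spatial metric has determinant $g = a^6$, so $\sqrt{g} = a^3$, and the expansion is

\[ \vartheta = \frac{1}{3} \frac{d \ln \sqrt{g}}{d\tau} = \frac{1}{3} \frac{d \ln a^3}{d\tau} = \frac{1}{a} \frac{da}{d\tau} , \]

which is the Hubble parameter. With the convention used here, in which the trace of the extrinsic curvature is three times the expansion, $K = 3\vartheta$ is three times the Hubble parameter.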

In the ADM formalism, the tetrad time vector $\boldsymbol{\gamma}_0$ is chosen to be orthogonal to hypersurfaces of constant
time t. If $\boldsymbol{\gamma}_0$ is so chosen, and if torsion vanishes as general relativity assumes, then vorticity $\varpi_{ab}$ vanishes,
as shown in §18.6, equation (18.38). This explains why in the ADM formalism the extrinsic curvature $K_{ab}$
is symmetric in ab. The paths of an ADM congruence are vorticity-free, but not necessarily geodesic. They
are geodesic if and only if the lapse $\alpha$ is constant, equation (18.38). In the ADM formalism, the expansion
satisfies equation (17.60), which reduces to equation (18.17) if the lapse is unity and the shift vanishes, that
is, if the spatial coordinates are comoving and the time coordinate t is the proper time $\tau$.

Not all congruences are hypersurface-orthogonal, so vorticity does not vanish in general. For example, if 
a congruence is chosen to follow the worldlines of a system of dust particles (dust particles being neutral 
and collisionless, to ensure that they follow geodesics), then the vorticity, which is related to the angular 
momentum of the system of particles, will generically be non-zero. 

The vorticity $\varpi_{ab} \equiv K_{[ab]}$, the antisymmetric part of the extrinsic curvature $\Gamma_{a0b}$, should be distinguished
from the precession $\Gamma_{[ab]0}$ (if the tetrad metric $\gamma_{ab}$ is constant, as here, then $\Gamma_{ab0}$ is automatically
antisymmetric in ab; in the more general case where the tetrad metric is non-constant, as in ADM, §17.2.1, the
precession equals the antisymmetric part of $\Gamma_{ab0}$). The condition for the tetrad frame to be locally inertial,
that is, freely falling and non-rotating, is that the acceleration and precession vanish, $\Gamma_{a00} = \Gamma_{[ab]0} = 0$.
By a suitable spatial rotation of the tetrad (which rotates the spatial axes $\boldsymbol{\gamma}_a$ while leaving the time axis
$\boldsymbol{\gamma}_0$ unchanged) the precession $\Gamma_{[ab]0}$ can be arranged to vanish along a congruence. Whereas the precession
describes the spatial rotation of the tetrad frame with respect to a locally inertial frame, the vorticity is related to
the angular momentum of particles following the congruence. Since the extrinsic curvature is a spatial tensor,
if the vorticity vanishes in one frame, then it vanishes in any spatially rotated frame; and conversely if the
vorticity is non-vanishing in one frame, then it is non-vanishing in any spatially rotated frame.

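The Raychaudhuri equations (18.15) for the expansion, vorticity, and shear along a timelike geodesic
congruence are

\[ D_0 \vartheta + \vartheta^2 + \tfrac{1}{3} \sigma^{ab} \sigma_{ab} - \tfrac{1}{3} \varpi^{ab} \varpi_{ab} = - \tfrac{1}{3} R_{00} , \tag{18.18a} \]
\[ ( D_0 + 2 \vartheta ) \varpi_{ab} + \sigma^c{}_a \varpi_{cb} - \sigma^c{}_b \varpi_{ca} = 0 , \tag{18.18b} \]
\[ ( D_0 + 2 \vartheta ) \sigma_{ab} + \left( \sigma^c{}_a \sigma_{cb} - \tfrac{1}{3} \delta_{ab} \sigma^{cd} \sigma_{cd} \right) - \left( \varpi^c{}_a \varpi_{cb} - \tfrac{1}{3} \delta_{ab} \varpi^{cd} \varpi_{cd} \right) = - C_{0a0b} , \tag{18.18c} \]

where $C_{klmn}$ is the Weyl tensor, the traceless part of the Riemann tensor. The restricted derivatives in
equations (18.18) are

\[ D_0 \vartheta = \partial_0 \vartheta , \tag{18.19a} \]
\[ D_0 \varpi_{ab} = \partial_0 \varpi_{ab} - \Gamma^c{}_{a0} \varpi_{cb} - \Gamma^c{}_{b0} \varpi_{ac} , \tag{18.19b} \]
\[ D_0 \sigma_{ab} = \partial_0 \sigma_{ab} - \Gamma^c{}_{a0} \sigma_{cb} - \Gamma^c{}_{b0} \sigma_{ac} . \tag{18.19c} \]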
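
As a consistency check, the expansion equation (18.18a) can be recovered by taking the trace of the Raychaudhuri equation (18.15). The following short derivation is a sketch that assumes the forms $D_0 K_{ab} + K^c{}_b K_{ac} = - R_{b0a0}$ and $K_{ab} = \delta_{ab} \vartheta + \varpi_{ab} + \sigma_{ab}$, an orthonormal spatial tetrad, and vanishing torsion. Contracting with $\delta^{ab}$, the derivative term gives $3 D_0 \vartheta$, and the quadratic term gives

\[ K^{ca} K_{ac} = 3 \vartheta^2 + \varpi^{ca} \varpi_{ac} + \sigma^{ca} \sigma_{ac} = 3 \vartheta^2 - \varpi^{ab} \varpi_{ab} + \sigma^{ab} \sigma_{ab} , \]

where the cross terms vanish because $\varpi_{ab}$ and $\sigma_{ab}$ are traceless and the contraction of an antisymmetric with a symmetric tensor is zero, and the sign of the vorticity term follows from $\varpi_{ac} = - \varpi_{ca}$. On the right hand side, $\delta^{ab} R_{b0a0} = R_{00}$, since the component $R_{0000}$ vanishes by antisymmetry. Altogether

\[ 3 D_0 \vartheta + 3 \vartheta^2 + \sigma^{ab} \sigma_{ab} - \varpi^{ab} \varpi_{ab} = - R_{00} , \]

which is three times equation (18.18a). The antisymmetric and traceless symmetric parts of the same equation yield the vorticity and shear equations in the same way.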


If the tetrad is chosen to be parallel-transported along the geodesic, then all 6 of the tetrad connections
with final index 0 vanish,

\[ \Gamma_{kl0} = 0 , \tag{18.20} \]

including not only the 3 components $K_a \equiv K_{a00}$ of the acceleration, but also the 3 components $\Gamma_{ab0}$ of the
precession. In this case, the restricted covariant time derivative simplifies to the directed time derivative,
which is the same as the proper time derivative $d/d\tau$ in the parallel-transported frame,

\[ D_0 = \partial_0 = \frac{d}{d\tau} . \tag{18.21} \]
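
In the parallel-transported frame the expansion equation becomes an ordinary differential equation in proper time, which makes the focusing behaviour easy to see numerically. The following is a minimal sketch under simplifying assumptions made here purely for illustration: shear, vorticity, and $R_{00}$ are all set to zero, so that $d\vartheta/d\tau = -\vartheta^2$, whose exact solution is $\vartheta = \vartheta_0 / (1 + \vartheta_0 \tau)$. The function names are hypothetical.

```python
def expansion_rhs(theta, shear_sq=0.0, vorticity_sq=0.0, ricci_00=0.0):
    """Right-hand side of d(theta)/d(tau) from the expansion equation (18.18a).

    shear_sq stands for sigma^{ab} sigma_{ab}, vorticity_sq for
    varpi^{ab} varpi_{ab}, and ricci_00 for R_00; all default to zero here.
    """
    return -theta**2 - shear_sq / 3.0 + vorticity_sq / 3.0 - ricci_00 / 3.0

def integrate_expansion(theta0, tau_end, steps=10000):
    """Integrate d(theta)/d(tau) = expansion_rhs(theta) with classical RK4."""
    h = tau_end / steps
    theta = theta0
    for _ in range(steps):
        k1 = expansion_rhs(theta)
        k2 = expansion_rhs(theta + 0.5 * h * k1)
        k3 = expansion_rhs(theta + 0.5 * h * k2)
        k4 = expansion_rhs(theta + h * k3)
        theta += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return theta

theta0 = -1.0    # initially converging congruence
tau_end = 0.9    # just short of the divergence at tau = -1/theta0 = 1
numeric = integrate_expansion(theta0, tau_end)
exact = theta0 / (1.0 + theta0 * tau_end)
print(numeric, exact)
```

An initially converging congruence ($\vartheta_0 < 0$) reaches $\vartheta \rightarrow -\infty$ at the finite proper time $\tau = -1/\vartheta_0$. Non-zero shear or positive $R_{00}$ only make the right hand side more negative, so in this vorticity-free case they can only hasten the divergence.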


Exercise 18.1. Raychaudhuri equations for a non-geodesic timelike congruence. Derive the
Raychaudhuri equations for a timelike congruence that is not geodesic.
Solution. The Raychaudhuri equations for a timelike congruence including non-vanishing acceleration $K_a$
are

\[ D_0 \vartheta + \vartheta^2 + \tfrac{1}{3} \sigma^{ab} \sigma_{ab} - \tfrac{1}{3} \varpi^{ab} \varpi_{ab} - \tfrac{1}{3} D^a K_a - \tfrac{1}{3} K^a K_a = - \tfrac{1}{3} R_{00} , \tag{18.22a} \]
\[ ( D_0 + 2 \vartheta ) \varpi_{ab} + \sigma^c{}_a \varpi_{cb} - \sigma^c{}_b \varpi_{ca} + \tfrac{1}{2} ( D_a K_b - D_b K_a ) = 0 , \tag{18.22b} \]
\[ ( D_0 + 2 \vartheta ) \sigma_{ab} + \sigma^c{}_a \sigma_{cb} - \varpi^c{}_a \varpi_{cb} - \tfrac{1}{2} ( D_a K_b + D_b K_a ) - K_a K_b - \tfrac{1}{3} \delta_{ab} \left( \sigma^{cd} \sigma_{cd} - \varpi^{cd} \varpi_{cd} - D^c K_c - K^c K_c \right) = - C_{0a0b} , \tag{18.22c} \]

with the restricted covariant derivatives given by equations (18.19). If the acceleration is the gradient of a
potential, $K_a = \partial_a \ln \alpha$, and if torsion vanishes as general relativity assumes, then $D_a K_b - D_b K_a = 0$, and
vorticity vanishes if it vanishes initially. This is the situation imposed in the ADM formalism. If on the other
hand the acceleration takes a more general form, then vorticity may be generated along the path.


18.4 Raychaudhuri equations for a null geodesic congruence 


For a null congruence in 4-dimensional spacetime, it is convenient to work with a Newman-Penrose
double-null tetrad $\{\boldsymbol{\gamma}_v, \boldsymbol{\gamma}_u, \boldsymbol{\gamma}_+, \boldsymbol{\gamma}_-\}$, with two null directions at each point, an “outgoing” direction $\boldsymbol{\gamma}_v$, and an
“ingoing” direction $\boldsymbol{\gamma}_u$. The spin axes $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ span the two-dimensional spatial plane orthogonal to the
null directions. Late latin indices z, y, ... run over null indices v, u, early latin indices a, b, ... run over spin
indices +, −, and mid latin indices k, l, ... run over all four indices.

The extrinsic curvature constitutes the components $K_{azk} \equiv \Gamma_{azk}$ of the tetrad-frame connections with
first two indices az from opposite subspaces, equation (17.191). If the null congruence along the outgoing
v-direction is geodesic, then the acceleration $K_{avv}$ vanishes. Along outgoing null geodesics, the Raychaudhuri
equations (18.12) are

\[ D_v K_{avb} + K^c{}_{vb} K_{avc} = - R_{bvav} . \tag{18.23} \]


The condition $K_{avv} = 0$ that the outgoing null directions of the congruence be geodesic fixes 2 of the
6 degrees of freedom of Lorentz transformations of the tetrad. Additional convenient gauge choices can be
imposed. A common choice is to impose sufficient conditions that the restricted covariant derivative $D_v$ in
the Raychaudhuri equation (18.23) reduces to the directed derivative $\partial_v$. This requires that the null axis
$\boldsymbol{\gamma}_v$ and the 2 spatial axes $\boldsymbol{\gamma}_\pm$ (but not the null axis $\boldsymbol{\gamma}_u$) of the tetrad be parallel-transported along the null
geodesic congruence. Parallel-transport of $\boldsymbol{\gamma}_v$ and $\boldsymbol{\gamma}_\pm$ amounts to imposing that 4 of the 6 tetrad connections
vanish,

\[ \Gamma_{uvv} = \Gamma_{+-v} = K_{+vv} = K_{-vv} = 0 . \tag{18.24} \]

The condition $\Gamma_{uvv} = 0$ is the condition that the geodesics along $\boldsymbol{\gamma}_v$ be affinely parameterized, while the
condition $\Gamma_{+-v} = 0$ is the condition that the spatial axes $\boldsymbol{\gamma}_\pm$ do not rotate in the parallel-transported frame.
Under the conditions (18.24), the restricted covariant derivative in the Raychaudhuri equation (18.23) equals
a derivative with respect to an affine parameter $\lambda$ along the null geodesic,

\[ D_v = \partial_v = \frac{dx^\mu}{d\lambda} \frac{\partial}{\partial x^\mu} = \frac{d}{d\lambda} . \tag{18.25} \]

Other gauge choices can be made. A natural choice is to choose the tetrad so that both outgoing and ingoing
null directions are geodesic. For example, the principal null directions of an ideal black hole are geodesic (the
tetrad that aligns with the principal null directions is the Boyer-Lindquist tetrad). The condition that the
outgoing and ingoing null directions be geodesic translates into the condition that $K_{azz} = 0$, or explicitly
the 4 conditions

\[ K_{+vv} = K_{-vv} = K_{+uu} = K_{-uu} = 0 . \tag{18.26} \]

If the ingoing null direction is geodesic, then the Raychaudhuri equations along the ingoing null geodesic
are the same as equations (18.23) with null indices swapped, $v \leftrightarrow u$. By a suitable Lorentz boost in the
$\boldsymbol{\gamma}_v$-$\boldsymbol{\gamma}_u$ plane, it is always possible to arrange that the tetrad frame is affinely parameterized in either the $\boldsymbol{\gamma}_v$
or the $\boldsymbol{\gamma}_u$ direction (that is, either $\Gamma_{uvv}$ or $\Gamma_{vuu}$ vanishes), but in general it is not possible to arrange that
both null directions are affinely parameterized. Similarly, by a suitable spatial rotation in the $\boldsymbol{\gamma}_+$-$\boldsymbol{\gamma}_-$ plane,
it is always possible to arrange that the spatial axes are parallel-transported along either the $\boldsymbol{\gamma}_v$ or the $\boldsymbol{\gamma}_u$
direction (that is, either $\Gamma_{+-v}$ or $\Gamma_{+-u}$ vanishes), but in general it is not possible to arrange that the spatial
axes are parallel-transported along both null directions.

The Raychaudhuri equations (18.23) are equations governing the evolution of the extrinsic curvatures
$K_{avb}$ with middle index the null direction v, and outer indices ab spin indices. Analogously to the 3+1
decomposition (18.16), these 4 components are commonly decomposed into an expansion scalar $\vartheta$, an
antisymmetric vorticity tensor $\varpi_{ab} \equiv \varepsilon_{ab} \varpi$, and a traceless symmetric shear tensor $\sigma_{ab}$,

\[ K_{avb} = \gamma_{ab} \vartheta + \varepsilon_{ab} \varpi + \sigma_{ab} . \tag{18.27} \]

Like the extrinsic curvature, the expansion, vorticity, and shear are restricted tensors. As usual in the
Newman-Penrose formalism, complex conjugation flips the spin indices on any tensor, $+ \leftrightarrow -$, a consequence
of the fact that the Newman-Penrose spin axes $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ are complex conjugates of each other. The totally
antisymmetric tensor $\varepsilon_{ab}$ in 2-dimensional spin space flips sign under complex conjugation, so is purely
imaginary, $\varepsilon_{+-} = i$. The expansion and vorticity scalars $\vartheta$ and $\varpi$ are both real. The shear is complex, with
two components that are complex conjugates of each other, $\sigma_{--} = \sigma_{++}^*$.

Just as the timelike expansion equals one third the logarithmic rate of change of the comoving volume
element along a timelike congruence, equation (18.17), so also the null expansion equals one half the
logarithmic rate of change of the comoving area element along a null congruence. First, notice that along an
outgoing null congruence, the ingoing $\boldsymbol{\gamma}_u$ component of the tetrad-frame covariant divergence $D_m u^m$
vanishes, $D_u u^u = \partial_u u^u + \Gamma^u{}_{vu} u^v = - \Gamma_{vvu} = 0$ (no sum over u or v). Therefore the covariant divergence equals
the tetrad-frame covariant divergence restricted to the 3-dimensional hypersurface spanned by the outgoing
geodesic direction $\boldsymbol{\gamma}_v$ and the spatial directions $\boldsymbol{\gamma}_\pm$. Such a 3-dimensional hypersurface can be constructed
by starting with any spatial 2-surface and projecting “outgoing” null geodesics not necessarily orthogonally
from it. Choose comoving coordinates along the null hypersurface consisting of the affine parameter $\lambda$ along
with 2 spatial coordinates $x^\alpha$ that remain constant along the geodesics of the congruence. The coordinate
3-velocity within the null hypersurface is $u^\mu \equiv dx^\mu/d\lambda = \{1, 0, 0\}$. Then analogously to equation (18.17)
the null expansion satisfies, from equation (18.9) with $\Gamma^v{}_{vv} = 0$ because the congruence is being taken to be
affinely parameterized,

\[ \vartheta = \frac{1}{2} K_v = \frac{1}{2} D_m u^m = \frac{1}{2 \sqrt{g}} \frac{\partial ( \sqrt{g} \, u^\mu )}{\partial x^\mu} = \frac{1}{2 \sqrt{g}} \frac{\partial \sqrt{g}}{\partial \lambda} = \frac{1}{2} \frac{d \ln \sqrt{g}}{d\lambda} , \tag{18.28} \]

where g is the determinant of the 2-dimensional spatial metric of the comoving line-element. Thus the null
expansion $\vartheta$ equals one half the logarithmic rate of change of the cross-sectional area of the comoving area
element.
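
As an illustration of the null expansion as half the logarithmic rate of change of area, consider an example assumed here and not taken from the text above: the outgoing light cone of a point in Minkowski space. The null geodesics are radial, and the affine parameter $\lambda$ can be taken equal to the radius r. In polar coordinates $\theta$, $\phi$, which are constant along each ray, the cross-section at affine parameter $\lambda$ is a sphere with 2-dimensional metric $\lambda^2 ( d\theta^2 + \sin^2\!\theta \, d\phi^2 )$, so $\sqrt{g} = \lambda^2 \sin\theta$, and

\[ \vartheta = \frac{1}{2} \frac{d \ln \sqrt{g}}{d\lambda} = \frac{1}{2} \frac{d \ln \lambda^2}{d\lambda} = \frac{1}{\lambda} . \]

The expansion is positive, as expected for diverging rays, and it diverges at the vertex $\lambda = 0$ of the cone, where all the rays of the congruence cross.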

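In terms of the expansion, vorticity, and shear, the Raychaudhuri equations (18.23) along the outgoing
null geodesic direction v are

\[ ( D_v + \vartheta ) \vartheta - \varpi^2 + \sigma_{++} \sigma_{++}^* = - 4 \pi T_{vv} , \tag{18.29a} \]
\[ ( D_v + 2 \vartheta ) \varpi = 0 , \tag{18.29b} \]
\[ ( D_v + 2 \vartheta ) \sigma_{++} = - C_{v+v+} . \tag{18.29c} \]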


The restricted covariant derivatives in equations (18.29) are

\[ D_v \vartheta = ( \partial_v + \Gamma_{uvv} ) \vartheta , \tag{18.30a} \]
\[ D_v \varpi = ( \partial_v + \Gamma_{uvv} ) \varpi , \tag{18.30b} \]
\[ D_v \sigma_{++} = ( \partial_v + \Gamma_{uvv} + 2 \Gamma_{+-v} ) \sigma_{++} . \tag{18.30c} \]

[Figure 18.1 panels: expansion $\vartheta$, vorticity $\varpi$, shear $\sigma$]

Figure 18.1 Illustrating how the Sachs optical coefficients, the expansion $\vartheta$, the vorticity $\varpi$, and the shear $\sigma$,
characterize the rate at which a congruence of light rays changes shape as it propagates. The congruence of light rays is
coming vertically upward out of the paper.


18.5 Sachs optical coefficients 


If the null axis $\boldsymbol{\gamma}_v$ and the two spatial axes $\boldsymbol{\gamma}_\pm$ are taken to be parallel-transported along the null geodesic
directions $\boldsymbol{\gamma}_v$ of the congruence, then the tetrad connections $\Gamma_{uvv}$ and $\Gamma_{-+v}$ in equations (18.29) vanish.
In this case the expansion $\vartheta$, vorticity $\varpi$, and the complex shear $\sigma \equiv \sigma_{++}$ are commonly called the Sachs
optical coefficients (Sachs, 1961), often referred to as Sachs scalars. The Raychaudhuri equations (18.29)
simplify to

\[ ( \partial_v + \vartheta ) \vartheta - \varpi^2 + \sigma \sigma^* = - 4 \pi T_{vv} , \tag{18.31a} \]
\[ ( \partial_v + 2 \vartheta ) \varpi = 0 , \tag{18.31b} \]
\[ ( \partial_v + 2 \vartheta ) \sigma = - C_{v+v+} . \tag{18.31c} \]

The directed derivative $\partial_v$ equals a derivative $d/d\lambda$ with respect to an affine parameter along the geodesic
directions, equation (18.25).
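
A simple special case shows how the expansion equation leads to focusing. The following sketch assumes, for illustration only, that vorticity and shear vanish and that the congruence propagates through empty, Weyl-flat space, so that $T_{vv} = 0$ and $C_{v+v+} = 0$. Vanishing vorticity and shear are then preserved by the vorticity and shear equations, and the expansion equation reduces to

\[ \frac{d\vartheta}{d\lambda} + \vartheta^2 = 0 \quad \Longrightarrow \quad \vartheta(\lambda) = \frac{\vartheta_0}{1 + \vartheta_0 \lambda} , \]

where $\vartheta_0$ is the expansion at $\lambda = 0$. For initially diverging rays, $\vartheta_0 > 0$, the expansion decays as $1/\lambda$ at large $\lambda$, the behaviour of rays spreading from a point. For initially converging rays, $\vartheta_0 < 0$, the expansion diverges to $-\infty$ at the finite affine parameter $\lambda = -1/\vartheta_0$, where the rays cross. Since $\sigma \sigma^* \geq 0$, and $T_{vv} \geq 0$ whenever a positive energy condition holds, restoring shear or matter can only make $d\vartheta/d\lambda$ more negative in the vorticity-free case.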

The Sachs coefficients characterize how the shape of the congruence of light rays evolves as it propagates, 
as illustrated in Figure 18.1. The expansion represents how fast the congruence expands, the vorticity how 
fast it rotates, and the shear how fast its ellipticity is changing. The amplitude and phase of the complex 
shear represent the amplitude and phase of the major axis of the shear ellipse. 


Concept question 18.2. Can vorticity be non-zero while shear vanishes? Answer. Yes. The principal
null congruences of the Λ-Kerr-Newman geometry provide an example of congruences that have non-zero
vorticity but are shear-free, Exercise 23.11.


18.6 Hypersurface-orthogonality for a timelike congruence 


Singularity theorems consider special congruences that are both geodesic and vorticity-free. The
Raychaudhuri equation (18.18b) guarantees that if the vorticity $\varpi_{ab}$ vanishes on the initial 3-dimensional hypersurface
of a timelike geodesic congruence, then the vorticity will vanish identically everywhere along the congruence.
This section shows that a timelike congruence is geodesic and vorticity-free if and only if it is
hypersurface-orthogonal, that is, the 4-velocity $\boldsymbol{u}$ along the paths is normal to some hypersurface, equations (18.33),
which proves to be a hypersurface of constant proper time, or equivalently of constant action. The next
subsection, §18.6.1, shows how to construct a timelike hypersurface-orthogonal congruence.

The covariant curl of the 4-velocity $\boldsymbol{u} = \boldsymbol{\gamma}_0$ of a congruence of timelike paths is

\[ \boldsymbol{D} \wedge \boldsymbol{u} = \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n ( \partial_m u_n - \Gamma^k{}_{nm} u_k ) = \boldsymbol{\gamma}^m \wedge \boldsymbol{\gamma}^n \, \Gamma_{n0m} = \boldsymbol{\gamma}^0 \wedge \boldsymbol{\gamma}^a K_a - \boldsymbol{\gamma}^a \wedge \boldsymbol{\gamma}^b \varpi_{ab} . \tag{18.32} \]

The covariant curl is a 6-component bivector whose 3 time-space parts are the acceleration $K_a$, and whose
3 space-space parts are the vorticity $\varpi_{ab}$.

Equation (18.32) shows that the covariant curl $\boldsymbol{D} \wedge \boldsymbol{u}$ vanishes if and only if both the acceleration $K_a$ and
the vorticity $\varpi_{ab}$ vanish. If the curl vanishes, and if torsion vanishes, then by Poincaré’s lemma the 4-velocity
$\boldsymbol{u}$ is, at least locally, the gradient of a scalar $\tau$,

\[ \boldsymbol{D} \wedge \boldsymbol{u} = 0 \quad \Leftrightarrow \quad \boldsymbol{u} = - \boldsymbol{\partial} \tau . \tag{18.33} \]


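The scalar $\tau$ is just the proper time along the geodesics, as follows from

\[ \boldsymbol{u} \cdot \boldsymbol{u} = - \boldsymbol{u} \cdot \boldsymbol{\partial} \tau = - \frac{dx^\mu}{d\tau} \frac{\partial \tau}{\partial x^\mu} = -1 . \tag{18.34} \]

Thus the 4-velocity $\boldsymbol{u}$ is normal to 3-dimensional hypersurfaces of constant proper time $\tau$.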


The action S of a freely-falling particle of non-zero mass m is related to the proper time along the particle’s
worldline by, equation (4.7),

\[ S = - m \tau . \tag{18.35} \]

Thus the hypersurfaces of a hypersurface-orthogonal timelike congruence are also hypersurfaces of constant
action for massive, freely-falling particles. The covariant momentum $p_\mu = m u_\mu$ of the particle is the gradient
of the action, equation (4.105),

\[ p_\mu = \frac{\partial S}{\partial x^\mu} , \tag{18.36} \]

which reproduces the result $\boldsymbol{u} = - \boldsymbol{\partial} \tau$.
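A weaker condition than the vanishing of $\boldsymbol{D} \wedge \boldsymbol{u}$ is that the curl $\boldsymbol{D} \wedge ( \boldsymbol{u} / \alpha )$ of the 4-velocity scaled by some
arbitrary factor $\alpha$ vanishes. The covariant curl of the scaled 4-velocity $\boldsymbol{u} / \alpha$ is

\[ \alpha \, \boldsymbol{D} \wedge ( \boldsymbol{u} / \alpha ) = \boldsymbol{D} \wedge \boldsymbol{u} + \boldsymbol{u} \wedge \boldsymbol{\partial} \ln \alpha = \boldsymbol{\gamma}^0 \wedge \boldsymbol{\gamma}^a ( K_a - \partial_a \ln \alpha ) - \boldsymbol{\gamma}^a \wedge \boldsymbol{\gamma}^b \varpi_{ab} . \tag{18.37} \]

This curl of the scaled 4-velocity vanishes if and only if the acceleration $K_a$ is the gradient of a scalar, and
the vorticity $\varpi_{ab}$ vanishes,

\[ K_a = \partial_a \ln \alpha , \quad \varpi_{ab} = 0 . \tag{18.38} \]

The conditions (18.38) are precisely those established in the ADM formalism, with $\alpha$ being the lapse. If
conditions (18.38) hold, then $\boldsymbol{D} \wedge ( \boldsymbol{u} / \alpha )$ vanishes, and if torsion also vanishes, then by Poincaré’s lemma $\boldsymbol{u} / \alpha$
is, at least locally, the gradient of a scalar t,

\[ \boldsymbol{D} \wedge ( \boldsymbol{u} / \alpha ) = 0 \quad \Leftrightarrow \quad \boldsymbol{u} = - \alpha \, \boldsymbol{\partial} t . \tag{18.39} \]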




Figure 18.2 Spacetime diagram of Minkowski space illustrating a hypersurface-orthogonal congruence of timelike
geodesics. The congruence is constructed by starting with an initial 3-dimensional spacelike hypersurface (thick line),
here a cosine perturbation from the t = 0 hypersurface, and projecting geodesics (blue lines) along its timelike normal
direction. Hypersurfaces of constant proper time $\tau$ (purple lines) to the past or future of the initial hypersurface
remain orthogonal to the geodesics. Generically, as here, the geodesics cross, and the spatial hypersurfaces of constant
proper time correspondingly develop caustics where the hypersurfaces fold and crease.


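The scalar coordinate t is just the ADM time coordinate, as follows from

\[ \boldsymbol{u} = \boldsymbol{\gamma}_0 = - \boldsymbol{\gamma}^0 = - e^0{}_\mu \boldsymbol{e}^\mu = - \alpha \, \boldsymbol{e}^t = - \alpha \, \boldsymbol{e}^\mu \frac{\partial t}{\partial x^\mu} = - \alpha \, \boldsymbol{\partial} t . \tag{18.40} \]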


18.6.1 Construction of timelike, geodesic, hypersurface-orthogonal congruences 


It is straightforward to construct a timelike, geodesic, hypersurface-orthogonal congruence by starting with
any 3-dimensional spacelike hypersurface and projecting geodesics into the past and future along the normal
to the spacelike hypersurface, as illustrated in Figure 18.2. The geodesics are orthogonal to hypersurfaces
of constant proper time $\tau$, or equivalently of constant action $S = -m\tau$, starting at $\tau = 0$ (or S = 0) on
the initial spacelike hypersurface. Generically, the resulting geodesics will cross at some point in the past
or future or both, and the hypersurface correspondingly develops caustics, as in Figure 18.2. Geodesics
remain orthogonal to hypersurfaces of constant proper time $\tau$ even after they cross, but the proper time $\tau$
is multiply-valued at spacetime points crossed by multiple geodesics.
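
The construction can be sketched numerically in 1+1 dimensional Minkowski space, where geodesics are straight lines. The following minimal sketch assumes an initial hypersurface $t = \epsilon \cos(kx)$ with $\epsilon k < 1$, so that the hypersurface is spacelike; the function names and the parameter values are hypothetical, chosen only for illustration. The unit timelike normal at a point with slope $s = dt/dx$ is $(u^t, u^x) = (1, s)/\sqrt{1 - s^2}$, and neighbouring geodesics cross where $\partial x / \partial x_0 = 0$, that is, at proper time $\tau = -(1 - s^2)^{3/2} / (ds/dx_0)$.

```python
import math

def normal_geodesic(x0, tau, eps=0.1, k=1.0):
    """Point (t, x) at proper time tau on the geodesic launched normally from
    the initial hypersurface t = eps*cos(k*x) at x = x0, in 1+1 Minkowski space."""
    slope = -eps * k * math.sin(k * x0)        # dt/dx on the initial hypersurface
    gamma = 1.0 / math.sqrt(1.0 - slope**2)    # normalizes the normal to u.u = -1
    # Unit timelike normal u = gamma*(1, slope), Minkowski-orthogonal to the tangent (slope, 1).
    t = eps * math.cos(k * x0) + tau * gamma
    x = x0 + tau * gamma * slope
    return t, x

def caustic_time(x0, eps=0.1, k=1.0):
    """Proper time at which neighbouring geodesics launched near x0 cross (dx/dx0 = 0).

    A positive value is a caustic to the future, a negative value one to the past.
    """
    slope = -eps * k * math.sin(k * x0)
    curvature = -eps * k**2 * math.cos(k * x0)  # d^2 t / dx^2 on the initial hypersurface
    if curvature == 0.0:
        return math.inf
    return -((1.0 - slope**2) ** 1.5) / curvature

eps, k = 0.1, 1.0
tau_c = caustic_time(0.0, eps, k)   # at the crest x0 = 0 this equals 1/(eps*k^2)
print("caustic proper time at the crest:", tau_c)

# Two geodesics launched symmetrically about the crest meet at tau_c to high accuracy.
_, x_left = normal_geodesic(-1e-4, tau_c, eps, k)
_, x_right = normal_geodesic(+1e-4, tau_c, eps, k)
print("separation at tau_c:", x_right - x_left)
```

Geodesics launched near a crest of the cosine converge to the future, those launched near a trough converge to the past, so caustics form on both sides of the initial hypersurface.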

Caustics in collisionless streams of stars are often observed in deep images of elliptical galaxies, as illustrated
in Figure 18.3. When galaxies collide, the gravitational potentials of the galaxies merge, but because galaxies
are mostly empty space, the stars in the galaxies do not collide. When a small galaxy with a small velocity
dispersion in its stars falls into a larger galaxy, the smaller galaxy is tidally disrupted by the larger galaxy,
but the stars from the smaller galaxy continue to orbit the larger galaxy in coherent collisionless streams,
forming caustics where the star streams turn around in the merged gravitational potential.

Figure 18.3 This deep image of the elliptical galaxy NGC 474 shows shells caused by caustics in collisionless streams
of stars originating from small galaxies accreted by NGC 474 over the last billion years. Astronomy Picture of the
Day, 2011 July 26. Image credit: P.-A. Duc (CEA, CFHT), Atlas 3D Collaboration.


18.7 Hypersurface-orthogonality for a null congruence 


For massive particles, the proper time 7, or equivalently the action S = —mr = —m?A, where A is the 
affine parameter, progresses along geodesics, and momenta along geodesics are orthogonal to hypersurfaces of 
constant action, equation (18.36). For massless particles on the other hand, the action does not progress along 
null geodesics. For a null congruence, it is not possible to start from an initial 3-dimensional hypersurface 
over which the action vanishes, and to project null geodesics into the past and future from this initial 
hypersurface, because the failure of the action to progress along null geodesics would then imply that the 
action would vanish everywhere, and the spacetime would cease to be foliated into hypersurfaces of constant 
action to which geodesics were putatively orthogonal. 

Rather, the action must be allowed to vary along the initial 3-dimensional hypersurface of a null congruence.
The action on the initial 3-dimensional hypersurface foliates it into 2-dimensional spatial surfaces of constant
action. At each point on each 2-dimensional surface there are exactly 2 null directions orthogonal to the spatial
2-surface, one “outgoing,” the other “ingoing.” Projecting null geodesics along these null directions defines
a pair of 3-dimensional null hypersurfaces along which the action is constant. The result is a spacetime
that is foliated into pairs of outgoing and ingoing 3-dimensional null hypersurfaces of constant outgoing ($+$)
and ingoing ($-$) action $S^\pm$. The values of the actions are determined by their values on the initial non-null
3-dimensional hypersurface.

Null congruences constructed in this way are said to be hypersurface-orthogonal. This definition of
hypersurface-orthogonality for null congruences does not require that equation (18.36) holds across all of
the 4-dimensional spacetime. Rather, hypersurface-orthogonality for null congruences imposes that
equation (18.36) holds in the massless limit along each 3-dimensional null hypersurface of constant action,


Py = lim m*——. (18.41) 


To see why the definition of hypersurface-orthogonality for null congruences does not impose that the
condition (18.36) hold over the entire 4-dimensional spacetime, suppose contrarily that it did. The 4-momentum
along an outgoing null geodesic of the congruence satisfies $\boldsymbol{p} = p^v \boldsymbol{\gamma}_v = p_u \boldsymbol{\gamma}^u$ (no sum over $v$ or $u$). Poincaré’s
lemma implies that equation (18.36) holds, at least locally, if and only if the covariant curl of the 4-momentum
vanishes, $\boldsymbol{D} \wedge \boldsymbol{p} = 0$. The covariant curl of the 4-momentum is, similarly to equation (18.32),


D Ap =Y” AY (OmPn — Ten Pe) =p” Y” AY Tonm 5 (18.42) 


which vanishes if and only if Pyjzmj = 0. This is a set of 6 conditions on the tetrad connections, requiring 
not only that the 2 spatial components of the acceleration Kay, and the 1 component of vorticity K,j—+; = 
@4— = £€4-W vanish, but also that the 1 component of acceleration Puw along the null direction Yyy and 
the 2 components Pvjau] vanish. While the 6 Lorentz gauge freedoms allow these 6 tetrad-frame connections 
to be chosen to vanish along the outgoing congruence, the corresponding 6 connections along the ingoing 
congruence cannot be made to vanish at the same time. Moreover the Raychaudhuri equations (18.29) have 
no dependence on the 2 components Tsjau], and the vorticity equation (18.29b) allows vorticity to vanish 
without requiring that [,,,, vanishes. 

Thus hypersurface-orthogonality for null congruences is conventionally defined by the weaker condition
that the limiting equation (18.41) hold along each 3-dimensional null hypersurface. This requires that only
the components $\boldsymbol{p} \wedge (\boldsymbol{D} \wedge \boldsymbol{p})$ of the covariant curl tangent to each 3-dimensional null hypersurface vanish,
not that the covariant curl vanish identically throughout spacetime. The components of the covariant curl
restricted to the null hypersurface are


p\(DAp) = —(p’)? ay" AN (y” AY Kavy — Y° Ay ab) : (18.43) 


The covariant curl (18.43) is a 3-component bivector whose time-space part is proportional to the spatial 
acceleration Kay ,, and whose space-space part is proportional to the vorticity wap = €ayw- Unlike the timelike 
case, equation (18.37), the hypersurface-orthogonality condition (18.43) for null congruences is unchanged
by scaling the momentum $\boldsymbol{p}$ by some arbitrary factor $a$, since

$$ a \, \boldsymbol{p} \wedge \bigl( \boldsymbol{D} \wedge (\boldsymbol{p}/a) \bigr) = \boldsymbol{p} \wedge (\boldsymbol{D} \wedge \boldsymbol{p}) + \boldsymbol{p} \wedge \boldsymbol{p} \wedge \boldsymbol{\partial} \ln a = \boldsymbol{p} \wedge (\boldsymbol{D} \wedge \boldsymbol{p}) , \tag{18.44} $$

because $\boldsymbol{p} \wedge \boldsymbol{p} = 0$.

Figure 18.4 3D spacetime diagram of Minkowski space illustrating a pair of hypersurface-orthogonal congruences of
null geodesics (blue lines) emerging from a 2-dimensional spacelike surface (thick line). The spacelike curves (purple
lines) on the two null hypersurfaces are lines of constant affine parameter $\lambda$. These lines of constant affine parameter
trace the intersections of null hypersurfaces in the remaining spacetime provided that the congruences are constructed
to have translation symmetry in the y-direction (and in the suppressed z-direction), in which case other null
hypersurfaces are parallel to the two shown, translated in the y-direction. This Figure would look the same as Figure 18.2
if projected on to the t-x plane.

The Raychaudhuri equation (18.29b) for the vorticity w along an outgoing geodesic of a null congruence 
implies that if the vorticity vanishes on the initial 2-dimensional spatial hypersurface spanned by y+, then it 
is guaranteed to vanish thereafter. Thus a null geodesic congruence that is initially hypersurface-orthogonal 
will remain hypersurface-orthogonal thereafter. Note that equation (18.29b) allows the vorticity to vanish 
identically without imposing that the geodesic be affinely parameterized, that is, without imposing that Dy» 


vanishes. 

Hypersurface-orthogonality along the outgoing null congruence imposes only 3 conditions on the tetrad, 
namely that the outgoing spatial acceleration Kay, and the outgoing vorticity w+- = K,j_+4) vanish. The 6 
Lorentz gauge freedoms allow hypersurface-orthogonality to be imposed simultaneously along both outgoing 
and ingoing null congruences, by demanding that the spatial accelerations Kazz and the vorticities K,;4_} 
along both congruences vanish.


18.7.1 Construction of double-null, geodesic, hypersurface-orthogonal congruences


To construct hypersurface-orthogonal congruences of outgoing and ingoing null geodesics, start with any
non-null (timelike or spacelike) 3-dimensional hypersurface. Foliate the hypersurface into 2-dimensional spatial
surfaces labelled by a time coordinate or a spatial coordinate according to whether the parent 3-hypersurface
is timelike or spacelike. Project null geodesics along the two null directions normal to each spatial 2-surface.
The null geodesics projecting from each 2-surface form a pair of 3-dimensional null hypersurfaces, as
illustrated by Figure 18.4. Each null hypersurface is labelled by a constant null coordinate whose value is set by
the value of the time or spatial coordinate on the 2-surface. The two geodesic null directions at each point
define the null directions $\boldsymbol{\gamma}_v$ and $\boldsymbol{\gamma}_u$ of a Newman-Penrose tetrad. The spatial directions orthogonal to the
two null directions define a plane whose tangent directions form the spatial directions $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ of the
Newman-Penrose tetrad.

Again it should be emphasized that hypersurface-orthogonality for null congruences is defined not by 
condition (18.36) imposed over all spacetime, but rather by the limiting condition (18.41) imposed over each 
of the 3-dimensional null hypersurfaces of the congruence. 


18.8 Focusing theorems 


Focusing theorems exist for both timelike and null congruences. The focusing theorem follows from the
Raychaudhuri equation for the expansion $\vartheta$, coupled with assumptions about the sources in that equation.
The assumptions are:

1. the congruence is hypersurface-orthogonal;
2. the expansion is negative at some point, $\vartheta < 0$;
3. the energy-momentum tensor satisfies a positivity condition.

As shown in §§18.6 and 18.7, a hypersurface-orthogonal timelike or null congruence can be constructed 
by starting from some arbitrary (spacelike, for a timelike congruence, or non-null, for a null congruence) 
initial 3-dimensional hypersurface and projecting geodesics orthogonally from it. The requirement that the 
expansion be negative at some point is the reason that singularity theorems posit that a trapped surface has 
formed. A trapped surface is defined to be a closed 2-dimensional surface from which the expansions along 
both outgoing and ingoing orthogonal null directions are negative everywhere along the surface. Trapped 
surfaces exist inside the outer horizon of an ideal black hole, and it is plausible that the formation of a 
trapped surface is characteristic of the formation of a black hole. The final condition, a positivity condition 
on the energy-momentum tensor, ensures that the energy-momentum source in the Raychaudhuri equation 
is positive.


18.8.1 Focusing theorem for a null geodesic congruence


If the vorticity $\varpi$ vanishes, then in the frame parallel-transported along the null congruence, the
Raychaudhuri equation (18.31a) for the expansion $\vartheta$ along a null congruence simplifies to

$$ \frac{d\vartheta}{d\lambda} + \vartheta^2 + \sigma\sigma^* + \tfrac{1}{2} G_{vv} = 0 . \tag{18.45} $$

The terms $\vartheta^2$ and $\sigma\sigma^*$ are necessarily positive. The Newman-Penrose component $G_{vv}$ of the Einstein tensor
is related to the components in the parent orthonormal tetrad by

$$ G_{vv} = \tfrac{1}{2} G_{00} + G_{03} + \tfrac{1}{2} G_{33} . \tag{18.46} $$

The Einstein component $G_{vv}$ has boost weight 2, and is therefore multiplied by $e^{2\theta}$ under a boost by
rapidity $\theta$ in the 3-direction. Consequently positivity of $G_{vv}$ in one frame implies positivity of $G_{vv}$ in any
frame boosted in the 3-direction. Boosted along the 3-direction into the centre-of-mass frame, where $G_{03} = 0$,
equation (18.46) reduces to

$$ G_{vv} = \tfrac{1}{2} ( G_{00} + G_{33} ) = 4\pi ( \rho + p_3 ) , \tag{18.47} $$

where $\rho$ is the energy density and $p_3$ the pressure along the 3-direction. The Einstein component $G_{vv}$ is
therefore positive provided that

$$ \rho + p_3 \geq 0 , \tag{18.48} $$

which is called the null energy condition. If the null energy condition (18.48) holds, then the vorticity-free
Raychaudhuri equation (18.45) shows that the expansion $\vartheta$ must always decrease.

The Raychaudhuri equation (18.45) can be arranged as

$$ \frac{d(1/\vartheta)}{d\lambda} = 1 + \frac{\sigma\sigma^* + \tfrac{1}{2} G_{vv}}{\vartheta^2} , \tag{18.49} $$

whose right hand side is greater than or equal to 1, given the null energy condition (18.48). If the expansion
$\vartheta$ is negative (meaning that light rays are converging), then equation (18.49) shows that $1/\vartheta$ will reach 0 at
a finite value of the affine parameter $\lambda$. In other words, $\vartheta$ must become negative infinite at some finite value
of $\lambda$.
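
The focusing argument can be illustrated numerically. The sketch below assumes only what the text states, namely that the derivative of $1/\vartheta$ with respect to the affine parameter is at least 1. The shear and Einstein-tensor contributions are lumped into a single constant, non-negative stand-in called `source`, which is a hypothetical simplification and not a quantity defined in the text. The function name and step size are likewise illustrative.

```python
def affine_distance_to_focus(theta0, source=0.0, dlam=1e-4, max_lam=100.0):
    """Integrate d(1/theta)/dlam = 1 + source/theta**2 starting from theta0 < 0.

    `source` is a hypothetical constant, non-negative stand-in for the
    shear-plus-matter term; source = 0 saturates the bound.
    Returns the affine distance at which 1/theta first reaches zero.
    """
    if theta0 >= 0:
        raise ValueError("the focusing argument needs a negative initial expansion")
    if source < 0:
        raise ValueError("source must be non-negative (energy condition)")
    inv_theta = 1.0 / theta0              # starts negative
    lam = 0.0
    while inv_theta < 0.0 and lam < max_lam:
        # source/theta^2 is the same as source * inv_theta^2
        inv_theta += (1.0 + source * inv_theta ** 2) * dlam
        lam += dlam
    return lam


theta0 = -2.0
bound = 1.0 / abs(theta0)                 # focusing must occur within this affine distance
lam_free = affine_distance_to_focus(theta0)               # no shear, no matter
lam_src = affine_distance_to_focus(theta0, source=3.0)    # extra focusing from the source
print(bound, lam_free, lam_src)
```

With no source the expansion diverges at an affine distance of $1/|\vartheta_0|$ from the starting point, and any non-negative source only brings the crossing point closer.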

A negative infinite value of the expansion means that the cross-sectional area of the null congruence has 
shrunk to zero. This does not mean that a singularity has formed; it means simply that geodesics have reached 
a crossing point. For example, Figure 18.4 shows crossing geodesics of a null congruence in Minkowski space. 
It is only when all geodesics from a hypersurface-orthogonal congruence reach a crossing point that the 
spacetime encounters difficulties. In Figure 18.4, while the expansion is negative along some null geodesics, 
it is positive along others.

Figure 18.5 Spacetime diagram illustrating the dog-leg proposition. The dog-leg proposition asserts that any dog-leg
path that joins 2 events A and B by a consecutive pair of null or timelike geodesics can be deformed into a strictly
timelike path of longer proper time between A and B. The proposition is an assertion about the global causal structure
of spacetime.


18.8.2 Focusing theorem for timelike geodesic congruence 


The proof of the focusing theorem for timelike geodesics is similar to that for null geodesics. For vanishing 
vorticity, the Raychaudhuri equation (18.18a) along a timelike geodesic congruence is 


dd 
T + yg? + 40 oan + + Roo =0 (18.50) 


in the orthonormal tetrad frame freely-falling along the geodesic. The component $R_{00}$ of the Ricci tensor in
the orthonormal tetrad is

$$ R_{00} = 4\pi ( \rho + 3p ) , \tag{18.51} $$

where $\rho$ is the energy density and $p$ is the isotropic pressure, one third of the trace of the spatial pressure
tensor. The Ricci component $R_{00}$ is positive provided that

$$ \rho + 3p \geq 0 , \tag{18.52} $$

which is called the strong energy condition. Note that a cosmological constant violates the strong energy
condition (18.52), but not the null energy condition (18.48).
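
The remark about the cosmological constant can be checked in one line. The sketch below assumes the standard perfect-fluid description of a cosmological constant $\Lambda$, with energy density $\rho_\Lambda = \Lambda/(8\pi)$ and isotropic pressure $p_\Lambda = -\rho_\Lambda$ in units $G = c = 1$. That identification is not spelled out in this section.

```latex
\rho_\Lambda + 3 p_\Lambda = \rho_\Lambda - 3 \rho_\Lambda = -2 \rho_\Lambda = -\frac{\Lambda}{4\pi} < 0
\quad (\Lambda > 0) ,
\qquad
\rho_\Lambda + p_\Lambda = 0 \geq 0 .
```

A positive cosmological constant therefore violates the strong energy condition, while it satisfies the null energy condition marginally.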


18.9 Singularity theorems 


This section gives an account of one version of the singularity theorems, the original null version proved by 
Penrose (1965). See Senovilla (1998) for a review of singularity theorems.

Figure 18.6 Null boundary of the future of a 2-dimensional spacelike surface. The null boundary is a pair of
3-dimensional null surfaces projecting orthogonally from the 2-surface (thick line), with the parts of the hypersurfaces
excised after geodesic crossing, since the latter parts are connected by timelike geodesics to the 2-surface and are
therefore not part of the null boundary. This is the same as Figure 18.4, but with geodesics terminated where they
cross.


18.9.1 Dog-leg proposition 


A building block of singularity theorems is the dog-leg proposition. The dog-leg proposition asserts that 
any dog-leg path between two events A and B that consists of two different timelike or null geodesics joined 
together can be deformed into a strictly timelike path of longer proper time between A and B, as illustrated in 
Figure 18.5. The dog-leg proposition is a statement about the global causal structure of spacetime. The dog- 
leg proposition does not hold inside the inner horizon of a Kerr-Newman black hole, Concept question 18.3. 

The dog-leg proposition can be replaced by other plausible hypotheses. Much of the content of the book 
by Hawking and Ellis (1973) is concerned with exploring different plausible causality conditions. However, 
that will not be done here. 


18.9.2 Null singularity theorem 


Start with any 2-dimensional spatial surface. The future of this 2-surface is the 4-dimensional region of 
spacetime comprising all events that can be reached by some non-spacelike future-pointing path that starts 
at some point on the 2-surface. In a local neighbourhood of the 2-surface, the boundary of the future of the 
2-surface comprises the pair of 3-dimensional null hypersurfaces projected orthogonally from the 2-surface, 
as illustrated by Figure 18.4. The dog-leg proposition then implies that the future boundary is formed only 
from orthogonally-projected null geodesics. However, orthogonally-projected geodesics can intersect, as in 
Figure 18.4. After two orthogonally-projected geodesics intersect, a point to the future of the intersection can 
be reached from the 2-surface by starting on one geodesic and switching to the other at the crossing point.
This dog-leg null path can be deformed into a timelike curve, and is therefore also not part of the future
null boundary. Therefore the boundary of the future of the 2-surface comprises the pair of 3-dimensional 
null hypersurfaces projected orthogonally from the 2-surface, truncated where the null geodesics cross, as 
illustrated by Figure 18.6. 

Now assume that the 2-dimensional surface is a trapped surface, meaning that the expansion along both 
the outgoing and ingoing null geodesic directions projected orthogonally from the 2-surface is negative at 
every point of the 2-surface. The focusing theorem implies that the expansion along every such null geodesic 
reaches negative infinity at a finite value of the affine parameter $\lambda$, indicating that neighbouring null geodesics
are crossing. Points on a null geodesic to the future of a crossing are no longer on the boundary of the future. 
Therefore the 3-dimensional boundary of the future of the trapped surface terminates after a finite affine 
parameter at a 2-dimensional caustic boundary. This is a contradiction, since the boundary of a boundary 
of a manifold is empty. Therefore the future must terminate, as it does for example inside the horizon of the 
Schwarzschild geometry. 


Concept question 18.3. How do singularity theorems apply to the Kerr geometry? Answer. 
The Kerr geometry violates the dog-leg proposition, so for this geometry the future does not terminate, but
rather continues beyond the region where any trapped surface reaches a caustic boundary (see §23.24.1). As
found in Exercise ??, the only geodesics that reach the ring singularity (Singularity or Parallel Singularity)
of a Kerr black hole with $a \neq 0$ are null geodesics that lie in the equatorial plane. Therefore, to reach the
singularity from a non-equatorial point, it is necessary to follow a geodesic down to the equatorial plane 
and then dog-leg to the singularity. Such a path cannot be deformed to a timelike geodesic. Similarly, a 
geodesic that starts at the singularity is confined to the equatorial plane, and a dog-leg is required to get 
out of the plane. The region that can be reached from the singularity by a dog-legged geodesic is the region 
inside the inner horizon. The ingoing and outgoing inner horizons of a Kerr black hole form the boundary of 
predictability, also known as the Cauchy horizon. A similar argument applies to the Kerr-Newman geometry, 
except that geodesics that hit the singularity must not only be null and equatorial, but also on one of the 
ingoing or outgoing principal null congruences, Exercise ??. 


Concept question 18.4. How do singularity theorems apply to the Reissner-Nordström geometry?
In Reissner-Nordström, the only geodesics that hit the singularity are radial null geodesics, Exercise ??.
The Reissner-Nordström geometry violates the dog-leg proposition because a dog-leg path that connects to the
singularity cannot be deformed into a strictly timelike path: any path that connects to the singularity must be
null asymptotically near the singularity.


Concept Questions 


1. Explain how the equation for the Gullstrand-Painlevé metric (19.22) encodes not merely a metric but a
full vierbein.

2. In what sense does the Gullstrand-Painlevé metric (19.22) depict a flow of space? [Are the coordinates
moving? If not, then what is moving?]

3. If space has no substance, what does it mean that space falls into a black hole?

4. Would there be any gravitational field in a spacetime where space fell at constant velocity instead of
accelerating?

5. In spherically symmetric spacetimes, what is the most important Einstein equation, the one that causes
Reissner-Nordström black holes to be repulsive in their interiors, and causes mass inflation in non-empty
(non Reissner-Nordström) charged black holes?


What’s important?


1. The tetrad formalism provides a firm mathematical foundation for the concept that space falls faster 
than light inside a black hole. 

2. Whereas the Kerr-Newman geometry of an ideal rotating black hole contains inside its horizon wormhole
and white hole connections to other universes, real black holes are subject to the mass inflation instability
discovered by Eric Poisson & Werner Israel (Poisson and Israel, 1990).


Black hole waterfalls


19.1 Tetrads move through coordinates 


As already discussed in §11.3, the way in which metrics are commonly written, as a (weighted) sum of squares
of differentials,

$$ ds^2 = \gamma_{mn} \, e^m{}_\mu \, e^n{}_\nu \, dx^\mu dx^\nu , \tag{19.1} $$

encodes not only a metric $g_{\mu\nu} = \gamma_{mn} e^m{}_\mu e^n{}_\nu$, but also a vierbein $e^m{}_\mu$, and consequently an inverse vierbein
$e_m{}^\mu$, and associated tetrad $\boldsymbol{\gamma}_m$. Most commonly the tetrad metric is orthonormal (Minkowski), $\gamma_{mn} = \eta_{mn}$,
but other tetrad metrics, such as Newman-Penrose, occur. Usually it is self-evident from the form of the
line-element what the tetrad metric $\gamma_{mn}$ is in any particular case.

If the tetrad is orthonormal, $\gamma_{mn} = \eta_{mn}$, then the 4-velocity $u^m$ of an object at rest in the tetrad, or
equivalently the 4-velocity of the tetrad rest frame itself, is

$$ u^m = \{ 1, 0, 0, 0 \} . \tag{19.2} $$

The tetrad-frame 4-velocity (19.2) of the tetrad rest frame is transformed to a coordinate-frame 4-velocity
$u^\mu$ in the usual way, by applying the inverse vierbein,

$$ \frac{dx^\mu}{d\tau} = u^\mu = e_m{}^\mu u^m = e_0{}^\mu . \tag{19.3} $$

Equation (19.3) says that the tetrad rest frame moves through the coordinates at coordinate 4-velocity given
by the zeroth row of the inverse vierbein, $dx^\mu/d\tau = e_0{}^\mu$. The coordinate 4-velocity $u^\mu$ is related to the lapse
$\alpha$ and shift $\beta^\alpha$ in the ADM formalism by $u^\mu = \{ 1, \beta^\alpha \}/\alpha$, equation (17.11).

The idea that locally inertial frames move through the coordinates provides the simplest way to conceptu- 
alize black holes. The motion of locally inertial frames through coordinates is what is meant by the “dragging 
of inertial frames” around rotating masses. 


Exercise 19.1. Tetrad frame of a rotating wheel. Derive the line-element of Minkowski space adapted
to the tetrad frame of a wheel uniformly rotating at angular velocity $\omega$. Show that a clock attached to the
wheel ticks slow by the Lorentz factor $\gamma$ compared to a clock in the non-rotating frame, and that rulers
attached to the wheel measure the rim to be Lorentz-contracted by a factor $\gamma$ compared to the non-rotating
frame.
Solution. Start with the line-element of Minkowski space in cylindrical coordinates $x^\mu = \{ t, r, \phi, z \}$,

$$ ds^2 = - dt^2 + dr^2 + r^2 d\phi^2 + dz^2 . \tag{19.4} $$

The vierbein for the line-element (19.4) is $e^m{}_\mu = \mathrm{diag}(1, 1, r, 1)$, and the corresponding inverse vierbein is
$e_m{}^\mu = \mathrm{diag}(1, 1, 1/r, 1)$. Lorentz boost the inverse vierbein into the tetrad frame of the wheel rotating at
velocity $v = r\omega$ in the azimuthal $\phi$ direction,

$$ e_m{}^\mu =
\begin{pmatrix} \gamma & 0 & \gamma r \omega & 0 \\ 0 & 1 & 0 & 0 \\ \gamma r \omega & 0 & \gamma & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/r & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} \gamma & 0 & \gamma \omega & 0 \\ 0 & 1 & 0 & 0 \\ \gamma r \omega & 0 & \gamma / r & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} .
\tag{19.5} $$

The coordinate-frame 4-velocity of the wheel’s tetrad frame through the coordinates is

$$ \frac{dx^\mu}{d\tau} = u^\mu = e_0{}^\mu = \{ \gamma, 0, \gamma\omega, 0 \} , \tag{19.6} $$

confirming that indeed the wheel is moving at $d\phi/dt = \omega$. The line-element is

$$ ds^2 = - \gamma^2 ( dt - r^2 \omega \, d\phi )^2 + dr^2 + \gamma^2 r^2 ( d\phi - \omega \, dt )^2 + dz^2 . \tag{19.7} $$

A point on the wheel follows $dr = d\phi - \omega \, dt = dz = 0$, so its proper time satisfies

$$ d\tau = \gamma ( dt - r^2 \omega \, d\phi ) = \gamma ( 1 - r^2 \omega^2 ) \, dt = \frac{dt}{\gamma} , \tag{19.8} $$

demonstrating that a clock on the wheel runs slow by $\gamma$ as claimed. Rulers attached to the rim of the wheel
measure distances that are simultaneous in the frame of the wheel, corresponding to $dt - r^2 \omega \, d\phi = 0$. Thus
corotating rulers measure azimuthal distances along the rim of

$$ dl = \gamma r ( d\phi - \omega \, dt ) = \gamma r ( 1 - r^2 \omega^2 ) \, d\phi = \frac{r \, d\phi}{\gamma} , \tag{19.9} $$

demonstrating that the rim is Lorentz-contracted by $\gamma$ as claimed.
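
The boosted inverse vierbein of the rotating wheel can be sanity-checked numerically. The sketch below is a minimal check under illustrative values ($r = 2$, $\omega = 0.3$, chosen here and not taken from the text). It builds the inverse vierbein with rows $\{\gamma, 0, \gamma\omega, 0\}$, $\{0, 1, 0, 0\}$, $\{\gamma r\omega, 0, \gamma/r, 0\}$, $\{0, 0, 0, 1\}$, confirms that it still reproduces the inverse Minkowski metric in cylindrical coordinates, and reads off the wheel's angular velocity and clock rate from its zeroth row. The function names are hypothetical.

```python
import math

def boosted_inverse_vierbein(r, omega):
    """Inverse vierbein e_m^mu of the rotating-wheel tetrad: rows m, columns (t, r, phi, z)."""
    v = r * omega                          # rim velocity
    g = 1.0 / math.sqrt(1.0 - v * v)       # Lorentz factor gamma
    return [
        [g,     0.0, g * omega, 0.0],
        [0.0,   1.0, 0.0,       0.0],
        [g * v, 0.0, g / r,     0.0],
        [0.0,   0.0, 0.0,       1.0],
    ]

def inverse_metric(e_inv):
    """g^{mu nu} = eta^{mn} e_m^mu e_n^nu with eta = diag(-1, 1, 1, 1)."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[sum(eta[m] * e_inv[m][mu] * e_inv[m][nu] for m in range(4))
             for nu in range(4)] for mu in range(4)]

r, omega = 2.0, 0.3                        # rim speed v = 0.6, so gamma = 1.25
e_inv = boosted_inverse_vierbein(r, omega)
g_inv = inverse_metric(e_inv)

# The inverse Minkowski metric in cylindrical coordinates is diag(-1, 1, 1/r^2, 1).
expected = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0 / r ** 2, 0.0], [0.0, 0.0, 0.0, 1.0]]
max_err = max(abs(g_inv[i][j] - expected[i][j]) for i in range(4) for j in range(4))

u = e_inv[0]                               # coordinate 4-velocity of the wheel frame
dphi_dt = u[2] / u[0]                      # angular velocity, should equal omega
dtau_dt = 1.0 / u[0]                       # clock rate, should equal 1/gamma
print(max_err, dphi_dt, dtau_dt)
```

Because a Lorentz boost leaves the tetrad metric unchanged, the boosted tetrad describes the same Minkowski geometry. Only the frame moving through the coordinates has changed.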


19.2 Gullstrand-Painlevé waterfall 


The Gullstrand-Painlevé metric is a version of the metric for a spherical (Schwarzschild or Reissner-Nordström)
black hole discovered in 1921 independently by Allvar Gullstrand (Gullstrand, 1922) and Paul Painlevé
(Painlevé, 1921). Although Gullstrand’s paper was published in 1922, after Painlevé’s, it appears that
Gullstrand’s work has priority. Gullstrand’s paper was dated 25 May 1921, whereas Painlevé’s is a write up of a
presentation to the Académie des Sciences in Paris on 24 October 1921. Moreover, Gullstrand seems to have
had a better grasp of what he had discovered than Painlevé, for Gullstrand recognized that observables such
as the redshift of light from the Sun are unaffected by the choice of coordinates in the Schwarzschild
geometry, whereas Painlevé, noting that the spatial metric was flat at constant free-fall time, $dt_{\rm ff} = 0$, concluded
in his final sentence that, as regards the redshift of light and such, “c’est pure imagination de prétendre tirer
du $ds^2$ des conséquences de cette nature” (“it is pure imagination to claim to draw consequences of this
nature from the $ds^2$”).

Figure 19.1 Radial velocity $\beta$ in (upper panel) a Schwarzschild black hole, and (lower panel) a Reissner-Nordström
black hole with electric charge $Q = 0.96$.

Although neither Gullstrand nor Painlevé understood it, their metric paints a picture of space falling like
a river, or waterfall, into a spherical black hole, Figure 6.1. The river has two key features: first, the river
flows in Galilean fashion through a flat Galilean background, equation (19.25); and second, as a freely-falling
fishy swims through the river, its 4-velocity, or more generally any 4-vector attached to it, evolves by a
series of infinitesimal Lorentz boosts induced by the change in the velocity of the river from place to place,
equation (19.30). Because the river moves in Galilean fashion, it can, and inside the horizon does, move
faster than light through the background coordinates. However, objects moving in the river move according
to the rules of special relativity, and so cannot move faster than light through the river.


19.2.1 Gullstrand-Painlevé tetrad 
The Gullstrand-Painlevé metric (7.27) is

$$ ds^2 = - dt_{\rm ff}^2 + ( dr - \beta \, dt_{\rm ff} )^2 + r^2 ( d\theta^2 + \sin^2\!\theta \, d\phi^2 ) , \tag{19.10} $$

where $\beta$ is defined to be the radial velocity of a person who free-falls radially from rest at infinity,

$$ \beta = \frac{dr}{d\tau} = \frac{dr}{dt_{\rm ff}} , \tag{19.11} $$

and $t_{\rm ff}$ is the free-fall time, the proper time experienced by a person who free-falls from rest at infinity. The
radial velocity $\beta$ is the (apparently) Newtonian escape velocity

$$ \beta = \mp \sqrt{\frac{2 M(r)}{r}} , \tag{19.12} $$

where $M(r)$ is the interior mass within radius $r$, and the sign is $-$ (infalling) for a black hole, $+$ (outfalling)
for a white hole. For the Schwarzschild or Reissner-Nordström geometry the interior mass $M(r)$ is the mass
$M$ at infinity minus the mass $Q^2/2r$ in the electric field outside $r$,

$$ M(r) = M - \frac{Q^2}{2r} . \tag{19.13} $$

Figure 19.1 illustrates the velocity fields in Schwarzschild and Reissner-Nordström black holes. Horizons
occur where the radial velocity $\beta$ equals the speed of light

$$ \beta = \mp 1 , \tag{19.14} $$

with $-$ for black hole solutions, $+$ for white hole solutions. The phenomenology of Schwarzschild and
Reissner-Nordström black holes has already been explored in Chapters 7 and 8.
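
The horizon condition can be made concrete with a few lines of code. The sketch below evaluates the river velocity $\beta = -\sqrt{2M(r)/r}$ with interior mass $M(r) = M - Q^2/(2r)$, and locates the horizons from $\beta^2 = 1$, which is the quadratic $r^2 - 2Mr + Q^2 = 0$. It assumes units in which $M = 1$ and the charge $Q = 0.96$ quoted in the caption of Figure 19.1. The function names are hypothetical.

```python
import math

def interior_mass(r, M=1.0, Q=0.0):
    """Interior mass M(r) = M - Q^2/(2r)."""
    return M - Q * Q / (2.0 * r)

def river_velocity(r, M=1.0, Q=0.0):
    """Infall velocity beta = -sqrt(2 M(r)/r) for a black hole (minus sign = infalling).

    Returns None where M(r) < 0, where the square root is not real.
    """
    m = interior_mass(r, M, Q)
    if m < 0.0:
        return None
    return -math.sqrt(2.0 * m / r)

def horizons(M=1.0, Q=0.0):
    """Radii where beta^2 = 1, i.e. the roots of r^2 - 2 M r + Q^2 = 0."""
    if Q == 0.0:
        return [2.0 * M]             # Schwarzschild: single horizon
    disc = M * M - Q * Q
    if disc < 0.0:
        return []                    # no horizon
    root = math.sqrt(disc)
    return [M + root, M - root]      # outer, inner

r_outer, r_inner = horizons(M=1.0, Q=0.96)
beta_outer = river_velocity(r_outer, 1.0, 0.96)
beta_inner = river_velocity(r_inner, 1.0, 0.96)
r_turn = 0.96 ** 2 / 2.0             # radius where M(r) = 0 and the river comes to rest
beta_turn = river_velocity(r_turn, 1.0, 0.96)
print(r_outer, r_inner, beta_outer, beta_inner, beta_turn)
```

For $Q = 0.96$ the river reaches the speed of light at the outer horizon, slows back to the speed of light at the inner horizon, and comes to rest at $r = Q^2/(2M)$, where the interior mass vanishes.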


Exercise 19.2. Coordinate transformation from Schwarzschild to Gullstrand-Painlevé. Show that
the Schwarzschild metric transforms into the Gullstrand-Painlevé metric under the coordinate transformation
of the time coordinate

$$ dt_{\rm ff} = dt - \frac{\beta}{1 - \beta^2} \, dr . \tag{19.15} $$

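
One way to carry out the check in Exercise 19.2 is sketched below. It assumes the Schwarzschild line-element written in terms of the river velocity, with $\beta^2 = 2M/r$, and the time transformation $dt_{\rm ff} = dt - \beta \, dr / (1 - \beta^2)$. The shorthand $do^2 \equiv d\theta^2 + \sin^2\!\theta \, d\phi^2$ is introduced here for brevity.

```latex
\begin{aligned}
ds^2 &= -(1-\beta^2)\,dt^2 + \frac{dr^2}{1-\beta^2} + r^2\,do^2 ,
\qquad
dt = dt_{\rm ff} + \frac{\beta}{1-\beta^2}\,dr , \\
-(1-\beta^2)\,dt^2 &= -(1-\beta^2)\,dt_{\rm ff}^2 - 2\beta\,dt_{\rm ff}\,dr - \frac{\beta^2}{1-\beta^2}\,dr^2 , \\
ds^2 &= -(1-\beta^2)\,dt_{\rm ff}^2 - 2\beta\,dt_{\rm ff}\,dr + dr^2 + r^2\,do^2
      = -dt_{\rm ff}^2 + (dr - \beta\,dt_{\rm ff})^2 + r^2\,do^2 .
\end{aligned}
```

The two $dr^2$ terms combine as $(1 - \beta^2)/(1 - \beta^2) = 1$, which is why the spatial part of the Gullstrand-Painlevé metric is flat at constant free-fall time.
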
Exercise 19.3. Velocity of a person who free-falls radially from rest. Confirm that $\beta$ given by
equation (21.36) is indeed the velocity (19.11) of a person who free-falls radially from rest at infinity in the
Reissner-Nordström geometry.

The Gullstrand-Painlevé line-element (19.10) encodes a vierbein with an orthonormal tetrad metric
$\gamma_{mn} = \eta_{mn}$ through

$$ e^0{}_\mu \, dx^\mu = dt_{\rm ff} , \tag{19.16a} $$
$$ e^1{}_\mu \, dx^\mu = dr - \beta \, dt_{\rm ff} , \tag{19.16b} $$
$$ e^2{}_\mu \, dx^\mu = r \, d\theta , \tag{19.16c} $$
$$ e^3{}_\mu \, dx^\mu = r \sin\theta \, d\phi . \tag{19.16d} $$

Explicitly, the vierbein $e^m{}_\mu$ of the Gullstrand-Painlevé line-element (19.10), and the corresponding inverse
vierbein $e_m{}^\mu$, are the matrices

$$ e^m{}_\mu =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ -\beta & 1 & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & r \sin\theta \end{pmatrix} ,
\quad
e_m{}^\mu =
\begin{pmatrix} 1 & \beta & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/r & 0 \\ 0 & 0 & 0 & 1/(r \sin\theta) \end{pmatrix} .
\tag{19.17} $$

According to equation (19.3), the coordinate 4-velocity of the tetrad frame through the coordinates is

$$ \left\{ \frac{dt_{\rm ff}}{d\tau} , \frac{dr}{d\tau} , \frac{d\theta}{d\tau} , \frac{d\phi}{d\tau} \right\} = u^\mu = e_0{}^\mu = \{ 1, \beta, 0, 0 \} , \tag{19.18} $$

consistent with the claim (19.11) that $\beta$ represents a radial velocity, while $t_{\rm ff}$ coincides with the proper time
in the tetrad frame.
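
The vierbein and inverse vierbein of the Gullstrand-Painlevé line-element can be verified numerically. The sketch below assumes a Schwarzschild river with $M = 1$ evaluated at the illustrative point $r = 8$, $\theta = \pi/3$, where $\beta = -1/2$. It checks that the two matrices are mutually inverse, that $\eta_{mn} e^m{}_\mu e^n{}_\nu$ reproduces the metric coefficients $g_{tt} = -(1 - \beta^2)$, $g_{tr} = -\beta$, $g_{rr} = 1$, and that the zeroth row of the inverse vierbein is $\{1, \beta, 0, 0\}$. The function names are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]   # orthonormal tetrad metric eta_mn = diag(-1, 1, 1, 1)

def gp_vierbein(beta, r, theta):
    """Vierbein e^m_mu: rows m, columns (t_ff, r, theta, phi)."""
    return [
        [1.0,   0.0, 0.0, 0.0],
        [-beta, 1.0, 0.0, 0.0],
        [0.0,   0.0, r,   0.0],
        [0.0,   0.0, 0.0, r * math.sin(theta)],
    ]

def gp_inverse_vierbein(beta, r, theta):
    """Inverse vierbein e_m^mu: rows m, columns (t_ff, r, theta, phi)."""
    return [
        [1.0, beta, 0.0,     0.0],
        [0.0, 1.0,  0.0,     0.0],
        [0.0, 0.0,  1.0 / r, 0.0],
        [0.0, 0.0,  0.0,     1.0 / (r * math.sin(theta))],
    ]

def metric_from_vierbein(e):
    """g_{mu nu} = eta_mn e^m_mu e^n_nu."""
    return [[sum(ETA[m] * e[m][mu] * e[m][nu] for m in range(4))
             for nu in range(4)] for mu in range(4)]

M, r, theta = 1.0, 8.0, math.pi / 3.0
beta = -math.sqrt(2.0 * M / r)            # infalling river, beta = -0.5
e = gp_vierbein(beta, r, theta)
e_inv = gp_inverse_vierbein(beta, r, theta)

# e_m^mu e^n_mu should be the identity delta_m^n.
identity_err = max(
    abs(sum(e_inv[m][mu] * e[n][mu] for mu in range(4)) - (1.0 if m == n else 0.0))
    for m in range(4) for n in range(4))

g = metric_from_vierbein(e)
u = e_inv[0]                               # coordinate 4-velocity of the tetrad frame
print(identity_err, g[0][0], g[0][1], g[1][1], u)
```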

The tetrad and coordinate axes $\boldsymbol{\gamma}_m$ and $\boldsymbol{e}_\mu$ are related to each other by the vierbein in the usual way,
$\boldsymbol{\gamma}_m = e_m{}^\mu \boldsymbol{e}_\mu$ and $\boldsymbol{e}_\mu = e^m{}_\mu \boldsymbol{\gamma}_m$. The Gullstrand-Painlevé orthonormal tetrad axes $\boldsymbol{\gamma}_m$ are thus related to
the coordinate axes $\boldsymbol{e}_\mu$ by

$$ \boldsymbol{\gamma}_0 = \boldsymbol{e}_{t_{\rm ff}} + \beta \, \boldsymbol{e}_r , \quad
\boldsymbol{\gamma}_1 = \boldsymbol{e}_r , \quad
\boldsymbol{\gamma}_2 = \boldsymbol{e}_\theta / r , \quad
\boldsymbol{\gamma}_3 = \boldsymbol{e}_\phi / ( r \sin\theta ) . \tag{19.19} $$

Physically, the Gullstrand-Painlevé tetrad (19.19) are the axes of locally inertial orthonormal
frames (with spatial axes $\boldsymbol{\gamma}_a$ oriented in the polar directions $r, \theta, \phi$) attached to observers who free-fall
radially, without rotating, starting from zero velocity and zero angular momentum at infinity. The fact
that the tetrad axes $\boldsymbol{\gamma}_m$ are parallel-transported, without precessing, along the worldlines of the radially
free-falling observers can be confirmed by checking that the tetrad connections $\Gamma_{nm0}$ with final index 0 all
vanish, which implies that

$$ \frac{d\boldsymbol{\gamma}_m}{d\tau} = \partial_0 \boldsymbol{\gamma}_m = \Gamma^n{}_{m0} \, \boldsymbol{\gamma}_n = 0 . \tag{19.20} $$

That the proper time derivative $d/d\tau$ in equation (19.20) of a person at rest in the tetrad frame, with
4-velocity (19.2), is equal to the directed time derivative $\partial_0$ follows from

$$ \frac{d}{d\tau} = \frac{dx^\mu}{d\tau} \frac{\partial}{\partial x^\mu} = u^m \partial_m = \partial_0 . \tag{19.21} $$


19.2.2 Gullstrand-Painlevé-Cartesian tetrad


The manner in which the Gullstrand-Painlevé line-element depicts a flow of space into a black hole is
elucidated further if the line-element is written in Cartesian rather than spherical polar coordinates. Introduce a
Cartesian coordinate system $x^\mu = \{ t_{\rm ff}, x^\alpha \} = \{ t_{\rm ff}, x, y, z \}$. The Gullstrand-Painlevé metric in these Cartesian
coordinates is

$$ ds^2 = - dt_{\rm ff}^2 + \delta_{\alpha\beta} ( dx^\alpha - \beta^\alpha dt_{\rm ff} ) ( dx^\beta - \beta^\beta dt_{\rm ff} ) , \tag{19.22} $$

with implicit summation over spatial indices $\alpha, \beta = x, y, z$. The $\beta^\alpha$ in the metric (19.22) are the components
of the radial velocity expressed in Cartesian coordinates

$$ \beta^\alpha = \beta \left\{ \frac{x}{r} , \frac{y}{r} , \frac{z}{r} \right\} . \tag{19.23} $$

The vierbein $e^m{}_\mu$ and inverse vierbein $e_m{}^\mu$ encoded in the Gullstrand-Painlevé-Cartesian line-element (19.22)
are

$$ e^m{}_\mu =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ -\beta^1 & 1 & 0 & 0 \\ -\beta^2 & 0 & 1 & 0 \\ -\beta^3 & 0 & 0 & 1 \end{pmatrix} ,
\quad
e_m{}^\mu =
\begin{pmatrix} 1 & \beta^x & \beta^y & \beta^z \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} .
\tag{19.24} $$

The tetrad axes $\boldsymbol{\gamma}_m$ of the Gullstrand-Painlevé-Cartesian line-element (19.22) are related to the coordinate
tangent axes $\boldsymbol{e}_\mu$ by

$$ \boldsymbol{\gamma}_0 = \boldsymbol{e}_{t_{\rm ff}} + \beta^\alpha \boldsymbol{e}_\alpha , \quad
\boldsymbol{\gamma}_a = \delta_a^\alpha \, \boldsymbol{e}_\alpha , \tag{19.25} $$

and conversely the coordinate tangent axes $\boldsymbol{e}_\mu$ are related to the tetrad axes $\boldsymbol{\gamma}_m$ by

$$ \boldsymbol{e}_{t_{\rm ff}} = \boldsymbol{\gamma}_0 - \beta^a \boldsymbol{\gamma}_a , \quad
\boldsymbol{e}_\alpha = \delta_\alpha^a \, \boldsymbol{\gamma}_a . \tag{19.26} $$

Note that the tetrad-frame contravariant components $\beta^a$ of the radial velocity coincide with the
coordinate-frame contravariant components $\beta^\alpha$; for clarification of this point see the more general equation (19.54)
for a rotating black hole. The Gullstrand-Painlevé-Cartesian tetrad axes (19.25) are the same as the tetrad
axes (19.19), but rotated to point in Cartesian directions $x, y, z$ rather than in polar directions $r, \theta, \phi$. Like the
polar tetrad, the Cartesian tetrad axes $\boldsymbol{\gamma}_m$ are parallel-transported, without precessing, along the worldlines
of radially free-falling observers, as can be confirmed by checking once again that the tetrad connections
$\Gamma_{nm0}$ with final index 0 all vanish.

Remarkably, the transformation (19.25) from coordinate to tetrad axes is just a Galilean transformation
of space and time, which shifts the time axis by velocity $\beta$ along the direction of motion, but which leaves
unchanged both the time component of the time axis and all the spatial axes. In other words, the black
hole behaves as if it were a river of space that flows radially inward through Galilean space and time at the
Newtonian escape velocity.


19.2.3 Gullstrand-Painlevé fishies


The Gullstrand-Painlevé line-element paints a picture of locally inertial frames falling like a river of space into 
a spherical black hole. What happens to fishies swimming in that river? Of course general relativity supplies 
a mathematical answer in the form of the geodesic equation of motion (19.27). Does that mathematical 
answer lead to further conceptual insight? 

Consider a fishy swimming in the Gullstrand-Painlevé river, with some arbitrary tetrad-frame 4-velocity
$u^m$, and consider a tetrad-frame 4-vector $p^k$ attached to the fishy. If the fishy is in free-fall, then the geodesic
equation of motion for $p^k$ is as usual

$$ \frac{dp^k}{d\tau} + \Gamma^k{}_{mn} u^n p^m = 0 . \tag{19.27} $$

As remarked in §11.11, for a constant (for example Minkowski) tetrad metric, as here, the tetrad connections
$\Gamma^k{}_{mn}$ constitute a set of four generators of Lorentz transformations, one in each of the directions $n$. In
particular $\Gamma^k{}_{mn} u^n$ is the generator of a Lorentz transformation along the path of a fishy moving with
4-velocity $u^n$. In a small (infinitesimal) time $\delta\tau$, the fishy moves a proper distance $\delta\xi^m = u^m \delta\tau$ relative to the
infalling river. This proper distance $\delta\xi^m = e^m{}_\mu \delta x^\mu = \delta^m_\mu (\delta x^\mu - \beta^\mu \delta t_{\rm ff}) = \delta x^m - \beta^m \delta t_{\rm ff}$ equals the distance
$\delta x^m$ moved relative to the background Gullstrand-Painlevé-Cartesian coordinates, minus the distance $\beta^m \delta t_{\rm ff}$
moved by the river. The geodesic equation (19.27) says that the change $\delta p^k$ in the tetrad 4-vector $p^k$ in the
time $\delta\tau$ is

$$ \delta p^k = - \Gamma^k{}_{mn} \, \delta\xi^n p^m . \tag{19.28} $$

Equation (19.28) describes an infinitesimal Lorentz transformation $- \Gamma^k{}_{mn} \delta\xi^n$ of the 4-vector $p^k$.

Equation (19.28) is quite general in general relativity: it says that as a 4-vector $p^k$ free-falls through a
system of locally inertial tetrads, it finds itself Lorentz-transformed relative to those tetrads. What is special
about the Gullstrand-Painlevé-Cartesian tetrad is that the tetrad-frame connections, computed by the usual
formula (11.54), are given by the coordinate gradient of the radial velocity (the following equation is valid
component-by-component despite the non-matching up-down placement of indices)

$$ \Gamma^0{}_{ab} = \Gamma^a{}_{0b} = \frac{\partial \beta^a}{\partial x^b} \quad (a, b = 1, 2, 3) . \tag{19.29} $$

The same property, that the tetrad connections are a pure coordinate gradient, holds also for the Doran-Cartesian
tetrad for a rotating black hole, equation (19.57). With the connections (19.29), the change
$\delta p^k$ (19.28) in the tetrad 4-vector is

$$ \delta p^0 = - \delta\beta^a p^a , \quad \delta p^a = - \delta\beta^a p^0 , \tag{19.30} $$

where $\delta\beta^a$ is the change in the velocity of the river as seen in the tetrad frame,

$$ \delta\beta^a = \delta\xi^b \frac{\partial \beta^a}{\partial x^b} . \tag{19.31} $$

But equation (19.30) is nothing more than an infinitesimal Lorentz boost by a velocity change $\delta\beta^a$. This
shows that a fishy swimming in the river follows the rules of special relativity, being Lorentz boosted by tidal
changes $\delta\beta^a$ in the river velocity from place to place.

Is it correct to interpret equation (19.31) as giving the change $\delta\beta^a$ in the river velocity seen by a fishy? Of
course general relativity demands that equation (19.31) be mathematically correct; the issue is merely one
of interpretation. Shouldn't the change in the river velocity really be

$$ \delta\beta^a = \delta x^\nu \frac{\partial \beta^a}{\partial x^\nu} , \tag{19.32} $$

where $\delta x^\nu$ is the full change in the coordinate position of the fishy? No. Part of the change (19.32) in the
river velocity can be attributed to the change in the velocity of the river itself over the time $\delta\tau$, which is
$\delta x^\nu_{\rm river} \, \partial\beta^a / \partial x^\nu$ with $\delta x^\nu_{\rm river} = \beta^\nu \delta t_{\rm ff}$. The change in the velocity relative to the flowing river is

$$ \delta\beta^a = (\delta x^\nu - \delta x^\nu_{\rm river}) \frac{\partial \beta^a}{\partial x^\nu} = (\delta x^\nu - \beta^\nu \delta t_{\rm ff}) \frac{\partial \beta^a}{\partial x^\nu} , \tag{19.33} $$

which reproduces the earlier expression (19.31). Indeed, in the picture of fishies being carried by the river,
it is essential to subtract the change in velocity of the river itself, as in equation (19.33), because otherwise
fishies at rest in the river (going with the flow) would not continue to remain at rest in the river.
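
The boost prescription of this subsection can be tried out numerically. The following is a minimal sketch, not a definitive implementation: it assumes a Schwarzschild black hole of mass `M = 1` with river velocity $\beta^a = -\sqrt{2M/r}\, x^a/r$, treats the fishy's own 4-velocity $u^m$ as the transported 4-vector, and advances it with a fourth-order Runge-Kutta step. The function names and the combination $E = u^0 + \beta^a u^a$ (which should be conserved, because the line-element does not depend on the free-fall time) are choices made here for illustration.

```python
import math

M = 1.0  # black hole mass in units c = G = 1 (illustrative value)

def river(x):
    """Return the river velocity beta^a and its gradient d beta^a / d x^b at x."""
    r = math.sqrt(sum(c * c for c in x))
    b = -math.sqrt(2.0 * M / r)  # infall at the Newtonian escape velocity
    beta = [b * c / r for c in x]
    grad = [[(b / r) * ((i == j) - 1.5 * x[i] * x[j] / r**2)
             for j in range(3)] for i in range(3)]
    return beta, grad

def rhs(s):
    """Proper-time derivative of the state s = [x, y, z, u0, u1, u2, u3]."""
    x, u0, u = s[0:3], s[3], s[4:7]
    beta, grad = river(x)
    # Tidal change of the river velocity per unit proper time, as in (19.31).
    dbeta = [sum(grad[a][b] * u[b] for b in range(3)) for a in range(3)]
    dx = [u[a] + beta[a] * u0 for a in range(3)]   # motion through the coordinates
    du0 = -sum(dbeta[a] * u[a] for a in range(3))  # infinitesimal boost, as in (19.30)
    du = [-dbeta[a] * u0 for a in range(3)]
    return dx + [du0] + du

def rk4_step(s, h):
    def shifted(k, f):
        return [si + f * ki for si, ki in zip(s, k)]
    k1 = rhs(s)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return [si + h / 6 * (a + 2 * b + 2 * c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]

def energy(s):
    """E = u^0 + beta^a u^a, expected to be constant along a geodesic."""
    beta, _ = river(s[0:3])
    return s[3] + sum(b * c for b, c in zip(beta, s[4:7]))

def norm(s):
    """Tetrad-frame norm of the 4-velocity, expected to stay at -1."""
    return -s[3] ** 2 + sum(c * c for c in s[4:7])

def integrate(s, h=0.01, steps=500):
    for _ in range(steps):
        s = rk4_step(s, h)
    return s

u = [0.0, 0.3, 0.0]  # fishy swimming sideways relative to the river
start = [10.0, 0.0, 0.0, math.sqrt(1.0 + 0.3 ** 2)] + u
end = integrate(start)
print("energy drift:", energy(end) - energy(start))
print("norm drift:  ", norm(end) + 1.0)
```

A fishy started at rest in the river, with $u^m = \{1, 0, 0, 0\}$, has vanishing boost terms at every step, so it stays at rest in the river while being carried inward through the coordinates.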


19.3 Boyer-Lindquist tetrad 


The Boyer-Lindquist metric for an ideal rotating black hole was explored already in Chapter 9. With the 
tetrad formalism in hand, the advantages of the Boyer-Lindquist tetrad for portraying the Kerr-Newman 
geometry become manifest. With respect to the orthonormal Boyer-Lindquist tetrad, the electromagnetic 
field is purely radial, and the energy-momentum and Weyl tensors are diagonal. The Boyer-Lindquist tetrad 
is aligned with the principal (outgoing and ingoing) null congruences. 
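The Boyer-Lindquist orthonormal tetrad is encoded in the Boyer-Lindquist metric

$$ ds^2 = - \frac{R^2 \Delta}{\rho^2} \left( dt - a \sin^2\!\theta \, d\phi \right)^2 + \frac{\rho^2}{R^2 \Delta} dr^2 + \rho^2 d\theta^2 + \frac{R^4 \sin^2\!\theta}{\rho^2} \left( d\phi - \frac{a}{R^2} dt \right)^2 , \tag{19.34} $$

where

$$ R \equiv \sqrt{r^2 + a^2} , \quad \rho \equiv \sqrt{r^2 + a^2 \cos^2\!\theta} , \quad \Delta \equiv 1 - \frac{2Mr}{R^2} + \frac{Q^2}{R^2} = 1 - \beta^2 . \tag{19.35} $$

Explicitly, the vierbein $e^m{}_\mu$ of the Boyer-Lindquist orthonormal tetrad is

$$ e^m{}_\mu = \begin{pmatrix} R\sqrt{\Delta}/\rho & 0 & 0 & - a \sin^2\!\theta \, R\sqrt{\Delta}/\rho \\ 0 & \rho/(R\sqrt{\Delta}) & 0 & 0 \\ 0 & 0 & \rho & 0 \\ - a \sin\theta/\rho & 0 & 0 & R^2 \sin\theta/\rho \end{pmatrix} , \tag{19.36} $$

with inverse vierbein $e_m{}^\mu$

$$ e_m{}^\mu = \frac{1}{\rho} \begin{pmatrix} R/\sqrt{\Delta} & 0 & 0 & a/(R\sqrt{\Delta}) \\ 0 & R\sqrt{\Delta} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ a \sin\theta & 0 & 0 & 1/\sin\theta \end{pmatrix} . \tag{19.37} $$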
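
As a quick consistency check, the two matrices can be multiplied numerically. The sketch below is illustrative only: the function names are invented here, the horizon function is taken to be $\Delta = 1 - 2Mr/R^2 + Q^2/R^2$, and the sample point is chosen outside the outer horizon so that $\sqrt{\Delta}$ is real. It checks that $e^m{}_\mu e_n{}^\mu = \delta^m_n$ and rebuilds the metric from $g_{\mu\nu} = \eta_{mn} e^m{}_\mu e^n{}_\nu$.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski tetrad metric

def bl_vierbein(r, theta, a, M, Q):
    """Vierbein e^m_mu and inverse e_m^mu; rows m = 0..3, columns t, r, theta, phi."""
    R2 = r * r + a * a
    R = math.sqrt(R2)
    rho = math.sqrt(r * r + (a * math.cos(theta)) ** 2)
    sqrt_delta = math.sqrt(1.0 - 2.0 * M * r / R2 + Q * Q / R2)
    s = math.sin(theta)
    e = [[R * sqrt_delta / rho, 0.0, 0.0, -a * s * s * R * sqrt_delta / rho],
         [0.0, rho / (R * sqrt_delta), 0.0, 0.0],
         [0.0, 0.0, rho, 0.0],
         [-a * s / rho, 0.0, 0.0, R2 * s / rho]]
    e_inv = [[R / (sqrt_delta * rho), 0.0, 0.0, a / (R * sqrt_delta * rho)],
             [0.0, R * sqrt_delta / rho, 0.0, 0.0],
             [0.0, 0.0, 1.0 / rho, 0.0],
             [a * s / rho, 0.0, 0.0, 1.0 / (s * rho)]]
    return e, e_inv

def contract(e, e_inv):
    """Matrix of e^m_mu e_n^mu, which should be the unit matrix."""
    return [[sum(e[m][mu] * e_inv[n][mu] for mu in range(4))
             for n in range(4)] for m in range(4)]

def metric(e):
    """Coordinate metric g_mu_nu = eta_mn e^m_mu e^n_nu."""
    return [[sum(ETA[m] * e[m][mu] * e[m][nu] for m in range(4))
             for nu in range(4)] for mu in range(4)]

r, theta, a, M, Q = 5.0, 1.0, 0.5, 1.0, 0.3  # sample point and parameters
e, e_inv = bl_vierbein(r, theta, a, M, Q)
unit = contract(e, e_inv)
g = metric(e)
print(unit[0][0], unit[0][3], g[0][3])
```

The off-diagonal component `g[0][3]` should reduce to the familiar frame-dragging term $-a \sin^2\!\theta \, (2Mr - Q^2)/\rho^2$.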


With respect to the Boyer-Lindquist tetrad, only the time component $A^t$ of the electromagnetic potential
$A^m$ is non-vanishing,

$$ A^m = \left\{ \frac{Qr}{\rho R \sqrt{\Delta}} , 0 , 0 , 0 \right\} . \tag{19.38} $$

Only the radial components $E$ and $B$ of the electric and magnetic fields are non-vanishing, and they are
given by the complex combination

$$ E + I B = \frac{Q}{(r - I a \cos\theta)^2} , \tag{19.39} $$

or explicitly

$$ E = \frac{Q \, (r^2 - a^2 \cos^2\!\theta)}{\rho^4} , \quad B = \frac{2 Q a r \cos\theta}{\rho^4} . \tag{19.40} $$

The electromagnetic field (19.39) satisfies Maxwell's equations (22.56) with zero electric charge and current,
$j^n = 0$, except at the singularity $\rho = 0$.
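
The equivalence of the complex form (19.39) and the explicit form can be checked in a few lines. This is only a sketch: it assumes that the pseudoscalar $I$ may be represented by the ordinary imaginary unit for the purposes of this arithmetic, and the function names are made up for the illustration.

```python
import math

def field_complex(r, theta, a, Q):
    """E + I B from the complex combination Q / (r - I a cos(theta))^2."""
    return Q / complex(r, -a * math.cos(theta)) ** 2

def field_explicit(r, theta, a, Q):
    """Radial electric and magnetic fields written out in real form."""
    c = a * math.cos(theta)
    rho4 = (r * r + c * c) ** 2
    return Q * (r * r - c * c) / rho4, 2.0 * Q * r * c / rho4

F = field_complex(3.0, 0.7, 0.9, 0.5)
E, B = field_explicit(3.0, 0.7, 0.9, 0.5)
print(F.real - E, F.imag - B)
```

In the equatorial plane $\cos\theta = 0$, so the magnetic part vanishes and the electric part reduces to $Q/r^2$.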
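The non-vanishing components of the tetrad-frame Einstein tensor $G_{mn}$ are

$$ G_{mn} = \frac{Q^2}{\rho^4} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{19.41} $$

which is the energy-momentum tensor of the electromagnetic field. The non-vanishing components of the
tetrad-frame Weyl tensor $C_{klmn}$ are

$$ - \tfrac{1}{2} C_{0101} = \tfrac{1}{2} C_{2323} = C_{0202} = C_{0303} = - C_{1212} = - C_{1313} = {\rm Re}\, C , \tag{19.42a} $$

$$ \tfrac{1}{2} C_{0123} = C_{0213} = - C_{0312} = {\rm Im}\, C , \tag{19.42b} $$

where $C$ is the complex Weyl scalar

$$ C = - \frac{1}{(r - I a \cos\theta)^3} \left( M - \frac{Q^2}{r + I a \cos\theta} \right) . \tag{19.43} $$

In the Boyer-Lindquist tetrad, the photon 4-velocity $v^m = e^m{}_\mu v^\mu = e^m{}_\mu \, dx^\mu / d\lambda$ on the principal null
congruences is radial,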


v = +—— y =i v? =0 ; v? =0. (19.44) 
Exercise 19.4. Dragging of inertial frames around a Kerr-Newman black hole. What is the
coordinate-frame 4-velocity $u^\mu$ of the Boyer-Lindquist tetrad through the Boyer-Lindquist coordinates?


19.4 Doran waterfall 


The picture of space falling into a black hole like a river or waterfall works also for rotating black holes. For 
Kerr-Newman rotating black holes, the counterpart of the Gullstrand-Painlevé metric is the Doran (2000) 
metric. 

The space river that falls into a rotating black hole has a twist. One might have expected that the rotation 
of the black hole would be manifested by a velocity that spirals inward, but that is not the case. Instead, 
the river is characterized not merely by a velocity but also by a twist. The velocity and the twist together 
comprise a 6-dimensional river bivector $\omega_{km}$, equation (19.58) below, whose electric part is the velocity, and
whose magnetic part is the twist. Recall that the 6-dimensional group of Lorentz transformations is generated 
by a combination of 3-dimensional Lorentz boosts and 3-dimensional spatial rotations. A fishy that swims 
through the river is Lorentz boosted by tidal changes in the velocity, and rotated by tidal changes in the 
twist, equation (19.67). 

Thanks to the twist, unlike the Gullstrand-Painlevé metric, the Doran metric is not spatially flat at 
constant free-fall time $t_{\rm ff}$. Rather, the spatial metric is sheared in the azimuthal direction. Just as the
velocity produces a Lorentz boost that makes the metric non-flat with respect to the time components, so 
also the twist produces a rotation that makes the metric non-flat with respect to the spatial components. 


19.4.1 Doran-Cartesian coordinates 


In place of the polar coordinates $\{r, \theta, \phi_{\rm ff}\}$ of the Doran metric, equations (9.33), introduce corresponding
Doran-Cartesian coordinates $\{x, y, z\}$ with $z$ taken along the rotation axis of the black hole (the black hole
rotates right-handedly about $z$, for positive spin parameter $a$)

$$ x \equiv R \sin\theta \cos\phi_{\rm ff} , \quad y \equiv R \sin\theta \sin\phi_{\rm ff} , \quad z \equiv r \cos\theta . \tag{19.45} $$

The metric in Doran-Cartesian coordinates $x^\mu \equiv \{t_{\rm ff}, x^a\} \equiv \{t_{\rm ff}, x, y, z\}$ is

$$ ds^2 = - dt_{\rm ff}^2 + \delta_{ab} \left( dx^a - \beta^a \alpha_\mu dx^\mu \right) \left( dx^b - \beta^b \alpha_\nu dx^\nu \right) , \tag{19.46} $$

where $\alpha_\mu$ is the rotational velocity vector

$$ \alpha_\mu = \left\{ 1 , \frac{a y}{R^2} , - \frac{a x}{R^2} , 0 \right\} , \tag{19.47} $$

and $\beta^\mu$ is the velocity vector

$$ \beta^\mu = \frac{\beta R}{\rho} \left\{ 0 , \frac{x r}{R \rho} , \frac{y r}{R \rho} , \frac{z R}{r \rho} \right\} . \tag{19.48} $$

The rotational velocity and radial velocity vectors are orthogonal

$$ \alpha_\mu \beta^\mu = 0 . \tag{19.49} $$

For the Kerr-Newman metric, the radial velocity $\beta$ is

$$ \beta = \mp \frac{\sqrt{2Mr - Q^2}}{R} , \tag{19.50} $$

with $-$ for black hole (infalling), $+$ for white hole (outfalling) solutions. Horizons occur where

$$ \beta = \mp 1 , \tag{19.51} $$

with $\beta = -1$ for black hole horizons, and $\beta = 1$ for white hole horizons. Note that the squared magnitude
$\beta_\mu \beta^\mu$ of the velocity vector is not $\beta^2$, but rather differs from $\beta^2$ by a factor of $R^2/\rho^2$:

$$ \beta_\mu \beta^\mu = \beta_m \beta^m = \frac{\beta^2 R^2}{\rho^2} . \tag{19.52} $$

The point of the convention adopted here is that $\beta(r)$ is any function of $r$, and only of $r$, rather than depending
also on $\theta$ through $\rho$. Moreover, with the convention here, $\beta$ is $\mp 1$ at horizons, equation (19.51). Finally, the
4-vector $\beta^\mu$ is simply related to $\beta$ by $\beta^\mu = \beta \, \partial r / \partial x^\mu$.

The Doran-Cartesian metric (19.46) encodes a vierbein $e^m{}_\mu$ and inverse vierbein $e_m{}^\mu$

$$ e^m{}_\mu = \delta^m_\mu - \alpha_\mu \beta^m , \quad e_m{}^\mu = \delta_m^\mu + \alpha_m \beta^\mu . \tag{19.53} $$

Here the tetrad-frame components $\alpha_m$ of the rotational velocity vector and $\beta^m$ of the radial velocity vector
are

$$ \alpha_m = e_m{}^\mu \alpha_\mu = \delta_m^\mu \alpha_\mu , \quad \beta^m = e^m{}_\mu \beta^\mu = \delta^m_\mu \beta^\mu , \tag{19.54} $$

which works thanks to the orthogonality (19.49) of $\alpha_\mu$ and $\beta^\mu$. Equation (19.54) says that the covariant tetrad-frame
components of the rotational velocity vector are the same as its covariant coordinate-frame components
in the Doran-Cartesian coordinate system, $\alpha_m = \alpha_\mu$, and likewise the contravariant tetrad-frame components
of the radial velocity vector are the same as its contravariant coordinate-frame components, $\beta^m = \beta^\mu$.
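
The algebraic properties claimed in this subsection lend themselves to a numerical spot check. The sketch below makes several assumptions of its own: the function names are invented, the black hole (infalling) sign of $\beta$ is chosen, and the radius $r$ is recovered from $\{x, y, z\}$ by solving $(x^2 + y^2)/(r^2 + a^2) + z^2/r^2 = 1$, which follows from the definitions (19.45). It tests the orthogonality (19.49), the magnitude (19.52), the fact that $\beta^a$ is $\beta$ times the coordinate gradient of $r$ (by finite differences), and that the two matrices of (19.53) are inverse to each other.

```python
import math

def radius(x, y, z, a):
    """Boyer-Lindquist radius r of a Doran-Cartesian point (assumes z != 0)."""
    s = x * x + y * y + z * z - a * a
    return math.sqrt(0.5 * (s + math.sqrt(s * s + 4.0 * a * a * z * z)))

def doran_fields(x, y, z, a, M, Q=0.0):
    """Return (beta, R, rho, alpha_mu, beta^mu) at a Doran-Cartesian point."""
    r = radius(x, y, z, a)
    R = math.sqrt(r * r + a * a)
    rho = math.sqrt(r * r + (a * z / r) ** 2)      # uses cos(theta) = z / r
    b = -math.sqrt(2.0 * M * r - Q * Q) / R        # infalling sign of (19.50)
    alpha = [1.0, a * y / R**2, -a * x / R**2, 0.0]
    pre = b * R / rho
    beta = [0.0, pre * x * r / (R * rho), pre * y * r / (R * rho),
            pre * z * R / (r * rho)]
    return b, R, rho, alpha, beta

def vierbein(alpha, beta):
    """e^m_mu and e_m^mu of (19.53), indexed [m][mu]."""
    e = [[(m == mu) - alpha[mu] * beta[m] for mu in range(4)] for m in range(4)]
    e_inv = [[(m == mu) + alpha[m] * beta[mu] for mu in range(4)] for m in range(4)]
    return e, e_inv

x, y, z, a, M = 3.0, 2.0, 1.5, 0.9, 1.0  # sample point and parameters
b, R, rho, alpha, beta = doran_fields(x, y, z, a, M)
e, e_inv = vierbein(alpha, beta)

h = 1e-5  # step for central finite differences of r(x, y, z)
grad_r = [(radius(x + h, y, z, a) - radius(x - h, y, z, a)) / (2 * h),
          (radius(x, y + h, z, a) - radius(x, y - h, z, a)) / (2 * h),
          (radius(x, y, z + h, a) - radius(x, y, z - h, a)) / (2 * h)]

print(sum(p * q for p, q in zip(alpha, beta)))
print(sum(c * c for c in beta) - (b * R / rho) ** 2)
print([beta[i + 1] - b * grad_r[i] for i in range(3)])
```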


19.4.2 Doran-Cartesian tetrad 


Like the Gullstrand-Painlevé tetrad, the Doran-Cartesian tetrad $\gamma_m \equiv \{\gamma_0, \gamma_1, \gamma_2, \gamma_3\}$ is aligned with the
Cartesian rest frame $e_\mu \equiv \{e_{t_{\rm ff}}, e_x, e_y, e_z\}$ at infinity, and is parallel-transported, without precessing, by
observers who free-fall from zero velocity and zero angular momentum at infinity, as can be confirmed by
checking that the tetrad connections with final index 0 all vanish, $\Gamma_{nm0} = 0$, equation (19.20).

Let $\parallel$ and $\perp$ subscripts denote horizontal radial and azimuthal directions respectively, so that

$$ \gamma_\parallel \equiv \cos\phi_{\rm ff} \, \gamma_1 + \sin\phi_{\rm ff} \, \gamma_2 , \quad \gamma_\perp \equiv - \sin\phi_{\rm ff} \, \gamma_1 + \cos\phi_{\rm ff} \, \gamma_2 , $$

$$ e_\parallel \equiv \cos\phi_{\rm ff} \, e_x + \sin\phi_{\rm ff} \, e_y , \quad e_\perp \equiv - \sin\phi_{\rm ff} \, e_x + \cos\phi_{\rm ff} \, e_y . \tag{19.55} $$

Then the relation between Doran-Cartesian tetrad axes $\gamma_m$ and the tangent axes $e_\mu$ of the Doran-Cartesian
metric (19.46) is

$$ \gamma_0 = e_{t_{\rm ff}} + \beta^a e_a , \tag{19.56a} $$
$$ \gamma_\parallel = e_\parallel , \tag{19.56b} $$
$$ \gamma_\perp = e_\perp - \frac{a \sin\theta}{R} \, \beta^a e_a , \tag{19.56c} $$
$$ \gamma_3 = e_z . \tag{19.56d} $$

The relations (19.56) resemble those (19.25) of the Gullstrand-Painlevé tetrad, except that the azimuthal
tetrad axis $\gamma_\perp$ is shifted radially relative to the azimuthal tangent axis $e_\perp$. This shift reflects the fact that,
unlike the Gullstrand-Painlevé metric, the Doran metric is not spatially flat at constant free-fall time, but
rather is sheared azimuthally.


19.4.3 Doran fishies 


The tetrad-frame connections equal the ordinary coordinate partial derivatives in Doran-Cartesian coordinates
of a bivector (antisymmetric tensor) $\omega_{km}$

$$ \Gamma_{kmn} = - \frac{\partial \omega_{km}}{\partial x^n} , \tag{19.57} $$

which I call the river field because it encapsulates all the properties of the infalling river of space. The
bivector river field $\omega_{km}$ is

$$ \omega_{km} = \alpha_k \beta_m - \alpha_m \beta_k - \varepsilon_{0kma} \zeta^a , \tag{19.58} $$

where $\beta_m = \eta_{mn} \beta^n$, the totally antisymmetric tensor $\varepsilon_{klmn}$ is normalized so that $\varepsilon_{0123} = -1$, and the vector
$\zeta^k$ points vertically upward along the rotation axis of the black hole

$$ \zeta^k \equiv \{0, 0, 0, \zeta\} , \quad \zeta \equiv a \int_\infty^r \frac{\beta \, dr}{R^2} . \tag{19.59} $$

The electric part of $\omega_{km}$, where one of the indices is time 0, constitutes the velocity vector $\beta^a$

$$ \omega_{0a} = \beta^a , \tag{19.60} $$

while the magnetic part of $\omega_{km}$, where both indices are spatial, constitutes the twist vector $\mu^a$ defined by

$$ \mu^a \equiv \tfrac{1}{2} \varepsilon^{0akm} \omega_{km} = \varepsilon^{0akm} \alpha_k \beta_m + \zeta^a . \tag{19.61} $$


The sense of the twist is that it induces a right-handed rotation about an axis equal to the direction of $\mu^a$ by
an angle equal to the magnitude of $\mu^a$. In 3-vector notation, with $\boldsymbol{\mu} \equiv \mu^a$, $\boldsymbol{\alpha} \equiv \alpha_a$, $\boldsymbol{\beta} \equiv \beta^a$, $\boldsymbol{\zeta} \equiv \zeta^a$,

$$ \boldsymbol{\mu} \equiv \boldsymbol{\alpha} \times \boldsymbol{\beta} + \boldsymbol{\zeta} . \tag{19.62} $$

In terms of the velocity and twist vectors, the river field $\omega_{km}$ is

$$ \omega_{km} = \begin{pmatrix} 0 & \beta^1 & \beta^2 & \beta^3 \\ -\beta^1 & 0 & \mu^3 & -\mu^2 \\ -\beta^2 & -\mu^3 & 0 & \mu^1 \\ -\beta^3 & \mu^2 & -\mu^1 & 0 \end{pmatrix} . \tag{19.63} $$

Note that the sign of the electric part $\boldsymbol{\beta}$ of $\omega_{km}$ is opposite to the sign of the analogous electric field $\boldsymbol{E}$
associated with an electromagnetic field $F_{km}$, equation (4.46); but the adopted signs are natural in that the
river field induces boosts in the direction of the velocity $\beta^a$, and right-handed rotations about the twist $\mu^a$.
Like a static electric field, the velocity vector $\beta^a$ is the gradient of a potential

$$ \beta^a = \frac{\partial}{\partial x^a} \int^r \beta \, dr , \tag{19.64} $$

but unlike a magnetic field the twist vector $\mu^a$ is not pure curl: rather, it is $\mu^a + \zeta^a$ that is pure curl.
Figure 19.2 illustrates the velocity and twist fields in a Kerr black hole.
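
The bookkeeping that takes $\alpha_k$, $\beta^k$ and $\zeta^k$ to the river field and back to the velocity and twist can be sketched in code. The example below is an illustration under stated assumptions: the normalization is taken as $\varepsilon_{0123} = -1$ (so that $\varepsilon^{0123} = +1$ for the Minkowski tetrad metric), the input numbers are arbitrary sample values rather than a solution of the Doran geometry, and the function names are invented here.

```python
def epsilon_lower(i, j, k, l):
    """Totally antisymmetric epsilon_{ijkl}, normalized so epsilon_{0123} = -1."""
    idx = (i, j, k, l)
    if len(set(idx)) < 4:
        return 0
    inversions = sum(idx[p] > idx[q] for p in range(4) for q in range(p + 1, 4))
    return -1 if inversions % 2 == 0 else 1

def river_field(alpha, beta, zeta):
    """omega_km = alpha_k beta_m - alpha_m beta_k - epsilon_{0kma} zeta^a."""
    beta_low = [-beta[0]] + list(beta[1:])  # lower the index with eta_mn
    return [[alpha[k] * beta_low[m] - alpha[m] * beta_low[k]
             - sum(epsilon_lower(0, k, m, a) * zeta[a] for a in range(1, 4))
             for m in range(4)] for k in range(4)]

def twist(omega):
    """mu^a = (1/2) epsilon^{0akm} omega_km, with epsilon^{0akm} = -epsilon_{0akm}."""
    return [0.5 * sum(-epsilon_lower(0, a, k, m) * omega[k][m]
                      for k in range(1, 4) for m in range(1, 4))
            for a in range(1, 4)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

alpha = [1.0, 0.2, -0.3, 0.0]    # sample alpha_k
beta = [0.0, -0.4, -0.1, -0.25]  # sample beta^k
zeta = [0.0, 0.0, 0.0, 0.05]     # sample zeta^k along the rotation axis

omega = river_field(alpha, beta, zeta)
mu = twist(omega)
expected = [c + z for c, z in zip(cross(alpha[1:], beta[1:]), zeta[1:])]
print(mu, expected)
```

The first row of `omega` returns the velocity, and the spatial block returns the twist $\boldsymbol{\alpha} \times \boldsymbol{\beta} + \boldsymbol{\zeta}$ in the arrangement of the matrix (19.63).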
With the tetrad connection coefficients given by equation (19.57), the equation of motion (19.27) for a
4-vector $p^k$ attached to a fishy following a geodesic in the Doran river translates to

$$ \frac{dp^k}{d\tau} = \frac{\partial \omega^k{}_m}{\partial x^n} u^n p^m . \tag{19.65} $$

In a proper time $\delta\tau$, the fishy moves a proper distance $\delta\xi^m \equiv u^m \delta\tau$ relative to the background Doran-Cartesian
coordinates. As a result, the fishy sees a tidal change $\delta\omega^k{}_m$ in the river field

$$ \delta\omega^k{}_m = \delta\xi^n \frac{\partial \omega^k{}_m}{\partial x^n} . \tag{19.66} $$

Consequently the 4-vector $p^k$ is changed by

$$ p^k \rightarrow p^k + \delta\omega^k{}_m p^m . \tag{19.67} $$

But equation (19.67) corresponds to an infinitesimal Lorentz transformation by $\delta\omega^k{}_m$, equivalent to a Lorentz
boost by $\delta\beta^a$ and a rotation by $\delta\mu^a$.

As discussed previously with regard to the Gullstrand-Painlevé river, §19.2.3, the tidal change $\delta\omega^k{}_m$,
equation (19.66), in the river field seen by a fishy is not the full change $\delta x^\nu \, \partial\omega^k{}_m / \partial x^\nu$ relative to the
background coordinates, but rather the change relative to the river

$$ \delta\omega^k{}_m = (\delta x^\nu - \delta x^\nu_{\rm river}) \frac{\partial \omega^k{}_m}{\partial x^\nu} = \left[ \delta x^\nu - \beta^\nu (\delta t_{\rm ff} - a \sin^2\!\theta \, \delta\phi_{\rm ff}) \right] \frac{\partial \omega^k{}_m}{\partial x^\nu} , \tag{19.68} $$

with the change in the velocity and twist of the river itself subtracted off.

That there exists a tetrad (the Doran-Cartesian tetrad) where the tetrad-frame connections are a coordinate
gradient of a bivector, equation (19.57), is a peculiar feature of ideal black holes. It is an intriguing
thought that perhaps the 6 physical degrees of freedom of a general spacetime might always be encoded in
the 6 degrees of freedom of a bivector, but that is not true.

Figure 19.2 (Upper panel) velocity $\beta^a$ and (lower panel) twist $\mu^a$ vector fields for a Kerr black hole with spin parameter
$a = 0.96$. Both vectors lie, as shown, in the plane of constant free-fall azimuthal angle $\phi_{\rm ff}$. The vertical bar in the
lower panel shows the length of a twist vector corresponding to a full rotation of 360°.


Exercise 19.5. River model of the Friedmann-Lemaître-Robertson-Walker metric. Show that the
flat FLRW line-element

$$ ds^2 = - dt^2 + a^2 (dx^2 + x^2 do^2) \tag{19.69} $$

can be re-expressed as

$$ ds^2 = - dt^2 + (dr - H r \, dt)^2 + r^2 do^2 , \tag{19.70} $$

where $r \equiv a x$ is the proper radial distance, and $H \equiv \dot{a}/a$ is the Hubble parameter. Interpret the line-element
(19.70). Is there a generalization to a non-flat FLRW universe?


Exercise 19.6. Program geodesics in a rotating black hole. Write a graphics program that uses the 
prescription (19.66) to draw geodesics of test particles in an ideal (Kerr-Newman) black hole, expressed in 
Doran-Cartesian coordinates. Attach 3D bodies to your test particles, and use the same prescription (19.66) 
to rotate the bodies. Implement an option to translate to Boyer-Lindquist coordinates.


General spherically symmetric spacetimes


20.1 Spherical spacetime 


Spherical spacetimes have 2 physical degrees of freedom. Spherical symmetry eliminates any angular degrees 
of freedom, leaving 4 adjustable metric coefficients $g_{tt}$, $g_{tr}$, $g_{rr}$, and $g_{\theta\theta}$. But coordinate transformations
of the time t and radial r coordinates remove 2 degrees of freedom, leaving a spherical spacetime with a 
net 2 physical degrees of freedom. Spherical spacetimes have 4 distinct Einstein equations (20.39). But 2 of 
the Einstein equations serve to enforce energy-momentum conservation, so the evolution of the spacetime is 
governed by 2 Einstein equations, in agreement with the number of physical degrees of freedom of spherical 
spacetime. 

The 2 degrees of freedom mean that spherical spacetimes in general relativity have a richer structure than 
in Newtonian gravity, which has only one degree of freedom, the Newtonian potential $\Phi$. The richer structure
is most striking in the case of the mass inflation instability, Chapter 21, which is an intrinsically general 
relativistic instability, with no Newtonian analogue. 


20.2 Spherical line-element 


The spherical line-element adopted in this Chapter is, in spherical polar coordinates $x^\mu \equiv \{t, r, \theta, \phi\}$,

$$ ds^2 = - \alpha^2 dt^2 + \frac{1}{\beta_1^2} \left( dr - \beta_0 \alpha \, dt \right)^2 + r^2 do^2 . \tag{20.1} $$

Here $r$ is the circumferential radius, defined such that the circumference around any great circle is $2\pi r$.
The line-element (20.1) is in ADM form (17.8) with lapse $\alpha$ and shift $\beta_0 \alpha$. The notation $\beta_m$ is motivated
by the fact that $\{\beta_0, \beta_1, 0, 0\}$ forms a tetrad-frame 4-vector, equation (20.9). As expounded in §11.3, through
$ds^2 = \eta_{mn} e^m{}_\mu e^n{}_\nu dx^\mu dx^\nu$ the line-element (20.1) encodes not only a metric, but also a locally inertial tetrad
$\gamma_m \equiv \{\gamma_0, \gamma_1, \gamma_2, \gamma_3\}$. The off-diagonal character of the line-element allows the tetrad to flow through the
coordinates. This flexibility is especially useful for black holes, since no locally inertial frame can remain at
rest inside the horizon of a black hole.

The vierbein $e^m{}_\mu$ can be read off from the line-element (20.1):

$$ e^0{}_\mu dx^\mu = \alpha \, dt , \tag{20.2a} $$
$$ e^1{}_\mu dx^\mu = \frac{1}{\beta_1} \left( dr - \beta_0 \alpha \, dt \right) , \tag{20.2b} $$
$$ e^2{}_\mu dx^\mu = r \, d\theta , \tag{20.2c} $$
$$ e^3{}_\mu dx^\mu = r \sin\theta \, d\phi . \tag{20.2d} $$

The vierbein $e^m{}_\mu$ and inverse vierbein $e_m{}^\mu$ corresponding to the spherical line-element (20.1) are

$$ e^m{}_\mu = \begin{pmatrix} \alpha & 0 & 0 & 0 \\ - \beta_0 \alpha / \beta_1 & 1/\beta_1 & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & r \sin\theta \end{pmatrix} , \quad e_m{}^\mu = \begin{pmatrix} 1/\alpha & \beta_0 & 0 & 0 \\ 0 & \beta_1 & 0 & 0 \\ 0 & 0 & 1/r & 0 \\ 0 & 0 & 0 & 1/(r \sin\theta) \end{pmatrix} . \tag{20.3} $$

As in the ADM formalism, §17.1, the tetrad time axis $\gamma_0$ is chosen to be orthogonal to hypersurfaces of
constant time $t$. The directed derivatives $\partial_0$ and $\partial_1$ along the time and radial tetrad axes $\gamma_0$ and $\gamma_1$ are

$$ \partial_0 = e_0{}^\mu \frac{\partial}{\partial x^\mu} = \frac{1}{\alpha} \frac{\partial}{\partial t} + \beta_0 \frac{\partial}{\partial r} , \quad \partial_1 = e_1{}^\mu \frac{\partial}{\partial x^\mu} = \beta_1 \frac{\partial}{\partial r} . \tag{20.4} $$

The tetrad-frame 4-velocity $u^m$ of a person at rest in the tetrad frame is by definition $u^m = \{1, 0, 0, 0\}$. It
follows that the coordinate 4-velocity $u^\mu$ of such a person is

$$ u^\mu = e_m{}^\mu u^m = e_0{}^\mu = \{1/\alpha, \beta_0, 0, 0\} . \tag{20.5} $$

A person instantaneously at rest in the tetrad frame satisfies $dr/dt = \beta_0 \alpha$ according to equation (20.5), so it
follows from the line-element (20.1) that the proper time $\tau$ of a person at rest in the tetrad frame is related
to the coordinate time $t$ by

$$ d\tau = \alpha \, dt \quad \mbox{in tetrad rest frame} . \tag{20.6} $$

The directed time derivative $\partial_0$ is just the proper time derivative along the worldline of a person continuously
at rest in the tetrad frame (and who is therefore not in free-fall, but accelerating with the tetrad frame),
which follows from

$$ \frac{d}{d\tau} = \frac{dx^\mu}{d\tau} \frac{\partial}{\partial x^\mu} = u^\mu \frac{\partial}{\partial x^\mu} = u^m \partial_m = \partial_0 . \tag{20.7} $$

By contrast, the proper time derivative measured by a person who is instantaneously at rest in the tetrad
frame, but is in free-fall, is the covariant time derivative

$$ \frac{D}{D\tau} = \frac{dx^\mu}{d\tau} D_\mu = u^\mu D_\mu = u^m D_m = D_0 . \tag{20.8} $$


Since the coordinate radius $r$ has been defined to be the circumferential radius, a gauge-invariant definition,
it follows that the tetrad-frame gradient $\partial_m$ of the coordinate radius $r$ is a tetrad-frame 4-vector (a coordinate
gauge-invariant object),

$$ \partial_m r = e_m{}^\mu \frac{\partial r}{\partial x^\mu} = e_m{}^r = \beta_m = \{\beta_0, \beta_1, 0, 0\} \quad \mbox{a tetrad 4-vector} . \tag{20.9} $$

This accounts for the notation $\beta_0$ and $\beta_1$ introduced above. The component $\beta_0$ can be interpreted as the
radial velocity of the tetrad frame, equation (20.5),

$$ \beta_0 = \frac{dr}{d\tau} . \tag{20.10} $$

The component $\beta_1$ can be interpreted as the energy per unit mass of an object at rest in the tetrad frame,
equation (20.52).
Since $\beta_m$ is a tetrad 4-vector, its scalar product with itself must be a scalar. This scalar defines the interior
mass $M(t, r)$, also called the Misner-Sharp mass (Misner and Sharp, 1964), by

$$ 1 - \frac{2M}{r} \equiv \beta_m \beta^m = - \beta_0^2 + \beta_1^2 \quad \mbox{a coordinate and tetrad scalar} . \tag{20.11} $$

The interpretation of $M$ as the interior mass will become evident below, §20.9.
The horizon function $\Delta$ is defined by

$$ \Delta \equiv \beta_m \beta^m = 1 - \frac{2M}{r} . \tag{20.12} $$

Apparent horizons occur where the horizon function is zero, $\Delta = 0$, that is, where the 4-vector $\beta_m$ is null, a
gauge-invariant condition. The condition for an apparent horizon is

$$ r = 2M , \tag{20.13} $$


which holds in any spherically symmetric geometry, not just the Schwarzschild geometry. In general the 
interior mass M varies with radius r; only in the Schwarzschild geometry is the interior mass M constant. 
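
The gauge invariance of the interior mass can be illustrated by evaluating the definition (20.11) in two different frames on the same geometry. The sketch below is an assumption-laden example rather than part of the text: it takes a Schwarzschild black hole of mass 1, describes it once in the free-fall (Gullstrand-Painlevé) frame with $\beta_0 = -\sqrt{2M/r}$, $\beta_1 = 1$, and once in the rest frame with $\beta_0 = 0$, $\beta_1 = \sqrt{1 - 2M/r}$, and uses function names invented here.

```python
import math

def interior_mass(r, beta0, beta1):
    """Interior mass M from 1 - 2M/r = -beta0^2 + beta1^2."""
    return 0.5 * r * (1.0 + beta0 ** 2 - beta1 ** 2)

def horizon_function(beta0, beta1):
    """Horizon function Delta = beta_m beta^m."""
    return -beta0 ** 2 + beta1 ** 2

M_true, r = 1.0, 3.0  # sample mass and radius outside the horizon

# Free-fall frame: the tetrad falls at the Newtonian escape velocity.
free_fall = (-math.sqrt(2.0 * M_true / r), 1.0)
# Rest frame: the tetrad sits at fixed circumferential radius.
at_rest = (0.0, math.sqrt(1.0 - 2.0 * M_true / r))

print(interior_mass(r, *free_fall), interior_mass(r, *at_rest))
print(horizon_function(*free_fall), horizon_function(*at_rest))
```

Both frames return the same $M$ and the same $\Delta$, even though the individual components $\beta_0$ and $\beta_1$ differ, as expected for a scalar built from a tetrad 4-vector.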

Inside horizons, where the horizon function $\Delta$ is negative, the velocity $\beta_0$ cannot be zero: the tetrad must
move superluminally through the radial coordinate. Similarly, outside horizons, where the horizon function
$\Delta$ is positive, the energy per unit mass $\beta_1$ cannot be zero. Inside horizons, the energy per unit mass $\beta_1$ can
be either positive, in which case the tetrad frame is called ingoing, or negative, in which case the tetrad
frame is called outgoing. The tetrad can switch between ingoing and outgoing only inside horizons.


Exercise 20.1. Apparent horizon. Show that radial null geodesics in a spherical geometry satisfy

$$ \frac{dr}{dt} = \alpha (\beta_0 \pm \beta_1) . \tag{20.14} $$

An apparent horizon occurs where outgoing radial null geodesics are not moving radially, $dr/dt = 0$. Conclude
that an apparent horizon occurs where (choosing $\alpha$ and $\beta_1$ positive without loss of generality)


$$ \beta_0 = - \beta_1 . \tag{20.15} $$


20.3 Rest diagonal line-element


Although this is not the choice adopted here, the line-element (20.1) can always be brought to diagonal form
by a coordinate transformation $t \rightarrow t_\times$ (subscripted $\times$ for diagonal) of the time coordinate. The $t$-$r$ part of
the metric is

$$ g_{tt} \, dt^2 + 2 g_{tr} \, dt \, dr + g_{rr} \, dr^2 = \frac{1}{g_{tt}} \left[ (g_{tt} \, dt + g_{tr} \, dr)^2 + (g_{tt} g_{rr} - g_{tr}^2) \, dr^2 \right] . \tag{20.16} $$

This can be diagonalized by choosing the time coordinate $t_\times$ such that

$$ f \, dt_\times = g_{tt} \, dt + g_{tr} \, dr \tag{20.17} $$

for some integrating factor $f(t, r)$. Equation (20.17) can be solved by choosing $t_\times$ to be constant along
integral curves

$$ \frac{dr}{dt} = - \frac{g_{tt}}{g_{tr}} . \tag{20.18} $$

The resulting diagonal rest line-element is

$$ ds^2 = - \alpha_\times^2 \, dt_\times^2 + \frac{dr^2}{1 - 2M/r} + r^2 do^2 . \tag{20.19} $$

The line-element (20.19) corresponds physically to the case where the tetrad frame is taken to be at rest in
the spatial coordinates, $\beta_0 = 0$, as can be seen by comparing it to the earlier line-element (20.1). In changing
the tetrad frame from one moving at $dr/dt = \beta_0 \alpha$ to one that is at rest (at constant circumferential radius $r$),
a tetrad transformation has in effect been done at the same time as the coordinate transformation (20.17),
the tetrad transformation being precisely that needed to make the line-element (20.19) diagonal. The metric
coefficient $g_{rr}$ in the line-element (20.19) follows from the fact that $\beta_1^2 = 1 - 2M/r$ when $\beta_0 = 0$,
equation (20.11). The transformed time coordinate $t_\times$ is unspecified up to a transformation $t_\times \rightarrow f(t_\times)$. If the
spacetime is asymptotically flat at infinity, then a natural way to fix the transformation is to choose $t_\times$ to
be the proper time at rest at infinity.


20.4 Comoving diagonal line-element 


Although once again this is not the path followed here, the line-element (20.1) can also be brought to diagonal
form by a coordinate transformation $r \rightarrow r_\times$, where, analogously to equation (20.17), $r_\times$ is chosen to satisfy

$$ f \, dr_\times = g_{tr} \, dt + g_{rr} \, dr = \frac{1}{\beta_1^2} \left( dr - \beta_0 \alpha \, dt \right) \tag{20.20} $$

for some integrating factor $f(t, r)$. The new coordinate $r_\times$ is constant along the worldline of an object at
rest in the tetrad frame, with $dr/dt = \beta_0 \alpha$, equation (20.5), so $r_\times$ can be regarded as a comoving radial
coordinate. The comoving radial coordinate $r_\times$ could for example be chosen to equal the circumferential
radius $r$ at some fixed instant of coordinate time $t$ (say $t = 0$). The diagonal comoving line-element in
this comoving coordinate system takes the form

$$ ds^2 = - \alpha^2 dt^2 + \lambda^2 dr_\times^2 + r^2 do^2 , \tag{20.21} $$

where the circumferential radius $r(t, r_\times)$ is considered to be a function of time $t$ and the comoving radial
coordinate $r_\times$. Whereas in the rest line-element (20.19) the tetrad was changed from one that was moving
at $dr/dt = \beta_0 \alpha$ to one that was at rest, here the transformation keeps the tetrad unchanged. In both the
rest and comoving diagonal line-elements (20.19) and (20.21) the tetrad is at rest relative to the respective
radial coordinate $r$ or $r_\times$; but whereas in the rest line-element (20.19) the radial coordinate was fixed to be
the circumferential radius $r$, in the comoving line-element (20.21) the comoving radial coordinate $r_\times$ is a
label that follows the tetrad. Because the tetrad is unchanged by the transformation to the comoving radial
coordinate $r_\times$, the directed time and radial derivatives $\partial_0$ and $\partial_1$ are unchanged:

$$ \partial_0 = \frac{1}{\alpha} \left. \frac{\partial}{\partial t} \right|_{r_\times} = \frac{1}{\alpha} \left. \frac{\partial}{\partial t} \right|_r + \beta_0 \left. \frac{\partial}{\partial r} \right|_t , \quad \partial_1 = \frac{1}{\lambda} \left. \frac{\partial}{\partial r_\times} \right|_t = \beta_1 \left. \frac{\partial}{\partial r} \right|_t . \tag{20.22} $$

20.5 Tetrad connections 


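Now turn the handle to proceed towards the Einstein equations. The non-vanishing tetrad connection
coefficients $\Gamma_{kmn}$ corresponding to the spherical line-element (20.1) are

$$ \Gamma_{100} = h_0 , \tag{20.23a} $$
$$ \Gamma_{101} = h_1 , \tag{20.23b} $$
$$ \Gamma_{202} = \Gamma_{303} = \frac{\beta_0}{r} , \tag{20.23c} $$
$$ \Gamma_{212} = \Gamma_{313} = \frac{\beta_1}{r} , \tag{20.23d} $$
$$ \Gamma_{323} = \frac{\cot\theta}{r} , \tag{20.23e} $$

where $h_0$ is the proper radial acceleration (minus the gravitational force) experienced by a person at rest in
the tetrad frame

$$ h_0 \equiv \partial_1 \ln \alpha = \beta_1 \frac{\partial \ln \alpha}{\partial r} , \tag{20.24} $$

and $h_1$ is the “Hubble parameter” of the radial flow, as measured in the tetrad rest frame, defined by

$$ h_1 \equiv \beta_0 \frac{\partial \ln \alpha}{\partial r} + \frac{\partial \beta_0}{\partial r} - \partial_0 \ln \beta_1 . \tag{20.25} $$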

The interpretation of $h_0$ as a proper acceleration and $h_1$ as a radial Hubble parameter goes as follows. The
tetrad-frame 4-velocity $u^m$ of a person at rest in the tetrad frame is by definition $u^m = \{1, 0, 0, 0\}$. If the
person at rest were in free fall, then the proper acceleration would be zero, but because this is a general
spherical spacetime, the tetrad frame is not necessarily in free fall. The proper acceleration experienced by
a person continuously at rest in the tetrad frame is the proper time derivative $Du^m/D\tau$ of the 4-velocity,
which is

$$ \frac{Du^m}{D\tau} = D_0 u^m = \partial_0 u^m + \Gamma^m{}_{00} u^0 = \Gamma^m{}_{00} = \{0, \Gamma^1{}_{00}, 0, 0\} = \{0, h_0, 0, 0\} , \tag{20.26} $$

the first step of which follows from equation (20.8). Similarly, a person at rest in the tetrad frame will
measure the 4-velocity of an adjacent person at rest in the tetrad frame a small proper radial distance $\delta\xi^1$
away to differ by $\delta\xi^1 D_1 u^m$. The Hubble parameter of the radial flow is thus the covariant radial derivative
$D_1 u^m$, which is

$$ D_1 u^m = \partial_1 u^m + \Gamma^m{}_{01} u^0 = \Gamma^m{}_{01} = \{0, \Gamma^1{}_{01}, 0, 0\} = \{0, h_1, 0, 0\} . \tag{20.27} $$

Confined to the ($\gamma_0$–$\gamma_1$)-plane (that is, considering only Lorentz transformations in the ($t$–$r$)-plane, which
is to say radial Lorentz boosts), the acceleration $h_0$ and Hubble parameter $h_1$ constitute the components of
a tetrad-frame 2-vector $h_n = \{h_0, h_1\}$:

$$ h_n = \Gamma_{10n} . \tag{20.28} $$


The Riemann tensor, equations (20.30) below, involves covariant derivatives $D_m h_n$ of $h_n$. These should be
interpreted either as 4D covariant derivatives of the 4-vector $h_n = \{h_0, h_1, 0, 0\}$ with zero angular parts,
or equivalently as 2D covariant derivatives $D^{(2)}_m h_n$ confined to the ($\gamma_0$–$\gamma_1$)-plane. The contraction $h^n h_n =
- h_0^2 + h_1^2$ is a scalar with respect to radial Lorentz boosts.

Since $h_1$ is a kind of radial Hubble parameter, it can be useful to define a corresponding radial scale factor
$\lambda$ by

$$ h_1 \equiv \partial_0 \ln \lambda . \tag{20.29} $$

The scale factor $\lambda$ is the same as the $\lambda$ in the comoving line-element of equation (20.21). This is true because
$h_1$ is a tetrad connection and therefore coordinate gauge-invariant, and the line-element (20.21) is related
to the line-element (20.1) being considered by a coordinate transformation $r \rightarrow r_\times$ that leaves the tetrad
unchanged.


20.6 Riemann, Einstein, and Weyl tensors


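The non-vanishing components of the tetrad-frame Riemann tensor $R_{klmn}$ corresponding to the spherical
line-element (20.1) are

$$ R_{0101} = D_1 h_0 - D_0 h_1 , \tag{20.30a} $$
$$ R_{0202} = R_{0303} = - \frac{1}{r} D_0 \beta_0 , \tag{20.30b} $$
$$ R_{1212} = R_{1313} = - \frac{1}{r} D_1 \beta_1 , \tag{20.30c} $$
$$ R_{0212} = R_{0313} = - \frac{1}{r} D_0 \beta_1 = - \frac{1}{r} D_1 \beta_0 , \tag{20.30d} $$
$$ R_{2323} = \frac{2M}{r^3} , \tag{20.30e} $$

where $D_m$ denotes the covariant derivative as usual. The non-vanishing components of the tetrad-frame Ricci
tensor $R_{km}$ are

$$ R_{00} = R_{0101} + 2 R_{0202} , \tag{20.31a} $$
$$ R_{11} = - R_{0101} + 2 R_{1212} , \tag{20.31b} $$
$$ R_{01} = 2 R_{0212} , \tag{20.31c} $$
$$ R_{22} = R_{33} = - R_{0202} + R_{1212} + R_{2323} , \tag{20.31d} $$

whence

$$ R_{00} = D_1 h_0 - D_0 h_1 - \frac{2}{r} D_0 \beta_0 , \tag{20.32a} $$
$$ R_{11} = - D_1 h_0 + D_0 h_1 - \frac{2}{r} D_1 \beta_1 , \tag{20.32b} $$
$$ R_{01} = - \frac{2}{r} D_0 \beta_1 = - \frac{2}{r} D_1 \beta_0 , \tag{20.32c} $$
$$ R_{22} = R_{33} = \frac{1}{r} D_0 \beta_0 - \frac{1}{r} D_1 \beta_1 + \frac{2M}{r^3} . \tag{20.32d} $$

The Ricci scalar is

$$ R = - 2 D_1 h_0 + 2 D_0 h_1 + \frac{4}{r} D_0 \beta_0 - \frac{4}{r} D_1 \beta_1 + \frac{4M}{r^3} . \tag{20.33} $$

The non-vanishing components of the tetrad-frame Einstein tensor $G^{km}$ are

$$ G^{00} = 2 R_{1212} + R_{2323} , \tag{20.34a} $$
$$ G^{11} = 2 R_{0202} - R_{2323} , \tag{20.34b} $$
$$ G^{01} = - 2 R_{0212} , \tag{20.34c} $$
$$ G^{22} = G^{33} = R_{0101} + R_{0202} - R_{1212} , \tag{20.34d} $$

whence

$$ G^{00} = \frac{2}{r} \left( - D_1 \beta_1 + \frac{M}{r^2} \right) , \tag{20.35a} $$
$$ G^{11} = \frac{2}{r} \left( - D_0 \beta_0 - \frac{M}{r^2} \right) , \tag{20.35b} $$
$$ G^{01} = \frac{2}{r} D_0 \beta_1 = \frac{2}{r} D_1 \beta_0 , \tag{20.35c} $$
$$ G^{22} = G^{33} = D_1 h_0 - D_0 h_1 + \frac{1}{r} \left( D_1 \beta_1 - D_0 \beta_0 \right) . \tag{20.35d} $$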

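The non-vanishing components of the tetrad-frame Weyl tensor $C_{klmn}$ are

$$ \tfrac{1}{2} C_{0101} = - C_{0202} = - C_{0303} = C_{1212} = C_{1313} = - \tfrac{1}{2} C_{2323} = C , \tag{20.36} $$

where $C$ is the Weyl scalar (the spin 0 component of the Weyl tensor),

$$ C \equiv \frac{1}{6} \left( R_{0101} - R_{0202} + R_{1212} - R_{2323} \right) = \frac{1}{6} \left( G^{00} - G^{11} + G^{22} \right) - \frac{M}{r^3} . \tag{20.37} $$
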
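
The chain from Riemann to Ricci to Einstein to Weyl in this section is pure index algebra, so it can be spot-checked with arbitrary numbers. In the sketch below the five independent Riemann components are drawn at random (they are not derived from any particular spacetime), the Einstein tensor is formed from $G_{km} = R_{km} - \tfrac{1}{2} \eta_{km} R$ with indices raised by the Minkowski tetrad metric, and the function name is invented for the illustration.

```python
import random

def curvature(R0101, R0202, R1212, R0212, R2323):
    """Ricci tensor, Ricci scalar, Einstein tensor and Weyl scalar from Riemann."""
    ricci = {"00": R0101 + 2 * R0202,
             "11": -R0101 + 2 * R1212,
             "01": 2 * R0212,
             "22": -R0202 + R1212 + R2323}
    # R = eta^{km} R_km, with R_22 = R_33.
    scalar = -ricci["00"] + ricci["11"] + 2 * ricci["22"]
    # G^{km}: raising a single time index flips the sign of the 01 component.
    einstein = {"00": ricci["00"] + 0.5 * scalar,
                "11": ricci["11"] - 0.5 * scalar,
                "01": -ricci["01"],
                "22": ricci["22"] - 0.5 * scalar}
    weyl = (R0101 - R0202 + R1212 - R2323) / 6.0
    return ricci, scalar, einstein, weyl

random.seed(1)
R0101, R0202, R1212, R0212, R2323 = (random.uniform(-1.0, 1.0) for _ in range(5))
ricci, scalar, G, C = curvature(R0101, R0202, R1212, R0212, R2323)

M_over_r3 = 0.5 * R2323  # since R_2323 = 2M / r^3
print(G["00"] - (2 * R1212 + R2323))
print(C - ((G["00"] - G["11"] + G["22"]) / 6.0 - M_over_r3))
```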
20.7 Einstein equations 
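The tetrad-frame Einstein equations

$$ G^{km} = 8\pi T^{km} \tag{20.38} $$

imply that

$$ \begin{pmatrix} G^{00} & G^{01} & 0 & 0 \\ G^{01} & G^{11} & 0 & 0 \\ 0 & 0 & G^{22} & 0 \\ 0 & 0 & 0 & G^{33} \end{pmatrix} = 8\pi T^{km} = 8\pi \begin{pmatrix} \rho & f & 0 & 0 \\ f & p & 0 & 0 \\ 0 & 0 & p_\perp & 0 \\ 0 & 0 & 0 & p_\perp \end{pmatrix} , \tag{20.39} $$

where $\rho \equiv T^{00}$ is the proper energy density, $f \equiv T^{01}$ is the proper radial energy flux, $p \equiv T^{11}$ is the proper
radial pressure, and $p_\perp \equiv T^{22} = T^{33}$ is the proper transverse pressure. Proper here means as measured by a
person at rest in the tetrad frame.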


20.8 Choose your frame 


So far the radial motion of the tetrad frame has been left unspecified. Any arbitrary choice can be made.
For example, the tetrad frame could be chosen to be at rest,

$$ \beta_0 = 0 , \tag{20.40} $$

as in the Schwarzschild or Reissner-Nordström line-elements. Alternatively, the tetrad frame could be chosen
to be in free-fall,

$$ h_0 = 0 , \tag{20.41} $$

as in the Gullstrand-Painlevé line-element. For situations where the spacetime contains matter, one natural
choice is the centre-of-mass frame, defined to be the frame in which the energy flux $f$ is zero

$$ G^{01} = 8\pi f = 0 . \tag{20.42} $$

Whatever the choice of radial tetrad frame, tetrad-frame quantities in different radial tetrad frames are
related to each other by a radial Lorentz boost.


20.9 Interior mass 


Equations (20.35b) with the middle expression of (20.35c), and (20.35a) with the final expression of (20.35c),
respectively, along with the definition (20.11) of the interior mass $M$, and the Einstein equations (20.39),
imply (note that $D_m M = \partial_m M$ since $M$ is a scalar)

$$ p = \frac{1}{\beta_0} \left( - \frac{1}{4\pi r^2} \partial_0 M - \beta_1 f \right) , \tag{20.43a} $$
$$ \rho = \frac{1}{\beta_1} \left( \frac{1}{4\pi r^2} \partial_1 M - \beta_0 f \right) . \tag{20.43b} $$

In the centre-of-mass frame, $f = 0$, these equations reduce to

$$ \partial_0 M = - 4\pi r^2 \beta_0 \, p , \tag{20.44a} $$
$$ \partial_1 M = 4\pi r^2 \beta_1 \, \rho . \tag{20.44b} $$

Equations (20.44) amply justify the interpretation of $M$ as the interior mass. The first equation (20.44a) can
be written

$$ \frac{dM}{dr} = - 4\pi r^2 p , \tag{20.45} $$

where $dM/dr \equiv \partial_0 M / \partial_0 r$ is the total derivative of the mass $M$ with respect to radius $r$ along the path of
the matter, in the centre-of-mass frame. Equation (20.45) can be recognized as an expression of the first law
of thermodynamics,

$$ dE + p \, dV = 0 , \tag{20.46} $$

with mass-energy $E$ equal to $M$ and volume $V$ equal to $\tfrac{4}{3} \pi r^3$. The second equation (20.44b) can be written,
since $\partial_1 = \beta_1 \, \partial/\partial r$, equation (20.4),

$$ \frac{\partial M}{\partial r} = 4\pi r^2 \rho , \tag{20.47} $$

which looks exactly like the Newtonian relation between interior mass $M$ and density $\rho$. Equation (20.47) is
the Hamiltonian constraint for spherically symmetric spacetimes.

Actually, the apparently Newtonian equation (20.47) is deceiving. The total mass-energy $dM$ in a radial
shell should be distinguished from the proper mass-energy $dm$ of the shell in its own frame. The proper
3-volume element $d^3r$ in the centre-of-mass tetrad frame is given by¹, equation (15.86),

$$ d^3r = e \, d^3x^{r\theta\phi} = \frac{r^2 \sin\theta \, dr \, d\theta \, d\phi}{\beta_1} , \tag{20.48} $$

where $e \equiv |e^a{}_\alpha|$ is the determinant of the $3 \times 3$ spatial vierbein matrix. Thus the proper 3-volume element
$dV \equiv d^3r$ of a radial shell of width $dr$ is

$$ dV = \frac{4\pi r^2 dr}{\beta_1} . \tag{20.49} $$

Consequently the proper mass-energy $dm$ associated with the proper density $\rho$ in a proper radial volume
element $dV$ is

$$ dm \equiv \rho \, dV = \frac{4\pi r^2 \rho \, dr}{\beta_1} , \tag{20.50} $$

whereas the total mass-energy $dM$ from equation (20.47) is

$$ dM = \rho \, 4\pi r^2 dr = \beta_1 \rho \, dV . \tag{20.51} $$

The factor $\beta_1$ can be interpreted as the energy per unit mass of the matter,

$$ \beta_1 = \frac{dM}{dm} . \tag{20.52} $$

The difference between the total and proper mass-energy

$$ dM - dm = (\beta_1 - 1) \rho \, dV \tag{20.53} $$

can be interpreted as a combination of the kinetic and gravitational energy of the matter.
20.10 Energy-momentum conservation 
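Covariant conservation of the Einstein tensor $D_m G^{mn} = 0$ implies conservation of energy-momentum
$D_m T^{mn} = 0$. The transverse components, n = 2, 3, of the conservation equations vanish identically. The
remaining two non-trivial equations represent conservation of energy and of radial momentum, and are

$$ D_m T^{m0} = \partial_0 \rho + \frac{2\beta_0}{r} (\rho + p_\perp) + h_1 (\rho + p) + \left( \partial_1 + \frac{2\beta_1}{r} + 2 h_0 \right) f = 0 , \tag{20.54a} $$
$$ D_m T^{m1} = \partial_1 p + \frac{2\beta_1}{r} (p - p_\perp) + h_0 (\rho + p) + \left( \partial_0 + \frac{2\beta_0}{r} + 2 h_1 \right) f = 0 . \tag{20.54b} $$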


1 The same conclusion follows from considering the spherical line-element (20.1). In the tetrad frame, by construction
$dr - \alpha \beta_0 \, dt = 0$, and the proper time satisfies $d\tau = \alpha \, dt$. At constant proper time, the proper radial distance is $dr/\beta_1$, from
the line-element (20.1).
In the centre-of-mass frame, f = 0, these energy-momentum conservation equations reduce to


$$ \partial_0 \rho + \frac{2\beta_0}{r} (\rho + p_\perp) + h_1 (\rho + p) = 0 , \tag{20.55a} $$
$$ \partial_1 p + \frac{2\beta_1}{r} (p - p_\perp) + h_0 (\rho + p) = 0 . \tag{20.55b} $$


In a general situation where the mass-energy is a sum over several individual components x,

$$ T^{mn} = \sum_{\text{species } x} T_x^{mn} , \tag{20.56} $$

the individual mass-energy components x of the system each satisfy an energy-momentum conservation
equation of the form

$$ D_m T_x^{mn} = F_x^n , \tag{20.57} $$

where $F_x^n$ is the flux of energy into component x. Einstein’s equations enforce energy-momentum conservation
of the system as a whole, so the sum of the energy fluxes must be zero

$$ \sum_{\text{species } x} F_x^n = 0 . \tag{20.58} $$


20.10.1 First law of thermodynamics 


For an individual species x, the energy conservation equation (20.54a) in the centre-of-mass frame of the
species, $f_x = 0$, can be written

$$ D_m T_x^{m0} = \partial_0 \rho_x + (\rho_x + p_{\perp x}) \, \partial_0 \ln r^2 + (\rho_x + p_x) \, \partial_0 \ln \lambda_x = F_x^0 , \tag{20.59} $$

where $\lambda_x$ is the radial “scale factor,” equation (20.29), in the centre-of-mass frame of the species (the scale
factor is different in different frames). Equation (20.59) can be recognized as an expression of the first law
of thermodynamics for a volume element V of species x, in the form

$$ V^{-1} \partial_0 (\rho_x V) + p_{\perp x} V_\perp^{-1} \partial_0 V_\perp + p_x V_r^{-1} \partial_0 V_r = F_x^0 , \tag{20.60} $$
with transverse volume (area) $V_\perp \propto r^2$, radial volume (width) $V_r \propto \lambda_x$, and total volume $V \propto V_\perp V_r$. The
flux $F_x^0$ on the right hand side is the heat per unit volume per unit time going into species x. If the pressure
of species x is isotropic, $p_{\perp x} = p_x$, then equation (20.60) simplifies to

$$ V^{-1} \left[ \partial_0 (\rho_x V) + p_x \, \partial_0 V \right] = F_x^0 , \tag{20.61} $$

with volume $V \propto r^2 \lambda_x$.
20.11 Structure of the Einstein equations


The spherically symmetric spacetime under consideration is described by 3 vierbein coefficients, $\alpha$, $\beta_0$, and $\beta_1$.
However, some combination of the 3 coefficients represents a gauge freedom, since the spherically symmetric 
spacetime has only two physical degrees of freedom. As commented in §20.8, various gauge-fixing choices can 
be made, such as choosing to work in the centre-of-mass frame, f = 0. 

Equations (20.35) give 4 equations for the 4 non-vanishing components of the Einstein tensor. The two 
expressions for $G^{01}$ are identical when expressed in terms of the vierbein and vierbein derivatives, so are not
distinct equations. Conservation of energy-momentum of the system as a whole is built in to the Einstein 
equations, a consequence of the Bianchi identities, so 2 of the Einstein equations are effectively equivalent 
to the energy-momentum conservation equations (20.54). If the matter equations are arranged to satisfy 
energy-momentum conservation, as they should, then 2 of the Einstein equations are redundant, and can be 
dropped. 

This leaves 2 independent Einstein equations to describe the 2 physical degrees of freedom of the spacetime.
The 2 equations may be taken to be the evolution equations (20.35c) and (20.35b) for the velocity $\beta_0$ and
energy per unit mass $\beta_1$,

$$ D_0 \beta_0 = - \frac{M}{r^2} - 4\pi r p , \tag{20.62a} $$
$$ D_0 \beta_1 = 4\pi r f , \tag{20.62b} $$

which are valid for any choice of tetrad frame, not just the centre-of-mass frame. The covariant derivatives
on the left hand side of equations (20.62) are more explicitly

$$ D_0 \beta_0 = \partial_0 \beta_0 - h_0 \beta_1 , \quad D_0 \beta_1 = \partial_0 \beta_1 - h_0 \beta_0 , \tag{20.63} $$

where $h_0$ is the proper radial acceleration, equation (20.24).
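
As a consistency check, sketched here rather than taken from the text, the evolution equations (20.62) with the explicit covariant derivatives (20.63) reproduce the time component (20.43a) of the interior mass equations. The sketch assumes the definition (20.11) of the interior mass in the form $1 - 2M/r = -\beta_0^2 + \beta_1^2$, which is the form implied by equations (20.72) and (20.80), and uses $\partial_0 r = \beta_0$.

```latex
\begin{align*}
M &= \tfrac{1}{2} r \left( 1 + \beta_0^2 - \beta_1^2 \right) , \\
\partial_0 M
  &= \frac{M}{r} \, \partial_0 r
     + r \left( \beta_0 \, \partial_0 \beta_0 - \beta_1 \, \partial_0 \beta_1 \right)
   = \frac{M}{r} \beta_0
     + r \left( \beta_0 D_0 \beta_0 - \beta_1 D_0 \beta_1 \right) \\
  &= \frac{M}{r} \beta_0
     + r \beta_0 \left( - \frac{M}{r^2} - 4\pi r p \right)
     - r \beta_1 \left( 4\pi r f \right)
   = - 4\pi r^2 \left( \beta_0 p + \beta_1 f \right) .
\end{align*}
```

The $h_0$ terms cancel between the two covariant derivatives in the second line. Setting f = 0 recovers equation (20.44a).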

Equations (20.62) can be taken to be the fundamental equations governing the gravitational field in spherically
symmetric spacetimes. It is these equations that are responsible (to the extent that equations may
be considered responsible) for the strange internal structure of Reissner-Nordström black holes, and for
mass inflation. The coefficient $\beta_0$ equals the coordinate radial 4-velocity $dr/d\tau = \partial_0 r = \beta_0$ of the tetrad
frame, equation (20.5), and thus equation (20.62a) can be regarded as giving the proper radial acceleration
$D^2 r / D\tau^2 = D\beta_0 / D\tau = D_0 \beta_0$ of the tetrad frame as measured by a person who is in free-fall and instantaneously
at rest in the tetrad frame. If the acceleration is measured by an observer who is continuously at rest
in the tetrad frame (as opposed to being in free-fall), then the proper acceleration is $\partial_0 \beta_0 = D_0 \beta_0 + h_0 \beta_1$.
The presence of the extra term $h_0 \beta_1$, proportional to the proper acceleration $h_0$ actually experienced by
the observer continuously at rest in the tetrad frame, reflects the principle of equivalence of gravity and
acceleration.

The right hand side of equation (20.62a) can be interpreted as the radial gravitational force, which consists
of two terms. The first term, $-M/r^2$, looks like the familiar Newtonian gravitational force, which is attractive
(negative, inward) in the usual case of positive mass M. The second term, $-4\pi r p$, proportional to the radial
pressure p, is what makes spherical spacetimes in general relativity interesting. In a Reissner-Nordström black
hole, the negative radial pressure produced by the radial electric field produces a radial gravitational repulsion
(positive, outward), according to equation (20.62a), and this repulsion dominates the gravitational force at
small radii, producing an inner horizon. In mass inflation, the (positive) radial pressure of relativistically
counter-streaming outgoing and ingoing streams just above the inner horizon dominates the gravitational
force (inward), and it is this that drives mass inflation.

Like the second half of a vaudeville act, the second Einstein equation (20.62b) also plays an indispensable
role. The energy per unit mass $\beta_1 = \partial_1 r$ on the left hand side is the proper radial gradient of the circumferential
radius r measured by a person at rest in the tetrad frame. The sign of $\beta_1$ determines which way an
observer at rest in the tetrad frame thinks is “outwards,” the direction of larger circumferential radius r. A
positive $\beta_1$ means that the observer thinks the outward direction points away from the black hole, while a
negative $\beta_1$ means that the observer thinks the outward direction points towards the black hole. Outside
the outer horizon $\beta_1$ is necessarily positive, because $\beta_m$ must be spacelike there. But inside the horizon $\beta_1$
may be either positive or negative. A tetrad frame can be defined as “ingoing” if the proper radial gradient
$\beta_1$ is positive, and “outgoing” if $\beta_1$ is negative. In the Reissner-Nordström geometry, ingoing geodesics have
positive energy, and outgoing geodesics have negative energy. However, the definition of outgoing or ingoing
based on the sign of $\beta_1$ is general: there is no need for a timelike Killing vector such as would be necessary
to define the (conserved) energy of a geodesic.

Equation (20.62b) shows that the proper rate of change $D_0 \beta_1$ in the radial gradient $\beta_1$ measured by an
observer who is in free-fall and instantaneously at rest in the tetrad frame is proportional to the radial energy
flux f in that frame. But ingoing observers ($\beta_1$ positive) tend to see energy flux pointing away from the black
hole (f positive), while outgoing observers ($\beta_1$ negative) tend to see energy flux pointing towards the black
hole (f negative). Thus the change in $\beta_1$ tends to be in the same direction as $\beta_1$, amplifying $\beta_1$ whatever its
sign.


Exercise 20.2. Birkhoff’s theorem. Prove Birkhoff’s theorem from equations (20.62). Birkhoff’s theorem 
states that any spherically symmetric spacetime that is devoid of energy-momentum between some inner and 
outer radii is Schwarzschild between those radii. 


Concept question 20.3. Naked singularities in spherical spacetimes? A singularity forms at zero 
radius, r = 0, when an apparent horizon develops there, that is, when space starts falling into r = 0 at 
the speed of light. Can geodesics emerge from such a singularity? A singularity from which geodesics can 
emerge is called a naked singularity. Answer. The surprising answer is yes, naked singularities can occur 
in spherical spacetimes. To see that this conclusion is surprising, consider the following “proof” that naked 
singularities do not exist. The proof relies on the assumption that the interior mass M and radial pressure
p are both positive, or more precisely, that $M/r^2 + 4\pi r p$ is positive; this is certainly a reasonable physical
assumption for real black holes. As seen in Exercise 20.1, outgoing and ingoing radial null geodesics in
a spherical spacetime follow $dr/dt = \alpha (\beta_0 \pm \beta_1)$, equation (20.14). An apparent horizon forms when the
outgoing null geodesic ceases to move outward, $\beta_0 + \beta_1 = 0$. The outgoing and ingoing null geodesics bound
the future lightcone emerging from the apparent horizon: all radial geodesics, timelike or lightlike, must lie
inside or on the lightcone, so that $dr/dt \leq 0$ for all radial geodesics at an apparent horizon, with $dr/dt = 0$ for
the outgoing radial null geodesic. But the Einstein equation (20.62a), which is valid in any frame arbitrarily
Lorentz-boosted in the radial direction, shows that $\beta_0$, which equals $dr/d\tau$ in that Lorentz-boosted frame,
must decrease along any geodesic, as long as $M/r^2 + 4\pi r p$ is positive. Thus once $dr/d\tau$ is zero or negative
along a geodesic, it cannot become positive. In particular, this holds true at zero radius, r = 0: as long as
$M/r^2 + 4\pi r p$ is positive, once $dr/d\tau$ is negative at r = 0, indicating the appearance of a singularity, then
$dr/d\tau$ cannot become positive, and therefore no light ray can emerge from the singularity.

The foregoing “proof” that naked singularities cannot exist in spherical spacetimes is flawed because the
infall velocity $\beta_0$ can be multi-valued at the point at zero radius where a singularity first forms. Section 20.16
gives an explicit example for the case of spherically symmetric collapse of pressureless dust.


20.11.1 Comment on the lapse α


Whereas the Einstein equations (20.62) give evolution equations for the vierbein coefficients $\beta_0$ and $\beta_1$,
there is no evolution equation for the vierbein coefficient $\alpha$, the lapse. Indeed, the Einstein equations involve
the lapse $\alpha$ only through the connections $h_m$, equations (20.23a) and (20.23b), and thus only as the radial
derivative $\partial \ln \alpha / \partial r$, equations (20.24) and (20.25). This reflects the fact that, even after the tetrad frame
is fixed, there is still a coordinate freedom $t \to t'(t)$ in the choice of coordinate time t. Under such a gauge
transformation, $\alpha$ transforms as $\alpha \to \alpha' = f(t) \, \alpha$ where $f(t) = \partial t / \partial t'$ is an arbitrary function of coordinate
time t. Only the radial derivative $\partial \ln \alpha / \partial r$ is independent of this coordinate gauge freedom, and thus the
tetrad-frame Einstein equations depend, through $h_m$, only on this radial derivative, not on $\alpha$ itself.

These results are consistent with the arguments in §16.15.1 and §17.2.3 that the lapse $\alpha$ can be treated as
a gauge variable, arbitrarily adjustable by a coordinate transformation of the time coordinate.

A possible gauge choice is to set $\alpha = 1$ everywhere. According to equation (20.24), this choice requires
that the proper acceleration in the tetrad frame vanish, $h_0 = 0$, that is, the tetrad frame is everywhere in
free fall, as for example in the Gullstrand-Painlevé line-element. I like to think of a free-fall frame as being
realised physically by tracer “dark matter” particles that free-fall radially (from zero velocity, typically) at
infinity, and stream freely, without interacting, through any actual matter that may be present.


20.12 Comparison to ADM (3+1) formulation 
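The line-element (20.1) is in ADM form with lapse $\alpha$, shift $\alpha \beta_0$, and spatial metric
$$ g_{\alpha\beta} = \mathrm{diag}(1/\beta_1^2 , \, r^2 , \, r^2 \sin^2\theta) . \tag{20.64} $$
The non-vanishing components of the acceleration $K_a \equiv \Gamma_{a00}$ and of the extrinsic curvature $K_{ab} \equiv \Gamma_{a0b}$ are
$$ K_1 = h_0 , \tag{20.65a} $$
$$ K_{11} = h_1 , \quad K_{22} = K_{33} = \frac{\beta_0}{r} . \tag{20.65b} $$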
20.13 Spherical electromagnetic field


The internal structure of a charged black hole resembles that of a rotating black hole because the negative 
pressure (tension) of the radial electric field produces a gravitational repulsion analogous to the centrifugal 
repulsion in a rotating black hole. Since it is much easier to deal with spherical than rotating black holes, it 
is common to use charge as a surrogate for rotation in exploring black holes. 


20.13.1 Electromagnetic field 


The assumption of spherical symmetry means that any electromagnetic field can consist only of a radial electric
field (in the absence of magnetic monopoles). The only non-vanishing components of the electromagnetic
field $F_{mn}$ are then


=i = Fo = E = 7 , (20.66) 
r 


where E is the radial electric field, and Q(t, r) is the interior electric charge. Equation (20.66) can be regarded
as defining what is meant by the electric charge Q interior to radius r at time t. 


20.13.2 Maxwell’s equations 


A radial electric field automatically satisfies the two source-free Maxwell equations. For the radial electric 
field (20.66), the other two Maxwell’s equations, the sourced ones (16.34), are 


$$ \partial_1 Q = 4\pi r^2 q , \tag{20.67a} $$
$$ \partial_0 Q = - 4\pi r^2 j , \tag{20.67b} $$


where $q \equiv j^0$ is the proper electric charge density and $j \equiv j^1$ is the proper radial electric current density in
the tetrad frame.


20.13.3 Electromagnetic energy-momentum tensor 


For the radial electric field (20.66), the electromagnetic energy-momentum tensor (16.150) in the tetrad 
frame is the diagonal tensor 


$$ T_e^{mn} = \frac{Q^2}{8\pi r^4} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{20.68} $$

The radial electric energy-momentum tensor is independent of the radial motion of the tetrad frame, which
reflects the fact that the electric field is invariant under a radial Lorentz boost. The energy density $\rho_e$ and
radial and transverse pressures $p_e$ and $p_{\perp e}$ of the electromagnetic field are the same as those from a spherical
charge distribution with interior electric charge Q in flat space

$$ \rho_e = - p_e = p_{\perp e} = \frac{Q^2}{8\pi r^4} = \frac{E^2}{8\pi} . \tag{20.69} $$

The non-vanishing components of the covariant derivative $D_m T_e^{mn}$ of the electromagnetic energy-momentum
(20.68) are

$$ D_m T_e^{m0} = \partial_0 \rho_e + \frac{4\beta_0}{r} \rho_e = \frac{Q}{4\pi r^4} \, \partial_0 Q = - j E , \tag{20.70a} $$
$$ D_m T_e^{m1} = \partial_1 p_e + \frac{4\beta_1}{r} p_e = - \frac{Q}{4\pi r^4} \, \partial_1 Q = - q E . \tag{20.70b} $$



The first expression (20.70a), which gives the rate of energy transfer out of the electromagnetic field as the 
current density j times the electric field E, is the same as in flat space. The second expression (20.70b), 
which gives the rate of transfer of radial momentum out of the electromagnetic field as the charge density q 
times the electric field E, is the Lorentz force on a charge density q, and again is the same as in flat space. 
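
The first of these expressions can be checked directly. The following is a sketch under stated assumptions: it takes the energy equation (20.54a) in the form shown on the first line, with zero electromagnetic energy flux, the electromagnetic densities $\rho_e = -p_e = p_{\perp e} = Q^2/(8\pi r^4)$ of equation (20.69), and $\partial_0 r = \beta_0$.

```latex
\begin{align*}
D_m T_e^{m0}
  &= \partial_0 \rho_e + \frac{2\beta_0}{r} ( \rho_e + p_{\perp e} ) + h_1 ( \rho_e + p_e )
   = \partial_0 \rho_e + \frac{4\beta_0}{r} \rho_e , \\
\partial_0 \rho_e
  &= \partial_0 \! \left( \frac{Q^2}{8\pi r^4} \right)
   = \frac{Q}{4\pi r^4} \, \partial_0 Q - \frac{4\beta_0}{r} \rho_e , \\
D_m T_e^{m0}
  &= \frac{Q}{4\pi r^4} \, \partial_0 Q
   = \frac{Q}{4\pi r^4} \left( - 4\pi r^2 j \right)
   = - j \, \frac{Q}{r^2}
   = - j E .
\end{align*}
```

The last line uses the Maxwell equation (20.67b) and $E = Q/r^2$ from equation (20.66). The radial momentum equation works the same way, with $\partial_1 r = \beta_1$ and equation (20.67a).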


20.14 General relativistic stellar structure 


Even with the assumption of spherical symmetry, it is by no means easy to solve the system of partial 
differential equations that comprise the Einstein equations coupled to mass-energy of various kinds. However, 
the system simplifies in some cases. 

One simple case is that of a system that is not only spherically symmetric but also static, such as a
star. In this case all time derivatives can be taken to vanish, $\partial / \partial t = 0$, and, since the centre-of-mass frame
coincides with the rest frame, it is natural to choose the tetrad frame to be at rest, $\beta_0 = 0$. The Einstein
equation (20.62b) then vanishes identically, while the Einstein equation (20.62a) becomes

$$ h_0 \beta_1 = \frac{M}{r^2} + 4\pi r p , \tag{20.71} $$

which expresses the proper acceleration $h_0$ in the rest frame in terms of the familiar Newtonian gravitational
force $M/r^2$ plus a term $4\pi r p$ proportional to the radial pressure. The radial pressure p, if positive as is the
usual case for a star, enhances the inward gravitational force, helping to destabilize the star. Because $\beta_0$ is
zero, the interior mass M given by equation (20.11) reduces to

$$ 1 - 2M/r = \beta_1^2 . \tag{20.72} $$


When equations (20.71) and (20.72) are substituted into the momentum equation (20.55b), and if the pressure
is taken to be isotropic, so $p_\perp = p$, the result is the Oppenheimer-Volkov equation for general relativistic
hydrostatic equilibrium

$$ \frac{\partial p}{\partial r} = - \frac{(\rho + p)(M + 4\pi r^3 p)}{r^2 (1 - 2M/r)} . \tag{20.73} $$

In the Newtonian limit $p \ll \rho$ and $M \ll r$ this goes over to (with units restored)

$$ \frac{\partial p}{\partial r} = - \rho \, \frac{G M}{r^2} , \tag{20.74} $$

which is the usual Newtonian equation of spherically symmetric hydrostatic equilibrium.


Exercise 20.4. Constant density star. Shortly after communicating to Einstein his celebrated solution, 
Schwarzschild (1916) sent Einstein a second letter describing the solution for a constant density star. By 
adjoining the interior solution to his exterior solution, Schwarzschild had a consistent solution with no 
troubling “singularity” at its horizon. 
In a spherically symmetric static spacetime, Einstein’s equations reduce to an equation for the mass M
interior to r
$$ \frac{\partial M}{\partial r} = 4\pi r^2 \rho , \tag{20.75} $$
and to the Volkov-Oppenheimer equation of hydrostatic equilibrium (20.73).
1. Interior mass. Suppose that the density $\rho$ is constant. From equation (20.75) obtain an expression for
the interior mass M as a function of radius r and the density $\rho$. [Hint: This is easy.]
2. Hydrostatic equilibrium. Given your expression for M, show that the Volkov-Oppenheimer equation
(20.73) rearranges to

$$ \int_{p_c}^{p} \frac{- \, dp}{(\rho + p)(\rho + 3p)} = \int_0^r \frac{4\pi r \, dr}{3 - 8\pi r^2 \rho} , \tag{20.76} $$

where $p_c$ is the central pressure, where the radius is zero, r = 0.
3. Solve. Integrate equation (20.76). From the integral evaluated at the edge of the star, where the pressure
is zero, p = 0, and the radius is the stellar radius, $r = R_\star$, argue that

$$ \frac{\rho + 3 p_c}{\rho + p_c} = \left( \frac{1}{1 - 2 M_\star / R_\star} \right)^{1/2} , \tag{20.77} $$

where $M_\star \equiv \tfrac{4}{3} \pi \rho R_\star^3$ is the total mass of the star.
4. Limits. From the condition that the central pressure be positive and finite, $0 < p_c < \infty$, deduce that
$$ 0 < \frac{2 M_\star}{R_\star} < \frac{8}{9} . \tag{20.78} $$

5. Comment. Comment on what equation (20.78) implies physically. [Hint: What is the Schwarzschild
radius?]
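
The constant-density results quoted in Exercise 20.4 can be checked numerically. The sketch below is an illustration, not part of the exercise: it integrates the hydrostatic equation (20.73) outward from the centre for constant density, using the regular form $dp/dr = -(\rho + p)(\rho + 3p)\, 4\pi r / (3 - 8\pi \rho r^2)$ obtained by inserting $M = \tfrac{4}{3}\pi \rho r^3$, and compares the resulting compactness $2M_\star/R_\star$ with the closed form implied by equation (20.77). The function names, the fixed-step Runge-Kutta integrator, and the step size are all choices made for this sketch, in units c = G = 1.

```python
import math

def dp_dr(r, p, rho):
    """Hydrostatic equation (20.73) for constant density rho, with
    M = (4/3) pi rho r^3 inserted so the expression is regular at r = 0."""
    return (-(rho + p) * (rho + 3.0 * p) * 4.0 * math.pi * r
            / (3.0 - 8.0 * math.pi * rho * r * r))

def stellar_radius(p_c, rho=1.0, h=1.0e-4):
    """Integrate outward from the centre with fixed-step RK4 until p reaches zero."""
    r, p = 0.0, p_c
    while True:
        k1 = dp_dr(r, p, rho)
        k2 = dp_dr(r + 0.5 * h, p + 0.5 * h * k1, rho)
        k3 = dp_dr(r + 0.5 * h, p + 0.5 * h * k2, rho)
        k4 = dp_dr(r + h, p + h * k3, rho)
        p_next = p + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if p_next <= 0.0:
            # Linear interpolation for the radius where the pressure crosses zero.
            return r + h * p / (p - p_next)
        r, p = r + h, p_next

def compactness_numeric(p_c, rho=1.0):
    """2 M / R from the numerically integrated stellar radius."""
    radius = stellar_radius(p_c, rho)
    return (8.0 / 3.0) * math.pi * rho * radius * radius

def compactness_analytic(p_c, rho=1.0):
    """2 M / R from equation (20.77) solved for the compactness."""
    return 1.0 - ((rho + p_c) / (rho + 3.0 * p_c)) ** 2

if __name__ == "__main__":
    for p_c in (0.1, 1.0):
        print(p_c, compactness_numeric(p_c), compactness_analytic(p_c))
```

As the central pressure grows without bound the analytic compactness approaches, but never reaches, the limit 8/9 of equation (20.78).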


20.15 Freely-falling dust without shell-crossing 


Another case where the spherically symmetric equations simplify is that of neutral, radially freely-falling,
pressureless matter, at least as long as shells of matter do not cross each other. Pressureless matter is
commonly referred to as “dust” in the literature. The collapse of a uniform sphere of dust was first solved by
Oppenheimer and Snyder (1939). The formalism of freely falling dust is applied in §20.16 to illustrate the
formation of a naked singularity.

It is natural to choose the tetrad frame to be the rest frame of the freely-falling dust. In the dust rest
frame, the energy flux and pressure vanish, $f = p = p_\perp = 0$. The geodesic equation for the freely-falling dust
implies that the proper acceleration vanishes, $h_0 = 0$, equation (20.26).

The equations admit two integrals of motion. The first integral of motion is the interior mass M, which
equation (20.44a) shows is constant, $\partial_0 M = 0$, along the path of the freely-falling dust.

The second integral of motion is $\beta_1$, as follows from the second of the 2 Einstein equations (20.62). Since
the acceleration vanishes, the covariant time derivative coincides with the directed time derivative, $D_0 = \partial_0$.
The 2 Einstein equations (20.62) are then

$$ \partial_0 \beta_0 = - \frac{M}{r^2} , \tag{20.79a} $$
$$ \partial_0 \beta_1 = 0 . \tag{20.79b} $$

The second equation (20.79b) shows that $\beta_1$ is constant as claimed, an integral of motion along the path of
the freely-falling dust. The first equation (20.79a), in combination with the definition (20.11) of the interior
mass M and the constancy of $\beta_1$, recovers the constancy of M. The definition (20.11) of the interior mass
M implies that the radial velocity $\beta_0 = dr/d\tau$ of the freely-falling dust is (the minus sign assumes infalling
dust)

$$ \beta_0 = - \sqrt{\beta_1^2 - 1 + 2M/r} . \tag{20.80} $$

Comparing this to the solution $u^r = dr/d\tau$ of radially free-falling particles in a Schwarzschild geometry
of mass M, equation (7.36), shows that $\beta_1$ may be interpreted as the energy E per unit mass that the
freely-falling dust would have if there were no further matter (i.e. the geometry were Schwarzschild) outside
the radius of the dust. This interpretation of $\beta_1$ is consistent with its earlier interpretation as energy per mass,
equation (20.52).

As discussed in §20.11.1, in a free-fall tetrad the lapse $\alpha$ can be set equal to unity everywhere, $\alpha = 1$. This
corresponds to setting the time coordinate t equal to, up to a shell-dependent constant, the proper time $\tau$
attached to the freely-falling dust. The relation between time t and radius r along the path of the dust is
obtained by integrating the equation for $\beta_0 = dr/d\tau$,

$$ t - t_M = \tau = \int \frac{dr}{\beta_0} = \int \frac{dr}{- \sqrt{\beta_1^2 - 1 + 2M/r}} , \tag{20.81} $$

where the proper time $\tau$ is fixed to zero at the time $t_M$ when the shell collapses to zero radius. The condition
that shells of positive density collapse to zero radius without crossing requires that the collapse time $t_M$ be
an increasing function of interior mass M. A parametric solution for the radius r of the freely-falling dust
is, with $\kappa \equiv 1 - \beta_1^2$,

$$ r = 2M \begin{cases} \kappa^{-1} \sin^2 ( \kappa^{1/2} \eta / 2 ) & |\beta_1| < 1 \\ \eta^2 / 4 & |\beta_1| = 1 \\ |\kappa|^{-1} \sinh^2 ( |\kappa|^{1/2} \eta / 2 ) & |\beta_1| > 1 \end{cases} , \quad \tau = M \begin{cases} \kappa^{-3/2} \, [ \kappa^{1/2} \eta - \sin ( \kappa^{1/2} \eta ) ] & |\beta_1| < 1 \\ \eta^3 / 6 & |\beta_1| = 1 \\ |\kappa|^{-3/2} \, [ \sinh ( |\kappa|^{1/2} \eta ) - |\kappa|^{1/2} \eta ] & |\beta_1| > 1 \end{cases} , \tag{20.82} $$

where $\eta$ is negative, going to zero as the dust radius r collapses to 0. Bound dust, $|\beta_1| < 1$, reaches a
maximum radius at $|\kappa|^{1/2} \eta = -\pi$.
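
A quick numerical check of the parametric solution (20.82) is sketched below. It evaluates r and $\tau$ as functions of the parameter $\eta$ for the three cases, with $\kappa \equiv 1 - \beta_1^2$, estimates $dr/d\tau$ by a central finite difference in $\eta$, and compares it with the infall velocity $\beta_0 = -\sqrt{\beta_1^2 - 1 + 2M/r}$ of equation (20.80). The function names and the finite-difference step are choices made for this sketch.

```python
import math

def dust_shell(eta, M=1.0, beta1=1.0):
    """Radius r and proper time tau of a dust shell, parametric solution (20.82)."""
    kappa = 1.0 - beta1 * beta1
    if kappa > 0.0:        # bound shell, |beta1| < 1
        s = math.sqrt(kappa)
        r = 2.0 * M * math.sin(0.5 * s * eta) ** 2 / kappa
        tau = M * (s * eta - math.sin(s * eta)) / kappa ** 1.5
    elif kappa == 0.0:     # marginally bound shell, |beta1| = 1
        r = 2.0 * M * eta * eta / 4.0
        tau = M * eta ** 3 / 6.0
    else:                  # unbound shell, |beta1| > 1
        s = math.sqrt(-kappa)
        r = 2.0 * M * math.sinh(0.5 * s * eta) ** 2 / (-kappa)
        tau = M * (math.sinh(s * eta) - s * eta) / (-kappa) ** 1.5
    return r, tau

def velocity_mismatch(eta, M=1.0, beta1=1.0, d=1.0e-5):
    """Relative difference between a finite-difference dr/dtau and equation (20.80)."""
    r_lo, tau_lo = dust_shell(eta - d, M, beta1)
    r_hi, tau_hi = dust_shell(eta + d, M, beta1)
    r, _ = dust_shell(eta, M, beta1)
    numeric = (r_hi - r_lo) / (tau_hi - tau_lo)
    exact = -math.sqrt(beta1 * beta1 - 1.0 + 2.0 * M / r)
    return abs(numeric - exact) / abs(exact)

if __name__ == "__main__":
    for beta1 in (0.8, 1.0, 1.2):
        print(beta1, velocity_mismatch(-1.5, 1.0, beta1))
```

The parameter $\eta$ is taken negative, as in the text, so that the shell is collapsing and $\tau$ runs up to zero at r = 0.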

It is possible to consider the situation of outgoing dust inside the horizon, for which $\beta_1$ is negative. However,
there is a coordinate singularity in the line-element (20.1) at $\beta_1 = 0$, and care needs to be taken interpreting
solutions where $\beta_1$ passes through zero. The coordinate singularity may be removed by transforming to a
time coordinate different from the free-fall time coordinate. The conclusion is that trajectories with different
signs of $\beta_1$ belong to distinct pieces of spacetime that abut along the $\beta_1 = 0$ trajectory.

The relation between energy density $\rho$ and the interior mass M is determined by equation (20.47). The
initial conditions must be set up to satisfy this equation, but the evolution equations guarantee that equation
(20.47) holds thereafter. The equation is a constraint equation: it is the Hamiltonian constraint. An
explicit expression for the proper (centre-of-mass) density $\rho$ at time t and radius r is

$$ \rho = - \frac{1}{4\pi r^2 \, \beta_0 \, \partial t / \partial M |_r} , \tag{20.83} $$

where the time t is given as a function of M (and $\beta_1(M)$) and r by equation (20.81), $t(M, r) = t_M + \int dr / \beta_0$.
The density is positive for infalling dust, $\beta_0 < 0$, whose collapse time increases with interior mass.
The proper pressure vanishes, as it must for freely-falling dust.


Exercise 20.5. Oppenheimer-Snyder collapse. Solve the Oppenheimer and Snyder (1939) problem of 
the spherical collapse of a uniform density sphere of pressureless matter that starts from zero velocity at 
infinity. 


20.16 Naked singularities in dust collapse 


Christodoulou (1984) initiated the study of the formation of naked singularities in spherically symmetric 
collapse of dust. Christodoulou showed that if the collapsing dust were sufficiently centrally concentrated, 
then the point at which the singularity first formed would be visible to the outside world, a “naked” singularity. 
The appearance of naked singularities in spherical collapse of dust is generic, requiring only that the collapsing 
dust be sufficiently centrally concentrated. 

Since the appearance of naked singularities is generic, it suffices to illustrate the situation in a simple case. 
One simplifying assumption is that the dust falls from zero velocity at infinity, so that $\beta_1 = 1$, in which case
the infall velocity $\beta_0$ of dust shells is (with the index 0 on $\beta_0$ dropped for brevity)


$$ \beta = - \sqrt{\frac{2M}{r}} . \tag{20.84} $$
Figure 20.1 Spacetime diagram illustrating the formation of a naked singularity in self-similar collapse of dust, for
a = 18, equation (20.86). Infalling (red) lines show trajectories of infalling dust, which are also contours of constant 
interior mass M. Approximately diagonal (black) lines show outgoing and ingoing radial null geodesics. Contours are 
drawn at intervals of factors of 2. A singularity (cyan) forms where the first shell of mass collapses to zero radius. 
The naked singularity is the point at the origin {t,r} = {0,0} where the singularity first forms. The apparent horizon 
(dashed pink line) is the locus of points where outgoing null rays turn around. The true horizon (thick pink line) 
divides outgoing null rays that do not and do reach infinity. In the region of spacetime between the true horizon (thick 
pink line) and the Cauchy horizon (thick green line), outgoing null rays emanate from the naked singularity and extend 
to infinity. The apparent, true, and Cauchy horizons are all straight lines emanating from the naked singularity at the
origin.

Integrating equation (20.81) with $\beta_1 = 1$ gives the relation between the radius r and time t along the
trajectory of a shell with interior mass M,

$$ \frac{2}{3} r^{3/2} = \sqrt{2M} \, (t_M - t) , \tag{20.85} $$
where $t_M$, a function of mass M, is the time at which the shell collapses to zero radius, r = 0.
A second simplifying assumption is self-similarity (see §20.18). In the present case, self-similar solutions
occur when the collapse time $t_M$ is proportional to the interior mass M,

$$ t_M = a M , \tag{20.86} $$

with a some positive dimensionless constant. Given the self-similar assumption (20.86), the relation (20.85)
between the radius r and time t reduces to a cubic in the infall velocity $\beta$,

$$ a \beta^3 - \frac{2t}{r} \beta + \frac{4}{3} = 0 . \tag{20.87} $$

Equation (20.87) shows that the infall velocity $\beta$ is constant along lines t/r = constant, that is, along straight
lines emanating from the origin at {t, r} = {0, 0}. The infall velocity $\beta$ varies from 0 at $t/r = -\infty$, to $-\infty$
at $t/r = +\infty$. The line-element is Gullstrand-Painlevé, equation (7.27), with $\beta$ the real negative solution of
the cubic (20.87). The proper pressure in the tetrad frame is zero (as it should be for dust), and the proper
density $\rho$ is

$$ \rho = \frac{1}{4\pi r^2} \, \frac{3 \beta^2}{2 - 3 a \beta^3} , \tag{20.88} $$
which is positive everywhere.
Radial outgoing (+) and ingoing (−) null rays passing through the infalling dust follow
$$ \frac{dr}{dt} = \beta \pm 1 . \tag{20.89} $$
Equation (20.89) for null geodesics can be recast as a differential equation between r and $\beta$, which integrates
to
$$ \ln r = \int \frac{2 (\beta \pm 1)(2 - 3 a \beta^3) \, d\beta}{\beta \, ( \pm 4 - 2\beta \pm 3 a \beta^3 + 3 a \beta^4 )} . \tag{20.90} $$

The integrand in equation (20.90) is a rational function of $\beta$, so is integrable in terms of elementary functions.
Special sets of null geodesics occur where the integrand has poles. At poles, null geodesics follow $\beta$ = constant,
corresponding to straight lines emanating from the origin. For outgoing null geodesics (+ in equation (20.90)),
the quartic denominator $4 - 2\beta + 3 a \beta^3 + 3 a \beta^4$ has two real roots at $\beta < 0$ provided that the positive constant
a exceeds the threshold value

$$ a \geq \frac{26}{3} + 5 \sqrt{3} \approx 17.3 . \tag{20.91} $$


For values of a exceeding the threshold (20.91), there is a naked singularity at the origin. The more negative 
of the two real roots (smaller radius) marks the location of the true horizon, while the less negative (larger 
radius) marks the so-called Cauchy horizon. Radial outgoing null rays inside the true horizon turn around 
and fall to the spacelike singularity, never reaching infinity. Radial outgoing null rays between the true and 
Cauchy horizons propagate from the naked singularity to infinity. 

In mathematics, a Cauchy horizon is defined to be the boundary of predictability. In the present case, 
the naked singularity at the origin is considered to be a source of unpredictability, since the direction in 
which geodesics emerge from the naked singularity is ambiguous, not determined uniquely by the direction
of geodesics impinging on it.

Figure 20.1 is a spacetime diagram that illustrates the formation of a naked singularity in self-similar
collapse of dust for the case a = 18, which slightly exceeds the threshold (20.91). The roots of the quartic in
this case are

$$ \begin{array}{lll} \beta = -0.791475 , & t/r = 4.79558 & \text{true horizon} , \\ \beta = -\tfrac{2}{3} , & t/r = 3 & \text{Cauchy horizon} . \end{array} \tag{20.92} $$

All outgoing null geodesics inside the true horizon, $\beta < -0.791475$, in due course turn around and fall to zero
radius, r = 0. Outgoing null geodesics between the true and Cauchy horizons, $-0.791475 < \beta < -\tfrac{2}{3}$, start at
the naked singularity at the origin and reach infinity. Outgoing null geodesics outside the Cauchy horizon,
$\beta > -\tfrac{2}{3}$, start at zero radius before the singularity has formed, and propagate to infinity. The apparent
horizon, where outgoing null rays turn around, $\beta = -1$, occurs at $t/r = \tfrac{25}{3}$.
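
The horizon values quoted for a = 18 can be reproduced with a few lines of arithmetic. The sketch below makes two assumptions, both stated in the comments: the self-similar cubic (20.87) is taken in the form $a\beta^3 - 2(t/r)\beta + 4/3 = 0$, which follows from equations (20.85) and (20.86) with $M = r\beta^2/2$, and a straight outgoing null ray through the origin is taken to satisfy $r = (\beta + 1)\, t$ from equation (20.89). Function names and bisection brackets are choices made for this sketch.

```python
import math

def quartic(beta, a):
    """Denominator quartic of equation (20.90) for outgoing (+) null rays."""
    return 4.0 - 2.0 * beta + 3.0 * a * beta ** 3 + 3.0 * a * beta ** 4

def bisect(f, lo, hi, tol=1.0e-13):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def t_over_r_outgoing(beta):
    """Slope t/r of a straight outgoing null ray, assuming r = (beta + 1) t."""
    return 1.0 / (beta + 1.0)

def t_over_r_from_cubic(beta, a):
    """Assumed cubic a beta^3 - 2 (t/r) beta + 4/3 = 0, solved for t/r."""
    return (a * beta ** 3 + 4.0 / 3.0) / (2.0 * beta)

a_threshold = 26.0 / 3.0 + 5.0 * math.sqrt(3.0)   # threshold (20.91)
a = 18.0                                          # case shown in Figure 20.1
beta_true = bisect(lambda b: quartic(b, a), -0.9, -0.75)     # true horizon
beta_cauchy = bisect(lambda b: quartic(b, a), -0.75, -0.6)   # Cauchy horizon

if __name__ == "__main__":
    print("threshold a:", a_threshold)
    print("true horizon:", beta_true, t_over_r_outgoing(beta_true))
    print("Cauchy horizon:", beta_cauchy, t_over_r_outgoing(beta_cauchy))
    print("apparent horizon t/r:", t_over_r_from_cubic(-1.0, a))
```

At the threshold the two negative roots merge into a double root at $\beta = 1 - \sqrt{3}$; below the threshold the quartic has no negative real root, so no outgoing null ray leaves the origin along a straight line.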

The naked singularity in spherical dust collapse has the property that future-directed geodesics can emerge 
from it in some directions but not in others. This is a generic feature of naked singularities in general relativity. 


20.16.1 Are naked singularities important? 


As might be imagined, there is a diversity of opinion regarding the importance of naked singularities in 
general relativity. One school of thought holds that singularities that are hidden behind horizons (clothed 
singularities) have no effect on outside observers, and in that sense do not matter, at least to the outside 
observer. From this perspective naked singularities are important precisely because they can affect an outside 
observer. This seems to me a somewhat anthropocentric point of view. It may be that no human ever falls into 
a black hole; but in the cosmos objects fall into black holes all the time. Singularity theorems, Chapter 18, 
indicate that general relativity fails inside black holes (more generally, wherever a trapped surface has 
formed). The question of what physics replaces general relativity where it fails is profound, regardless of 
whether humans can see it. 

The possible appearance of naked singularities in gravitational collapse offers a potential window to physics 
beyond general relativity. However, the collapse of a real black hole is one of the most violent events in 
observational astronomy, attended by supernovae and gamma-ray bursts. It is moot whether the signal 
from a naked singularity, whatever it might be, would be discernible against the cacophony of astrophysical 
processes. 


20.17 Thin spherical shells 


Sections 20.15 and 20.16 addressed matter that falls freely without shell crossing. Another problem that can 
be solved is that of a thin spherical shell. The shell may have internal pressure, and the spherical spacetime 
in which it falls need not be empty. The thin shell formalism is used in §20.17.1 to explore the evolution of 
a bubble of vacuum energy in empty space, a problem considered by Blau, Guendelman, and Guth (1987). 

As remarked around equation (20.49), the proper radial volume element in the tetrad frame is not $4\pi r^2 dr$, 
but rather $4\pi r^2 dr/\beta_1$. The surface density $\bar\rho$, energy flux $\bar f$, radial pressure $\bar p$, and transverse pressure $\bar p_\perp$ of 
a thin shell are defined to be integrals over the proper radial element $dr/\beta_1$, 

$$ \bar\rho \equiv \int_-^+ \rho \, \frac{dr}{\beta_1} , \quad \bar f \equiv \int_-^+ f \, \frac{dr}{\beta_1} , \quad \bar p \equiv \int_-^+ p \, \frac{dr}{\beta_1} , \quad \bar p_\perp \equiv \int_-^+ p_\perp \, \frac{dr}{\beta_1} . \tag{20.93} $$

Minus the surface transverse pressure $-\bar p_\perp$ is called the surface tension. The Einstein equations governing 
the shell are obtained by equating $8\pi$ (in units, $8\pi G$) times the surface energy-momenta (20.93) to integrals 
of the Einstein tensor (20.35) over the proper volume of the shell. The integrals can be done by inspection: 
any term involving a covariant radial derivative $D_1$ integrates to its argument. The Einstein equations for 
the spherical shell in its own frame are then 

$$ - \frac{2 [\beta_1]^+_-}{r} = 8\pi \bar\rho , \tag{20.94a} $$
$$ 0 = \frac{2 [\beta_0]^+_-}{r} = 8\pi \bar f , \tag{20.94b} $$
$$ 0 = 8\pi \bar p , \tag{20.94c} $$
$$ [h_0]^+_- + \frac{[\beta_1]^+_-}{r} = 8\pi \bar p_\perp . \tag{20.94d} $$

Equation (20.94b) says that the velocity $\beta_0$ is constant across the shell, and equations (20.94b) and (20.94c) 
say that the radial energy flux $\bar f$ and radial pressure $\bar p$ vanish in the shell’s own frame, which makes physical 
sense. 

The Riemann, Ricci, and Einstein tensors are defined in terms of derivatives of the tetrad connections 
$\Gamma_{klm}$. Unsurprisingly, integrals of these tensors over the shell are expressible in terms of the differences $[\Gamma_{klm}]^+_-$ of the connections across the shell. The set of 
tetrad connections that are tensors under Lorentz transformations within the shell constitute the extrinsic 
curvature $\bar K_{km}$ of the shell, defined to be the set of tetrad connections $[\Gamma_{k1m}]^+_-$ with middle index the radial 
index 1, 

$$ \bar K_{km} \equiv [\Gamma_{k1m}]^+_- . \tag{20.95} $$

Recall that in the ADM formalism the extrinsic curvature $K_{ab} \equiv \Gamma_{a0b}$ is defined to be the set of tetrad 
connections with middle index the time index 0, equations (17.21). In the ADM case the time axis $\boldsymbol{\gamma}_0$ 
is a spatial scalar, and the extrinsic curvature $K_{ab}$ is therefore a tetrad tensor with respect to spatial 
transformations. In the present case the radial axis $\boldsymbol{\gamma}_1$ is a scalar with respect to Lorentz transformations 
within the shell, and the connections $[\Gamma_{k1m}]^+_-$ with middle index the radial index 1 form a tetrad tensor with 
respect to Lorentz transformations within the shell. The Ricci tensor $\bar R_{km} \equiv \int R_{km} \, dr/\beta_1$ integrated over 
the shell, with indices $k$, $m$ running over 0, 2, 3, equals minus the extrinsic curvature of the shell, 

$$ \bar R_{km} = - \bar K_{km} . \tag{20.96} $$

The Einstein tensor $\bar G_{km}$ integrated over the shell, again with indices $k$, $m$ running over 0, 2, 3, is then 

$$ \bar G_{km} = \bar R_{km} - \tfrac{1}{2} \eta_{km} \bar R . \tag{20.97} $$
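One can confirm that the Einstein tensor (20.97) recovers the left hand sides of the Einstein equations (20.94a) 
and (20.94d) for the surface density and transverse pressure $\bar\rho$ and $\bar p_\perp$. 

The proper mass-energy $\bar m$ of the shell is, equation (20.94a), 

$$ \bar m \equiv 4\pi r^2 \bar\rho = \int_-^+ \rho \, \frac{4\pi r^2 dr}{\beta_1} = - r [\beta_1]^+_- . \tag{20.98} $$

The proper mass-energy $\bar m$ of the shell is to be distinguished from the total mass-energy $\bar M$ in the shell, 

$$ \bar M \equiv [M]^+_- = \int_-^+ \rho \, 4\pi r^2 dr = - \frac{r [\beta_1^2]^+_-}{2} , \tag{20.99} $$

the final expression of which follows from the definition (20.11) of interior mass $M$ and the fact that the 
velocity $\beta_0$ is constant across the shell, equation (20.94b). The proper mass $\bar m$ of the shell is related to the 
interior mass $M$ by 

$$ \bar m = \int_-^+ \frac{dM}{\beta_1} = - r [\beta_1]^+_- , \tag{20.100} $$

in agreement with equation (20.98). The ratio $\bar M/\bar m$ of total to proper mass-energy in the shell is 

$$ \frac{\bar M}{\bar m} = \frac{[\beta_1^2]^+_-}{2 [\beta_1]^+_-} = \frac{\beta_1^+ + \beta_1^-}{2} \equiv \langle\beta_1\rangle , \tag{20.101} $$

the average $\langle\beta_1\rangle$ of the energies per unit mass $\beta_1^\pm$ either side of the shell. The energies per unit mass $\beta_1^\pm$ either 
side of the shell are 

$$ \beta_1^\pm = \langle\beta_1\rangle \pm \tfrac{1}{2} [\beta_1]^+_- = \frac{\bar M}{\bar m} \mp \frac{\bar m}{2r} . \tag{20.102} $$

The definition (20.11) of interior mass, along with the expressions (20.102) for $\beta_1^\pm$, implies that the average 
interior mass $\langle M \rangle$ of the shell is 

$$ \langle M \rangle \equiv \frac{M^+ + M^-}{2} = \frac{r}{2} \left( 1 + \beta_0^2 - \frac{(\beta_1^+)^2 + (\beta_1^-)^2}{2} \right) = \frac{r}{2} \left( 1 + \beta_0^2 - \langle\beta_1\rangle^2 \right) - \frac{\bar m^2}{8r} . \tag{20.103} $$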

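The Einstein equation (20.94b) implies that the shell velocity $\beta_0$ is constant across the shell. The definition 
(20.11) of interior mass implies expressions for the velocity $\beta_0$ in terms of the interior masses $M^\pm$ and 
energies per unit mass $\beta_1^\pm$ either side of the shell, and equation (20.103) supplies a second expression for $\beta_0$ 
in terms of the mean interior mass $\langle M \rangle$ and the mean energy per unit mass $\langle\beta_1\rangle$, 

$$ \beta_0 = \pm \sqrt{ (\beta_1^\pm)^2 - 1 + \frac{2 M^\pm}{r} } \tag{20.104a} $$
$$ \phantom{\beta_0} = \pm \sqrt{ \langle\beta_1\rangle^2 - 1 + \frac{2 \langle M \rangle}{r} + \frac{\bar m^2}{4 r^2} } . \tag{20.104b} $$

The sign of the velocity $\beta_0$ is $+$ for outfalling, $-$ for infalling. 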
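
The algebra relating the proper mass, the total mass, and the mean interior mass of the shell can be checked numerically. The following is a minimal sketch, assuming only the interior-mass definition $1 - 2M/r = -\beta_0^2 + \beta_1^2$ on each side of the shell; the function name and the numerical values are illustrative choices, not taken from the text.

```python
def shell_masses(r, beta0, beta1_minus, beta1_plus):
    """Return (proper, total, mean) masses of a thin shell.

    Each side of the shell obeys 1 - 2 M / r = -beta0^2 + beta1^2,
    with the same beta0 on both sides.
    """
    def interior_mass(beta1):
        return 0.5 * r * (1.0 + beta0**2 - beta1**2)

    m_minus = interior_mass(beta1_minus)
    m_plus = interior_mass(beta1_plus)
    proper = -r * (beta1_plus - beta1_minus)   # proper mass, eq. (20.98)
    total = m_plus - m_minus                   # total shell mass, eq. (20.99)
    mean = 0.5 * (m_plus + m_minus)            # mean interior mass, eq. (20.103)
    return proper, total, mean


# Illustrative values only (geometric units).
r, beta0, b_minus, b_plus = 3.0, -0.8, 1.1, 0.9
proper, total, mean = shell_masses(r, beta0, b_minus, b_plus)
beta1_mean = 0.5 * (b_plus + b_minus)

# Ratio of total to proper mass is the mean energy per unit mass, eq. (20.101).
ratio = total / proper

# Velocity squared recovered from the mean quantities, eq. (20.104b).
beta0_sq = beta1_mean**2 - 1.0 + 2.0 * mean / r + proper**2 / (4.0 * r**2)
```

With these inputs the ratio `total / proper` reproduces the mean of the two energies per unit mass, and `beta0_sq` reproduces the square of the input velocity.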

Evolution equations for the various energies per unit mass $\beta_1$ and for the velocity $\beta_0$ follow from evolution 
equations for the proper mass $\bar m$ of the shell and for the various interior masses. The Einstein equation 
(20.62b) in the centre-of-mass frame, $f = 0$, is $\partial_0 \beta_1 - h_0 \beta_0 = 0$, which with the Einstein equations (20.94) 
of the shell implies 

$$ 0 = \partial_0 [\beta_1]^+_- - \beta_0 [h_0]^+_- = - 4\pi \left( \partial_0 (r \bar\rho) + \beta_0 (\bar\rho + 2 \bar p_\perp) \right) = - 4\pi r \left( \partial_0 \bar\rho + \frac{2 \beta_0}{r} (\bar\rho + \bar p_\perp) \right) . \tag{20.105} $$

Equation (20.105) implies that the proper mass-energy $\bar m$ of the shell evolves as 

$$ \partial_0 \bar m + 8\pi r \bar p_\perp \beta_0 = 0 , \tag{20.106} $$

which looks like the first law of thermodynamics in the form $\partial_0 \bar m + \bar p_\perp \partial_0 A = 0$ where $A \equiv 4\pi r^2$ is the proper 
area of the shell. The interior masses $M^\pm$ evolve according to the Einstein equation (20.44a), 

$$ \partial_0 M^\pm + 4\pi r^2 p^\pm \beta_0 = 0 . \tag{20.107} $$

The two equations (20.107) may be recast as evolution equations for the total mass $\bar M \equiv [M]^+_-$ of the shell 
and for the average interior mass $\langle M \rangle$, 

$$ \partial_0 \bar M + 4\pi r^2 [p]^+_- \beta_0 = 0 , \tag{20.108a} $$
$$ \partial_0 \langle M \rangle + 4\pi r^2 \langle p \rangle \beta_0 = 0 , \tag{20.108b} $$

where $[p]^+_-$ and $\langle p \rangle \equiv \tfrac{1}{2} (p^- + p^+)$ are respectively the difference and average of the external radial pressures 
$p^\pm$ on the shell. The evolution (20.106) of the proper mass-energy $\bar m$ of the shell depends on its equation 
of state $\bar p_\perp/\bar\rho$, while the evolution (20.108) of the total mass-energies $\bar M$ and $\langle M \rangle$ depends on the external 
pressures $p^\pm$. 
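
As an illustration of the evolution of the shell's proper mass, suppose the shell has a constant equation of state, transverse pressure equal to $w$ times surface density (the parameter $w$ is an assumption introduced here for illustration). Since the velocity $\beta_0$ is the rate of change of $r$ with proper time on the shell, the evolution equation for the proper mass becomes $d\bar m/dr = -2 w \bar m / r$, whose solution is $\bar m \propto r^{-2w}$. The sketch below compares that closed form with a direct Runge-Kutta integration.

```python
def proper_mass_closed_form(r, r0, m0, w):
    """Closed-form solution m = m0 (r / r0)^(-2 w) of d m / d r = -2 w m / r."""
    return m0 * (r / r0) ** (-2.0 * w)


def proper_mass_numeric(r, r0, m0, w, steps=10000):
    """Integrate d m / d r = -2 w m / r from r0 to r with classical RK4."""
    h = (r - r0) / steps

    def rate(x, m):
        return -2.0 * w * m / x

    x, m = r0, m0
    for _ in range(steps):
        k1 = rate(x, m)
        k2 = rate(x + 0.5 * h, m + 0.5 * h * k1)
        k3 = rate(x + 0.5 * h, m + 0.5 * h * k2)
        k4 = rate(x + h, m + h * k3)
        m += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return m


# Domain wall (w = -1): proper mass grows as r^2, i.e. constant surface density.
wall = proper_mass_numeric(2.0, 1.0, 1.0, -1.0)

# Pressureless shell (w = 0): proper mass is conserved.
dust = proper_mass_numeric(2.0, 1.0, 1.0, 0.0)
```

The two limiting cases are the ones used later in the section: a pressureless shell keeps its proper mass, while a shell with a vacuum equation of state keeps its surface density.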

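Usually it is most straightforward to solve the evolution equations (20.106) and (20.108) for the various 
masses $\bar m$, $\bar M$, and $\langle M \rangle$, and then to infer the energies per unit mass $\beta_1^\pm$ and their average $\langle\beta_1\rangle$ from equation 
(20.102), and the velocity $\beta_0$ from either of the two equivalent equations (20.104). However, evolution 
equations for $\beta_0$, $\beta_1^\pm$, and $\langle\beta_1\rangle$ can be deduced directly, either from the evolution equations for the masses, or 
from the Einstein equations (20.62), 

$$ \partial_0 \beta_0 = \beta_1^\pm h_0^\pm - \frac{M^\pm}{r^2} - 4\pi r p^\pm \tag{20.109a} $$
$$ \phantom{\partial_0 \beta_0} = \langle\beta_1\rangle \langle h_0 \rangle - \frac{\langle M \rangle}{r^2} - \frac{\bar m^2}{4 r^3} - 4\pi r \langle p \rangle - \frac{2\pi \bar m \bar p_\perp}{r} , \tag{20.109b} $$
$$ \partial_0 \beta_1^\pm = \beta_0 h_0^\pm , \tag{20.109c} $$
$$ \partial_0 \langle\beta_1\rangle = \partial_0 \left( \frac{\bar M}{\bar m} \right) = \beta_0 \langle h_0 \rangle , \tag{20.109d} $$

where $h_0^\pm$ are proper accelerations experienced by observers in the tetrad frame on each side of the shell, 
and $\langle h_0 \rangle$ is their average, 

$$ h_0^\pm = \langle h_0 \rangle \pm 2\pi (\bar\rho + 2 \bar p_\perp) , \quad \langle h_0 \rangle = - \frac{[p]^+_-}{\bar\rho} + \frac{2 \langle\beta_1\rangle \bar p_\perp}{r \bar\rho} . \tag{20.110} $$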

Exercise 20.6. Free fall of a thin, pressureless, spherical shell in vacuo. Solve for the evolution of 
a thin, pressureless, spherical shell that free falls in vacuo from rest at infinity. This exercise provides the 
mathematics behind the calculations reported in §7.28. 

Solution. In the particular case of a pressureless shell, $\bar p_\perp = 0$, freely falling in vacuo, $p^- = p^+ = 0$, the 
evolution equations (20.106) and (20.108) imply that the proper, total, and mean interior masses $\bar m$, $\bar M$, and 
$\langle M \rangle$ are all constants. The constancy of $\bar m$ and $\bar M$ implies the constancy of $\langle\beta_1\rangle$, equation (20.101) (but not 
of $\beta_1^\pm$, equation (20.102)). The infall velocity $\beta_0$ is given by equation (20.104b). If the shell free falls from 
rest at infinity, then $\langle\beta_1\rangle = 1$, and the total mass-energy of the shell equals its proper mass-energy, $\bar M = \bar m$, 
equation (20.101). The infall velocity is 

$$ \beta_0 = - \sqrt{ \frac{2 \langle M \rangle}{r} + \frac{\bar M^2}{4 r^2} } . \tag{20.111} $$

If the spherical shell is falling towards an object of mass $M_\bullet$ (a black hole, say), then the interior masses 
inside and outside the shell are $M^- = M_\bullet$ and $M^+ = M_\bullet + \bar M$, and the mean interior mass is $\langle M \rangle = M_\bullet + \tfrac{1}{2} \bar M$. 
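
The solution of the exercise can be evaluated directly. The sketch below, with hypothetical black hole and shell masses in geometric units, computes the infall velocity of a pressureless shell falling from rest at infinity, together with the energies per unit mass on either side of the shell, and checks that each side reproduces the vacuum horizon function $1 - 2M/r$ of its own interior mass.

```python
import math


def infall_velocity(r, mean_mass, shell_mass):
    """Velocity beta0 of a pressureless shell falling from rest at infinity, eq. (20.111)."""
    return -math.sqrt(2.0 * mean_mass / r + shell_mass**2 / (4.0 * r**2))


def side_energies(r, shell_mass):
    """Energies per unit mass (inside, outside), eq. (20.102) with mean value 1."""
    return 1.0 + shell_mass / (2.0 * r), 1.0 - shell_mass / (2.0 * r)


# Hypothetical masses of the central black hole and of the shell.
m_bh, m_shell = 1.0, 0.2
mean_mass = m_bh + 0.5 * m_shell   # mean interior mass

r = 5.0
beta0 = infall_velocity(r, mean_mass, m_shell)
b_in, b_out = side_energies(r, m_shell)

# Horizon function on each side: beta1^2 - beta0^2 = 1 - 2 M / r.
delta_in = b_in**2 - beta0**2
delta_out = b_out**2 - beta0**2
```

The inner side sees the mass of the black hole alone, while the outer side sees the black hole plus the shell, so `delta_in` and `delta_out` differ even though the shell velocity is the same on both sides.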


20.17.1 A bubble of vacuum 


Blau, Guendelman, and Guth (1987) explored the scenario of a spherically symmetric bubble of positive 
vacuum energy density $\rho_\Lambda$ that evolves in otherwise empty space. As chronicled by Merali (2017), Blau et al. 
were motivated at least in part by the question of what might happen to a mote of vacuum energy that 
was somehow created in empty space. Could such a mote develop into an inflating universe? If so, would 
the new universe expand out and destroy the surrounding space? Or would the new universe create its own 
spacetime? 

The geometry is de Sitter inside the bubble, Schwarzschild outside. The interface between the de Sitter 
and empty spaces cannot itself be empty, because the finite pressure of the vacuum and the zero pressure 
of empty space do not balance. For simplicity, Blau et al. modelled the interface as a thin spherical shell, 
which they assumed itself had a vacuum equation of state, $\bar p_\perp = -\bar\rho$, a so-called domain wall. The interior 
masses $M^\pm$ inside ($-$) and outside ($+$) the shell, and the proper mass $\bar m$ of the shell, are then 

$$ M^- = \tfrac{4}{3} \pi r^3 \rho_\Lambda , \quad \bar m = 4\pi r^2 \bar\rho , \quad M^+ = M , \tag{20.112} $$

where the vacuum density $\rho_\Lambda$, shell density $\bar\rho$, and the mass $M$ are all constants. The mass $M$ is the mass 
of the bubble perceived by an observer in the empty space outside the bubble. Equations (20.102) for the 
energy per unit mass $\beta_1^\pm$ inside ($-$) and outside ($+$) the shell, and their average $\langle\beta_1\rangle$, become 

$$ \beta_1^\pm = \frac{M - \tfrac{4}{3} \pi r^3 (\rho_\Lambda \pm 6\pi \bar\rho^2)}{4\pi r^2 \bar\rho} , \quad \langle\beta_1\rangle = \frac{M - \tfrac{4}{3} \pi r^3 \rho_\Lambda}{4\pi r^2 \bar\rho} . \tag{20.113} $$
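The velocity $\beta_0$, equation (20.104), is 

$$ \beta_0 = \pm \sqrt{ (\beta_1^\pm)^2 - \Delta^\pm } , \tag{20.114} $$

where $\Delta^\pm \equiv 1 - 2M^\pm/r$ is the horizon function either side of the shell. The energies per unit mass $\beta_1^\pm$ and 
$\langle\beta_1\rangle$ are respectively zero at radii $r_1^\pm$ and $\bar r_1$ given by 

$$ r_1^\pm \equiv \left( \frac{M}{\tfrac{4}{3} \pi (\rho_\Lambda \pm 6\pi \bar\rho^2)} \right)^{1/3} , \quad \bar r_1 \equiv \left( \frac{M}{\tfrac{4}{3} \pi \rho_\Lambda} \right)^{1/3} . \tag{20.115} $$

For positive mass $M$ and vacuum density $\rho_\Lambda$, the radii $r_1^+$ and $\bar r_1$ are always positive. The radius $r_1^-$ is 
positive or negative as $\rho_\Lambda$ is larger or smaller than $6\pi \bar\rho^2$. Blau et al. argue that if the vacuum is GUT scale, 
then it might be expected that $\rho_\Lambda \sim m_{\rm GUT}^4$ and $\bar\rho \sim m_{\rm GUT}^3$ in Planck units, in which case $\bar\rho^2/\rho_\Lambda \sim m_{\rm GUT}^2$, 
which is small compared to 1 if the GUT scale is significantly smaller than the Planck scale, $m_{\rm GUT} \ll 1$. In 
that case all of $r_1^\pm$ and $\bar r_1$ are positive, and they are ordered 

$$ r_1^+ < \bar r_1 < r_1^- . \tag{20.116} $$

Blau et al. introduce a dimensionless variable $z \equiv r/r_1^+$ in terms of which the energy per unit mass $\beta_1^+$, 
equation (20.113), is 

$$ \beta_1^+ = \frac{1}{\sqrt{-E}} \left( \frac{1 - z^3}{z^2} \right) , \tag{20.117} $$

and the velocity $\beta_0$, equation (20.114), satisfies 

$$ - E \beta_0^2 + V = E , \tag{20.118} $$

where $V(z)$ is a dimensionless effective potential and the constant $E$ is an effective dimensionless energy, 

$$ V \equiv - \left( \frac{1 - z^3}{z^2} \right)^2 - \frac{\mu}{z} , \tag{20.119a} $$
$$ E \equiv - \left( \frac{36 \pi \bar\rho^3}{(\rho_\Lambda + 6\pi \bar\rho^2)^2 M} \right)^{2/3} , \tag{20.119b} $$

with the constant $\mu$ given by 

$$ \mu \equiv \frac{24 \pi \bar\rho^2}{\rho_\Lambda + 6\pi \bar\rho^2} . \tag{20.120} $$

If $M$, $\rho_\Lambda$, and $\bar\rho$ are all positive, then the constant $\mu$ is positive, while $V$ and $E$ are negative. Equation (20.118) 
agrees with equation (5.9) of Blau et al. with the translations (there → here) 

$$ \beta_D \to \beta_1^- , \quad \beta_S \to \beta_1^+ , \quad \rho_0 \to \rho_\Lambda , \quad \chi^2 \to \tfrac{8}{3} \pi \rho_\Lambda , \quad \gamma^2 \to \mu , \quad \sigma \to \bar\rho . \tag{20.121} $$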

The effective potential $V$ defined by equation (20.119a) is a hill that goes through a maximum at a value 
$z = z_{\max}$ that depends on $\mu$. Equivalently, $\mu$ depends on $z_{\max}$. Figure 20.2 illustrates the effective potential $V$ 
for the case $z_{\max} = 1.092$. The value of the constant $\mu$, and of the potential $V_{\max} \equiv V(z_{\max})$ at its maximum, 
are 

$$ \mu = \frac{2 (z_{\max}^3 - 1)(z_{\max}^3 + 2)}{z_{\max}^3} , \quad V_{\max} = - \frac{3 (z_{\max}^6 - 1)}{z_{\max}^4} . \tag{20.122} $$

Figure 20.2 Effective potential $V$, equation (20.119a), of a spherical shell sandwiched between a bubble of vacuum 
and empty space, as a function of the dimensionless radius $z \equiv r/r_1^+$, for the case $z_{\max} = 1.092$. A more realistic 
case would have $z_{\max}$ closer to 1, but the larger choice of $z_{\max}$ brings out the behaviour more clearly. The arrowed 
horizontal line is an illustrative unbound trajectory of a shell that expands from zero radius to infinity. Unbound 
trajectories occur for $M > M_{\rm crit}$. The radii where the trajectory passes through the de Sitter horizon $\Delta^- = 0$, the 
Schwarzschild horizon $\Delta^+ = 0$, and the places where the energies per unit mass $\beta_1^\pm$ pass through zero, are marked. 
The choice $z_{\max} = \bigl( 1 - \sqrt{5}/2 + \sqrt{17/4 - \sqrt{5}} \bigr)^{1/3} = 1.092$ is a special value for which there happens to be a special 
trajectory, the one shown, where the locations $\Delta^- = 0$ and $\beta_1^+ = 0$ coincide, and also the locations $\Delta^+ = 0$ and 
$\beta_1^- = 0$ coincide. This is similar to Figure 6 of Blau, Guendelman, and Guth (1987). 


As $6\pi \bar\rho^2/\rho_\Lambda$ varies from 0 to 1 to $\infty$, the constant $\mu$ varies from 0 to 2 to 4, and the apex $z_{\max}$ of the potential 
varies from 1 to $2^{1/6}$ to $2^{1/3}$. The motion of the shell is bounded if $E < V_{\max}$, unbounded if $E > V_{\max}$. The 
critical case $E = V_{\max}$ occurs at a mass $M = M_{\rm crit}$, 

$$ M_{\rm crit}^2 = \frac{(2 - z_{\max}^3)(z_{\max}^3 + 2)^3}{72 \pi \rho_\Lambda (z_{\max}^3 + 1)^2} . \tag{20.123} $$

If $M < M_{\rm crit}$, then the motion is bounded, while for $M > M_{\rm crit}$ the motion is unbounded. For vacuum 
densities sufficiently below the Planck scale, where $\bar\rho^2/\rho_\Lambda \ll 1$ and hence $z_{\max} \approx 1$, the critical mass is 
$M_{\rm crit} \approx \sqrt{3/(32 \pi \rho_\Lambda)}$, or about 6 grams for $m_{\rm GUT} \approx 10^{16}$ GeV. 
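
The relations between the apex of the potential, the constant $\mu$, and the critical mass lend themselves to a quick numerical check. The following sketch assumes the effective potential $V = -((1 - z^3)/z^2)^2 - \mu/z$ and evaluates the special apex value quoted in the caption of Figure 20.2; the Planck mass conversions (1.22e19 GeV and 2.18e-5 grams) are standard approximate values supplied here, and the function names are illustrative.

```python
import math


def mu_from_zmax(zmax):
    """Constant mu for which the potential peaks at zmax, eq. (20.122)."""
    z3 = zmax**3
    return 2.0 * (z3 - 1.0) * (z3 + 2.0) / z3


def potential(z, mu):
    """Dimensionless effective potential V(z), eq. (20.119a)."""
    return -((1.0 - z**3) / z**2) ** 2 - mu / z


def v_max(zmax):
    """Value of the potential at its maximum, eq. (20.122)."""
    return -3.0 * (zmax**6 - 1.0) / zmax**4


def critical_mass(zmax, rho_vac):
    """Critical mass separating bound from unbound motion, eq. (20.123)."""
    z3 = zmax**3
    m2 = (2.0 - z3) * (z3 + 2.0) ** 3 / (72.0 * math.pi * rho_vac * (z3 + 1.0) ** 2)
    return math.sqrt(m2)


# Special apex value quoted in the caption of Figure 20.2.
zmax = (1.0 - math.sqrt(5.0) / 2.0 + math.sqrt(17.0 / 4.0 - math.sqrt(5.0))) ** (1.0 / 3.0)
mu = mu_from_zmax(zmax)

# GUT-scale estimate in Planck units (approximate conversion factors).
PLANCK_MASS_GEV = 1.22e19
PLANCK_MASS_GRAMS = 2.18e-5
m_gut = 1.0e16 / PLANCK_MASS_GEV
rho_vac = m_gut**4
m_crit_grams = math.sqrt(3.0 / (32.0 * math.pi * rho_vac)) * PLANCK_MASS_GRAMS
```

Evaluating `potential` slightly to either side of `zmax` confirms that the apex is a maximum, and `critical_mass(1.0, rho_vac)` reduces to the small-shell-density limit quoted in the text.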

Blau et al. were interested in the fate of a mote of vacuum that materializes at small radius and initially 
expands. If the mass of the mote is less than the critical mass $M_{\rm crit}$, then the mote momentarily expands, 
but then turns around and collapses. No new universe. 

If on the other hand the mass of the mote exceeds the critical mass $M_{\rm crit}$, then the mote expands from 
zero radius to infinity. A new universe is created. 

From the perspective of an observer in the pre-existing empty space, the shell materializes at zero radius, 
$r = 0$, with zero proper mass, $\bar m = 0$, but with finite total mass $\bar M = M$, hence infinite energy per unit 
mass. The outside observer sees a white hole of mass $M$ and horizon size $2M$ suddenly come into being. The 
shell is inside the White Hole part of the Schwarzschild geometry, Figure 7.17. The outside observer, in the 
Universe part of the Schwarzschild geometry, sees the shell born at the white hole singularity, possibly with 
attending fireworks, and watches the shell expand into the empty space inside the white hole (the contents 
of a white hole are, unlike a black hole, visible to an outside observer). The shell switches from ingoing 
($\beta_1^+ > 0$) to outgoing ($\beta_1^+ < 0$) inside the white hole, and, now having negative energy per unit mass $\beta_1^+$, 
exits into the Parallel Universe part of the Schwarzschild geometry. The observer in the Universe sees the 
exiting shell redshift and dim to obscurity. 

The more interesting perspective is that of an observer who rides with the shell. The shell does not 
expand into a pre-existing spacetime, but rather creates its own new spacetime, with both empty and de 
Sitter components. The shell can be conceptualized in one lower dimension as the leading circular edge 
of an expanding two-sided disk, on the one side of which is empty space, and on the other is de Sitter 
space. Looking backwards, the shell observer sees empty space at smaller radii, going back to the white hole. 
Looking forwards, the shell observer sees de Sitter space also at smaller radii. The forward looking observer 
is looking in the direction where the radius should be larger, but because the spherical shell is expanding 
faster than light ($\Delta^- < 0$) away from the origin of de Sitter space at $r = 0$, any light that the shell observer 
sees necessarily comes from behind them, at smaller radius. 

An observer at the origin r = 0 of de Sitter space sees the shell expand away from them. Either before or 
shortly after passing through the White Hole horizon into the Parallel Universe, the shell expands beyond 
the de Sitter horizon of the observer at the origin r = 0. The origin observer truly finds themself in an 
inflating universe. 

Key to this remarkable behaviour is the transition of the shell’s total mass $\bar M = M - \tfrac{4}{3} \pi r^3 \rho_\Lambda$ from positive 
to negative, which happens between the times that the shell passes through the White Hole and de Sitter 
horizons. Does a large negative total shell mass $\bar M$ make sense? Recall that the total mass $\bar M$ includes not 
only rest mass but also kinetic and gravitational contributions, and the gravitational contribution can be 
negative. The proper mass $\bar m = 4\pi r^2 \bar\rho$ of the shell is always positive (and increasing). The mass $\tfrac{4}{3} \pi r^3 \rho_\Lambda$ of 
vacuum energy grows huge as the bubble expands, a mass balanced by the negative gravitational total mass 
of the shell. 

Is the creation of a bubble of vacuum from a white hole singularity realistic? Nope. 


20.17.2 A bubble of vacuum from a magnetic monopole 


Sakai et al. (2006) argue that a more realistic origin for an inflating universe is a Grand Unified Theory 
(GUT) magnetic monopole (’t Hooft, 1974; Polyakov, 1974). Magnetic monopoles are predicted by 
GUTs, where the electromagnetic field gets knotted up in spacetime. GUT monopoles are predicted to have 
masses approximately $\alpha^{-1} = 137$ times the GUT mass, or about $10^{18}$ GeV, close to the Planck mass. No 
monopole has been observed in Nature, but that is not too surprising given their large mass. 

Figure 20.3 Effective potential $V$, equation (20.126), of a magnetically charged shell enclosing a bubble of positive 
energy vacuum. The parameters are given by equation (20.127). Both configurations have an extremal RN geometry 
$Q = M$. On the left, the shell (dot) is in a classically stable configuration. On the right, the vacuum density is 
somewhat larger, and the configuration has marginal stability. Thick horizontal bands show the radial ranges where 
$\beta_1^+$ (pinkish) and $\beta_1^-$ (greenish) are positive. Vertical dashed lines mark RN (red) and de Sitter (green) horizons $r^+$ 
and $r^-$. 

Sakai et al. model the scenario using the thin shell formalism, with Reissner-Nordström geometry outside 
the shell, de Sitter inside. The parameters of the RN geometry are the mass $M$ and magnetic charge $Q$ of 
the magnetic monopole. The de Sitter geometry has positive vacuum density $\rho_\Lambda$. The shell carries all the 
magnetic charge of the monopole, so the monopole looks charged from the RN side, uncharged from the de 
Sitter side. Sakai et al. model the shell as having a constant mass $\bar m_Q$ attributable to the rest mass of its 
magnetic charge, plus a constant vacuum shell density $\bar\rho_\Lambda$. The interior masses $M^\pm$ inside ($-$) and outside 
($+$) the shell, and the proper mass $\bar m$ of the shell, are then 

$$ M^- = \tfrac{4}{3} \pi r^3 \rho_\Lambda , \quad \bar m = \bar m_Q + 4\pi r^2 \bar\rho_\Lambda , \quad M^+ = M - \frac{Q^2}{2r} , \tag{20.124} $$

where $M$, $Q$, $\rho_\Lambda$, $\bar m_Q$, and $\bar\rho_\Lambda$ are all constants. The translation from Sakai et al.’s notation is (there → here) 
Mo 
Be > bis p>pr, m>, o > po = (20.125) 
An effective potential $V$ for the shell may be defined by 

$$ V \equiv - \beta_0^2 = - (\beta_1^\pm)^2 + 1 - \frac{2 M^\pm}{r} , \tag{20.126} $$

with energies per unit mass $\beta_1^\pm$ given by equation (20.102). 

The interesting parameter regime is where the RN geometry is near extremal, $Q \approx M$. Parameters may 
be chosen such that the effective potential $V$, equation (20.126), has a stable or marginally stable point at 
zero velocity $\beta_0$, as illustrated in Figure 20.3. The left panel of Figure 20.3 shows a stable point; the right 
panel a marginally stable point. The parameters of the two cases illustrated are 


Ss 


Q=M, H=\/8rpra= M, ño=] 2 M, py=0. (20.127) 


/3 
4 2 

Both examples are for an extremal RN geometry, Q = M, and in both cases the point of (marginal) stability 
is at the RN horizon. The first, stable, choice (left panel of Figure 20.3) is special in that not only 69 but 
also AŤ vanishes at the point of stability. The second, marginally stable, choice (right panel of Figure 20.3) 
is special by virtue of its marginal stability. 
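
The stability of a shell sitting at the horizon of an extremal RN geometry can be explored numerically. The sketch below is a hedged illustration rather than a reproduction of the figure: it assumes an extremal geometry with charge equal to mass, zero vacuum shell density, and a shell mass fixed by requiring the outer energy per unit mass to vanish at the horizon $r = M$, which gives a shell mass of $M \sqrt{1 - H^2 M^2}$ with $H^2 = 8\pi\rho_\Lambda/3$. The particular values of $H$ used are illustrative choices under those assumptions and are not claimed to be the values adopted for Figure 20.3.

```python
import math


def effective_potential(r, mass, charge, hubble, shell_mass):
    """V = -beta0^2 for a constant-mass shell, de Sitter inside and RN outside."""
    m_in = 0.5 * hubble**2 * r**3            # de Sitter interior mass (4/3) pi r^3 rho
    m_out = mass - charge**2 / (2.0 * r)     # RN interior mass
    # Outer energy per unit mass: (total shell mass) / (proper mass) - (proper mass) / (2 r).
    beta1_out = (m_out - m_in) / shell_mass - shell_mass / (2.0 * r)
    return 1.0 - 2.0 * m_out / r - beta1_out**2


def second_derivative(func, x, step=1e-3):
    """Central finite-difference estimate of the second derivative."""
    return (func(x + step) - 2.0 * func(x) + func(x - step)) / step**2


M = 1.0   # RN mass; the geometry is extremal, Q = M


def potential_for(hubble):
    """Potential with the shell mass tuned so that the shell can rest at r = M."""
    shell_mass = M * math.sqrt(1.0 - (hubble * M) ** 2)
    return lambda r: effective_potential(r, M, M, hubble, shell_mass)


stable = potential_for(0.5)                      # illustrative smaller vacuum density
marginal = potential_for(math.sqrt(3.0) / 2.0)   # illustrative larger vacuum density

curvature_stable = second_derivative(stable, M)
curvature_marginal = second_derivative(marginal, M)
```

Under these assumptions the potential vanishes at $r = M$ in both cases; its curvature there is positive for the smaller vacuum density (a stable point) and vanishes for $H M = \sqrt{3}/2$ (a marginally stable point).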

The initial configuration illustrated in the left panel of Figure 20.3 is classically stable. An outside observer 
sees a magnetic monopole with magnetic charge Q equal to its mass M. The shell is located at the horizon 
of the extremal RN geometry. Inside the shell is vacuum with positive energy density $\rho_\Lambda$. 

The shell could potentially quantum tunnel out of the stable configuration, or alternatively the monopole 
could perhaps be perturbed out of its stable state by a collision of some kind. Or, the parameters might 
perhaps be tuned so that the configuration is close to or at marginal stability, as illustrated in the right 
panel of Figure 20.3. 

Once out of the stable or marginally stable configuration, the shell starts expanding. In both cases shown 
in Figure 20.3, the RN energy per unit mass $\beta_1^+$ starts at zero in the initial (marginally) stable configuration. 
More generally, $\beta_1^+$ can be initially positive or negative. But regardless of the initial sign, $\beta_1^+$ becomes negative 
as the shell expands, indicating that the shell has made its way to a Parallel Universe or Parallel Antiverse 
part of the RN geometry, Figure 8.6. As the shell expands, it exits the de Sitter horizon of an observer at 
$r = 0$. 

As in the situation of a bubble of vacuum in empty space considered by Blau, Guendelman, and Guth 
(1987), the shell does not expand into a pre-existing spacetime, but rather creates its own new spacetime, 
with both RN and de Sitter components. Looking backward, an observer riding the shell sees RN spacetime 
at smaller radii. Looking forward, an observer riding the shell sees de Sitter spacetime, also at smaller radii. 
Even though the forward-looking observer is looking in the direction of larger radii, they see only smaller 
radii because the shell is moving superluminally outward outside the de Sitter horizon of the origin at r = 0. 

How realistic is the scenario of the creation of an inflating universe from a GUT magnetic monopole? An 
object moving outwards in radius with negative RN energy per unit mass $\beta_1^+$ is necessarily in a Parallel 
part of the RN geometry, and must have negotiated an inner horizon where the outside universe appeared 
infinitely blueshifted. In the extremal cases illustrated in Figure 20.3, the inner and outer horizons coincide, 
and an object at rest at the horizon sees the outside universe infinitely blueshifted. In realistic situations, 
the diverging concentration of energy at the inner horizon drives an instability that is the principal topic of 
Chapter 21. Bottom line: the model is not realistic as it stands. 


20.18 Self-similar spherically symmetric spacetime 


A fourth way to simplify the system of spherically symmetric equations, transforming them into ordinary 
differential equations, is to consider self-similar solutions. The system is more complicated than that of a 
static system, or of freely-falling dust, or of thin shells, but still straightforward. 

Self-similar solutions are flexible enough to admit multiple components of energy-momentum, which may 
interact with each other. Self-similar solutions are especially useful for exploring the inflationary instability 
in the vicinity of the inner horizon of a charged spherical black hole, considered in the next Chapter 21. 
Charged spherical black holes are not realistic as models of real astronomical black holes, but they have 
inner horizons like realistic rotating black holes, so admit inflation. 


20.18.1 Self-similarity 


The assumption of self-similarity (also known as homothety, if you can pronounce it) is the assumption 
that the system possesses conformal time translation invariance. This implies that there exists a conformal 
time coordinate $t$ such that the geometry at any one time is conformally related to the geometry at any other 
time, $g_{\mu\nu} = e^{2vt} \hat g_{\mu\nu}$, where the conformal metric coefficients $\hat g_{\mu\nu}(r)$ are functions only of conformal radius $r$, 
not of conformal time $t$. In terms of conformal coordinates $x^\mu \equiv \{t, r, \theta, \phi\}$, the self-similar line-element is 

$$ ds^2 = e^{2vt} \left[ \hat g_{tt}(r) \, dt^2 + 2 \hat g_{tr}(r) \, dt \, dr + \hat g_{rr}(r) \, dr^2 + e^{2r} do^2 \right] . \tag{20.128} $$

The choice $e^{2r}$ of the coefficient of $do^2$ is a gauge choice of the conformal radius $r$, chosen here so as to 
bring the self-similar line-element into a form (20.132) below that resembles as far as possible the spherical 
line-element (20.1). The proper circumferential radius $R$ is 

$$ R \equiv e^{vt + r} , \tag{20.129} $$

which is to be considered as a function $R(t, r)$ of the conformal coordinates $t$ and $r$. The circumferential 
radius $R$ has a gauge-invariant meaning, whereas neither $t$ nor $r$ is independently gauge-invariant. The 
conformal factor $R$ has the dimensions of length. In self-similar solutions, all quantities are proportional to 
some power of $R$, and that power can be determined by dimensional analysis. Quantities that depend only 
on the conformal radial coordinate $r$, independent of the circumferential radius $R$, are called dimensionless. 

The fact that dimensionless quantities such as the conformal metric coefficients $\hat g_{\mu\nu}(r)$ are independent of 
conformal time $t$ implies that the tangent vector $\boldsymbol{e}_t$, which by definition satisfies 

$$ \frac{\partial}{\partial t} \equiv \boldsymbol{e}_t \cdot \boldsymbol{\partial} , \tag{20.130} $$

is a conformal Killing vector, §7.32.4, also known as the homothetic vector. The tetrad-frame components of 
the conformal Killing vector $\boldsymbol{e}_t$ define the tetrad-frame conformal Killing 4-vector $\xi^m$, 

$$ \frac{\partial}{\partial t} \equiv R \, \xi^m \partial_m , \tag{20.131} $$

in which the factor $R$ is introduced so as to make $\xi^m$ dimensionless. The conformal Killing vector $\boldsymbol{e}_t$ is 
the generator of the conformal time translation symmetry, and as such it is gauge-invariant (up to a global 
rescaling of conformal time, $t \to at$ for some constant $a$). It follows that its dimensionless tetrad-frame 
components $\xi^m$ constitute a tetrad 4-vector (again, up to global rescaling of conformal time). 


20.18.2 Self-similar line-element 


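The self-similar line-element can be taken to have the same form as the spherical line-element (20.1), but 
with the dependence on the dimensionless conformal Killing vector $\xi^m$ made manifest: 

$$ ds^2 = R^2 \left[ - (\xi^0 dt)^2 + \frac{1}{\beta_1^2} (dr + \beta_1 \xi^1 dt)^2 + do^2 \right] . \tag{20.132} $$

The vierbein $e^m{}_\mu$ and inverse vierbein $e_m{}^\mu$ corresponding to the self-similar line-element (20.132) are 

$$ e^m{}_\mu = R \begin{pmatrix} \xi^0 & 0 & 0 & 0 \\ \xi^1 & 1/\beta_1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & \sin\theta \end{pmatrix} , \quad e_m{}^\mu = \frac{1}{R} \begin{pmatrix} 1/\xi^0 & - \beta_1 \xi^1/\xi^0 & 0 & 0 \\ 0 & \beta_1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1/\sin\theta \end{pmatrix} . \tag{20.133} $$

It is straightforward to see that the coordinate time components of the vierbein must be $e^m{}_t = R \xi^m$, since 
$\partial/\partial t = e^m{}_t \, \partial_m$ equals $R \xi^m \partial_m$, equation (20.131). 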
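
The vierbein and its inverse can be checked against each other, and against the line-element, with a few lines of code. The sketch below assumes the layout in which rows carry the tetrad index and columns carry the coordinate index in the order (t, r, θ, φ); the numerical values are arbitrary illustrative choices.

```python
import math


def vierbein(R, xi0, xi1, beta1, theta):
    """Vierbein: rows are the tetrad index m, columns the coordinate index mu."""
    return [[R * xi0, 0.0, 0.0, 0.0],
            [R * xi1, R / beta1, 0.0, 0.0],
            [0.0, 0.0, R, 0.0],
            [0.0, 0.0, 0.0, R * math.sin(theta)]]


def inverse_vierbein(R, xi0, xi1, beta1, theta):
    """Inverse vierbein: rows are the tetrad index m, columns the coordinate index mu."""
    return [[1.0 / (R * xi0), -beta1 * xi1 / (R * xi0), 0.0, 0.0],
            [0.0, beta1 / R, 0.0, 0.0],
            [0.0, 0.0, 1.0 / R, 0.0],
            [0.0, 0.0, 0.0, 1.0 / (R * math.sin(theta))]]


def contract(e, e_inv):
    """Sum over the coordinate index: should give the identity in tetrad indices."""
    return [[sum(e[m][mu] * e_inv[n][mu] for mu in range(4)) for n in range(4)]
            for m in range(4)]


def metric_from_vierbein(e):
    """Coordinate metric g_mu_nu = eta_mn e^m_mu e^n_nu with signature -+++."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[sum(eta[m] * e[m][mu] * e[m][nu] for m in range(4)) for nu in range(4)]
            for mu in range(4)]


# Arbitrary illustrative values.
R, xi0, xi1, beta1, theta = 2.0, 1.5, 0.5, 0.8, 1.0
e = vierbein(R, xi0, xi1, beta1, theta)
e_inv = inverse_vierbein(R, xi0, xi1, beta1, theta)
identity = contract(e, e_inv)
g = metric_from_vierbein(e)
```

The time-time component of the resulting metric is minus $R^2$ times $(\xi^0)^2 - (\xi^1)^2$, and the time-radius component is $R^2 \xi^1/\beta_1$, as read off from the line-element.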


20.18.3 Tetrad-frame scalars and vectors 


Since the conformal factor $R$ is gauge-invariant, the directed gradient $\partial_m R$ constitutes a tetrad-frame 4-vector 
$\beta_m$ (which unlike $\xi^m$ is independent of any global rescaling of conformal time), 

$$ \beta_m \equiv \partial_m R . \tag{20.134} $$

It is straightforward to check that $\beta_1$ defined by equation (20.134) is consistent with its appearance in the 
vierbein (20.133) provided that $R \propto e^r$ as earlier assumed, equation (20.129). 

With two distinct dimensionless tetrad 4-vectors in hand, $\beta_m$ and the conformal Killing vector $\xi^m$, three 
gauge-invariant dimensionless scalars can be constructed, $\beta^m \beta_m$, $\xi^m \beta_m$, and $\xi^m \xi_m$, 

$$ 1 - \frac{2M}{R} = \beta^m \beta_m = - \beta_0^2 + \beta_1^2 , \tag{20.135a} $$
$$ v \equiv \xi^m \beta_m = \xi^0 \beta_0 + \xi^1 \beta_1 = \frac{1}{R} \frac{\partial R}{\partial t} , \tag{20.135b} $$
$$ \Delta \equiv - \xi^m \xi_m = (\xi^0)^2 - (\xi^1)^2 . \tag{20.135c} $$

The $M$ in equation (20.135a), which is essentially the same as equation (20.11), is the interior mass. Equation 
(20.135a) is dimensionless, which implies that the interior mass at fixed conformal radius $r$ increases 
in proportion to the conformal factor, $M \propto R$. The dimensionless constant $v$ in equation (20.135b) may be 
interpreted as a measure of the expansion velocity of the self-similar spacetime. Because of the freedom of 
a global rescaling of conformal time, it is possible to set $v = 1$ without loss of generality; but that scaling 
obscures the physical significance of $v$ as an expansion rate. The choice adopted in the next Chapter, equation 
(21.7), is to set $v$ equal to the rate $\dot M_\bullet$ of increase of the interior mass $M$ evaluated at a specific conformal 
radius, taken to be the sonic point outside the horizon where the boundary conditions are established; the 
rate is with respect to the proper time $\tau_d$ of collisionless “dark matter” that free-falls radially from zero 
velocity far from the black hole, 

$$ v = \dot M_\bullet \equiv \frac{dM_{\rm sonic}}{d\tau_d} . \tag{20.136} $$

The proper time $\tau_d$ is essentially the free-fall time $t_{\rm ff}$ of the Gullstrand-Painlevé line-element (19.10), or 
equivalently $T$ in the line-element (20.139) with $\alpha = 1$ and $\beta_1 = 1$. The dimensionless quantity $\Delta$ in 
equation (20.135c) is the dimensionless horizon function: horizons occur where the horizon function vanishes, 

$$ \Delta = 0 \quad \mbox{at horizons} . \tag{20.137} $$

Note that if $v$ is rescaled, then $\Delta \propto v^2$. 


Exercise 20.7. Self-similar line-element. Let $T$ and $R$ denote time and radius coordinates 

$$ T \equiv e^{vt} , \quad R \equiv e^{vt + r} . \tag{20.138} $$

Show that the self-similar line-element (20.132) in terms of $T$ and $R$ is 

$$ ds^2 = - \alpha^2 dT^2 + \frac{1}{\beta_1^2} (dR - \beta_0 \alpha \, dT)^2 + R^2 do^2 , \tag{20.139} $$

with lapse 

$$ \alpha = \frac{R \xi^0}{v T} . \tag{20.140} $$

The line-element (20.139) is the same as the spherical line-element (20.1) with $t$ and $r$ in the latter relabelled 
$T$ and $R$. 
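
The result of the exercise can be spot-checked numerically before attempting the algebra. The sketch below takes arbitrary illustrative values of the conformal Killing vector components, the energy per unit mass, and the expansion rate, builds the metric components of the line-element in $T$ and $R$ with lapse $R \xi^0/(vT)$, and transforms them back to the conformal coordinates $t$ and $r$ using $T = e^{vt}$ and $R = e^{vt+r}$. The velocity $\beta_0$ is fixed by the definition of $v$ as the scalar product of the two tetrad vectors.

```python
import math


def metric_conformal(R, xi0, xi1, beta1):
    """(g_tt, g_tr, g_rr) read off from the self-similar line-element (20.132)."""
    return (R**2 * (xi1**2 - xi0**2), R**2 * xi1 / beta1, R**2 / beta1**2)


def metric_physical(R, T, xi0, xi1, beta1, v):
    """(g_TT, g_TR, g_RR) read off from the line-element (20.139)."""
    beta0 = (v - xi1 * beta1) / xi0   # from v = xi^0 beta_0 + xi^1 beta_1
    alpha = R * xi0 / (v * T)         # lapse, eq. (20.140)
    return (-alpha**2 + (beta0 * alpha / beta1) ** 2,
            -beta0 * alpha / beta1**2,
            1.0 / beta1**2)


def pull_back(components, T, R, v):
    """Transform (g_TT, g_TR, g_RR) to (g_tt, g_tr, g_rr)."""
    g_TT, g_TR, g_RR = components
    dT_dt, dR_dt, dR_dr = v * T, v * R, R   # dT/dr vanishes
    g_tt = g_TT * dT_dt**2 + 2.0 * g_TR * dT_dt * dR_dt + g_RR * dR_dt**2
    g_tr = g_TR * dT_dt * dR_dr + g_RR * dR_dt * dR_dr
    g_rr = g_RR * dR_dr**2
    return (g_tt, g_tr, g_rr)


# Arbitrary illustrative values.
v, t, r = 0.1, 0.3, 0.2
xi0, xi1, beta1 = 1.5, 0.5, 0.8
T, R = math.exp(v * t), math.exp(v * t + r)

expected = metric_conformal(R, xi0, xi1, beta1)
obtained = pull_back(metric_physical(R, T, xi0, xi1, beta1, v), T, R, v)
```

Agreement of `expected` and `obtained` component by component is consistent with the two line-elements describing the same geometry; it is a numerical check at one point, not a substitute for the derivation the exercise asks for.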


20.18.4 Self-similar diagonal line-element 


The self-similar line-element (20.132) can be brought to diagonal form by a coordinate transformation to 
diagonal conformal coordinates $t_\times$, $r_\times$ (subscripted $\times$ for diagonal), 

$$ t \to t_\times = t + f(r) , \quad r \to r_\times = r - v f(r) , \tag{20.141} $$

which leaves unchanged the conformal factor $R$, equation (20.129). The resulting diagonal metric is (compare 
equation (20.19)) 

$$ ds^2 = R^2 \left( - \Delta \, dt_\times^2 + \frac{dr_\times^2}{1 - 2M/R + v^2/\Delta} + do^2 \right) . \tag{20.142} $$

The diagonal line-element (20.142) corresponds physically to the case where the tetrad frame is at rest in 
the similarity frame, $\xi^1 = 0$, as can be seen by comparing it to the line-element (20.132). The frame can be 
called the similarity frame. The form of the metric coefficients in the line-element (20.142) follows from 
the line-element (20.132) and the gauge-invariant scalars (20.135). 

The conformal Killing vector in the similarity frame is $\xi^m = \{\sqrt{\Delta}, 0, 0, 0\}$, and the 4-velocity of the 
similarity frame in its own frame is $u^m = \{1, 0, 0, 0\}$. Since both are tetrad 4-vectors, it follows that with 
respect to a general tetrad frame (20.132), 

$$ \xi^m = u^m \sqrt{\Delta} , \tag{20.143} $$

where $u^m$ is the 4-velocity of the similarity frame with respect to the general tetrad frame. This shows that 
the conformal Killing vector $\xi^m$ in a general tetrad frame is proportional to the 4-velocity of the similarity 
frame through the tetrad frame. In particular, the proper 3-velocity of the similarity frame through the 
tetrad frame is 

$$ \mbox{proper 3-velocity of similarity frame through tetrad frame} = \frac{\xi^1}{\xi^0} . \tag{20.144} $$

In the models considered in Chapter 21, fluids generically fall inward into the black hole. The velocity of the 
tetrad rest frame of an infalling fluid is negative relative to the similarity frame, so the velocity $\xi^1/\xi^0$ of the 
similarity frame through the tetrad frame is positive. 

In the rest frame of any fluid, the Killing vector $\xi^m$ remains finite and continuous across horizons, where 
$\Delta = 0$, whereas the related 4-velocity $u^m$, equation (20.143), diverges at horizons. The infall velocity hits the 
speed of light at the outer horizon, $\xi^1/\xi^0 = 1$, both $\xi^1$ and $\xi^0$ remaining positive there (while $u^m$ diverges). 
Inside the horizon, where $\Delta < 0$, the conformal Killing vector $\xi^m$ becomes spacelike, with positive $\xi^1$ exceeding $\xi^0$. In some 
models, the fluid later drops through an outgoing inner horizon, where $\xi^1/\xi^0 = -1$ with $\xi^1$ positive and $\xi^0$ 
negative. In general, $\xi^m$ is lightlike at horizons, 

$$ \frac{\xi^1}{\xi^0} = \pm 1 \quad \mbox{at a horizon} . \tag{20.145} $$


20.18.5 Ray-tracing line-element 


It proves useful to introduce a “ray-tracing” conformal radial coordinate $x$ related to the coordinate $r_\times$ of 
the diagonal line-element (20.142) by 

$$ dx \equiv \frac{\Delta \, dr_\times}{\left[ (1 - 2M/R) \Delta + v^2 \right]^{1/2}} . \tag{20.146} $$

In terms of the ray-tracing coordinate $x$, the diagonal metric (20.142) is 

$$ ds^2 = R^2 \left( - \Delta \, dt_\times^2 + \frac{dx^2}{\Delta} + do^2 \right) . \tag{20.147} $$

The line-element (20.147) defines the same similarity tetrad frame as (20.142). 


20.18.6 Geodesics 


Spherical symmetry and conformal time translation symmetry imply that geodesic motion in spherically 
symmetric self-similar spacetimes is described by a complete set of integrals of motion. 

The integral of motion associated with conformal time translation symmetry can be obtained from Lagrange’s
equations of motion,


\[ \frac{d}{d\tau} \frac{\partial L}{\partial u^t} = \frac{\partial L}{\partial t} , \tag{20.148} \]


with effective Lagrangian $L = \tfrac{1}{2} g_{\mu\nu} u^\mu u^\nu$ for a particle with coordinate 4-velocity $u^\mu$. The self-similar metric
depends on the conformal time $t$ only through the overall conformal factor, $g_{\mu\nu} \propto R^2$. The derivative of the
conformal factor is given by $\partial \ln R / \partial t = v$, equation (20.135b), so it follows that $\partial L / \partial t = 2 v L$. For a massive
particle, for which conservation of rest mass implies $g_{\mu\nu} u^\mu u^\nu = -1$ and hence $L = -\tfrac{1}{2}$, Lagrange’s equations (20.148) thus yield


\[ \frac{d u_t}{d\tau} = - v . \tag{20.149} \]


In the limit of zero accretion rate, $v \to 0$, equation (20.149) would integrate to give $u_t$ as a constant, the
energy per unit mass of the geodesic. But here there is conformal time translation symmetry in place of time
translation symmetry, and equation (20.149) integrates to


\[ u_t = - v \tau , \tag{20.150} \]


in which an arbitrary constant of integration has been absorbed into a shift in the zero point of the proper
time $\tau$. Although the above derivation was for a massive particle, it holds also for a massless particle, with the
understanding that the proper time $\tau$ is constant along a null geodesic. The quantity $u_t$ in equation (20.150)
is the covariant time component of the coordinate-frame 4-velocity $u^\mu$ of the particle; it is related to the
covariant components $u_m$ of the tetrad-frame 4-velocity of the particle by


\[ u_t = e^m{}_t \, u_m = R \, \xi^m u_m . \tag{20.151} \]


Without loss of generality, geodesic motion can be taken to lie in the equatorial plane $\theta = \pi/2$ of the
spherical spacetime. The integrals of motion associated with conformal time translation symmetry, rotational
symmetry about the polar axis, and conservation of rest mass, are, for a massive particle,


\[ u_t = - v \tau , \quad u_\phi = L , \quad u_\mu u^\mu = -1 , \tag{20.152} \]


where $L$ is the orbital angular momentum per unit rest mass of the particle. The coordinate 4-velocity
$u^\mu = dx^\mu / d\tau$ that follows from equations (20.152) takes its simplest form in the conformal coordinates
$\{t_s, x, \theta, \phi\}$ of the ray-tracing metric (20.147),


\[ u^{t_s} = \frac{v \tau}{R^2 \Delta} , \quad u^x = \pm \frac{1}{R^2} \left[ v^2 \tau^2 - (R^2 + L^2) \Delta \right]^{1/2} , \quad u^\phi = \frac{L}{R^2} . \tag{20.153} \]
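A quick numerical check of the components (20.153) is to confirm that they satisfy the rest-mass condition $u_\mu u^\mu = -1$. The sketch below assumes the ray-tracing metric in the form $ds^2 = R^2 (-\Delta \, dt_s^2 + dx^2/\Delta + d\theta^2 + \sin^2\!\theta \, d\phi^2)$ evaluated at $\theta = \pi/2$; the function names and the sample numbers are illustrative only.

```python
import math

def massive_four_velocity(v_tau, L, R, Delta, sign=+1):
    """Contravariant components (u^t, u^x, u^phi) as read from equation (20.153).

    v_tau is the product v*tau, L the angular momentum per unit mass,
    R the conformal factor and Delta the horizon function at the point.
    """
    radicand = v_tau**2 - (R**2 + L**2) * Delta
    if radicand < 0:
        raise ValueError("no real u^x here: the orbit does not reach this point")
    u_t = v_tau / (R**2 * Delta)
    u_x = sign * math.sqrt(radicand) / R**2
    u_phi = L / R**2
    return u_t, u_x, u_phi

def norm_squared(u, R, Delta):
    """g_{mu nu} u^mu u^nu for the assumed metric at theta = pi/2."""
    u_t, u_x, u_phi = u
    return R**2 * (-Delta * u_t**2 + u_x**2 / Delta + u_phi**2)

# Outside a horizon (Delta > 0) and inside one (Delta < 0).
for Delta in (0.4, -0.5):
    u = massive_four_velocity(v_tau=3.0, L=1.5, R=2.0, Delta=Delta)
    print(Delta, norm_squared(u, R=2.0, Delta=Delta))
```

The normalization holds on either side of a horizon, since the sign change of $\Delta$ swaps the roles of the $t_s$ and $x$ terms.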


20.18.7 Null geodesics 


The important case of a massless particle follows from taking the limit of a massive particle with infinite
energy and angular momentum, $v\tau \to \infty$ and $L \to \infty$ (note that $\tau$ is constant along a null geodesic, and
$v\tau$ can be treated as constant in the limit of a massive particle of infinite energy). To obtain finite results,
define an affine parameter $\lambda$ by $d\lambda \equiv v\tau \, d\tau$, and a 4-velocity in terms of it by $v^\mu \equiv dx^\mu / d\lambda$. The integrals of
motion (20.152) then become, for a null geodesic,


\[ v_t = -1 , \quad v_\phi = J , \quad v_\mu v^\mu = 0 , \tag{20.154} \]


where $J \equiv L/(v\tau)$ is the (dimensionless) conformal angular momentum of the particle. The 4-velocity $v^\mu$
along the null geodesic is then, in terms of the coordinates of the ray-tracing metric (20.147),


\[ v^{t_s} = \frac{1}{R^2 \Delta} , \quad v^x = \pm \frac{1}{R^2} \left( 1 - J^2 \Delta \right)^{1/2} , \quad v^\phi = \frac{J}{R^2} . \tag{20.155} \]

Equations (20.155) yield the shape of a null geodesic by quadrature,

\[ \phi = \int \frac{J \, dx}{\left( 1 - J^2 \Delta \right)^{1/2}} . \tag{20.156} \]


Equation (20.156) shows that the shape of null geodesics in spherically symmetric self-similar spacetimes
hinges on the behaviour of the dimensionless horizon function $\Delta(x)$ as a function of the dimensionless
ray-tracing variable $x$. Null geodesics go through periapsis or apoapsis in the self-similar frame where the
denominator of the integrand of (20.156) is zero, corresponding to $v^x = 0$.

In the Reissner-Nordström geometry there is a radius, the photon sphere, where photons can orbit in circles
for ever. In non-stationary self-similar solutions there is no conformal radius where photons can orbit for ever
(to remain at fixed conformal radius $r$, the photon angular momentum would have to increase in proportion
to the conformal factor $R$). There is however a separatrix between null geodesics that do or do not fall
into the black hole, and the conformal radius where this occurs can be called the photon sphere equivalent.
The photon sphere equivalent occurs where the denominator of the integrand of equation (20.156) not only
vanishes, $v^x = 0$, but is an extremum, which happens where the horizon function $\Delta$ is an extremum,


\[ \frac{d\Delta}{dx} = 0 \quad \text{at photon sphere equivalent} . \tag{20.157} \]
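The quadrature (20.156) is straightforward to evaluate numerically once a horizon function is given. The following is a minimal sketch using Simpson's rule; the function name `null_geodesic_angle` and the toy profile `toy_delta` are assumptions made purely for illustration, and the toy profile is not a solution of the Einstein equations.

```python
import math

def null_geodesic_angle(J, delta, x0, x1, n=1000):
    """Simpson's-rule estimate of phi = integral of J dx / sqrt(1 - J^2 Delta(x))."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = (x1 - x0) / n

    def integrand(x):
        radicand = 1.0 - J**2 * delta(x)
        if radicand <= 0.0:
            # v^x = 0: a periapsis or apoapsis lies inside the range
            raise ValueError("turning point inside the integration range")
        return J / math.sqrt(radicand)

    total = integrand(x0) + integrand(x1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(x0 + i * h)
    return total * h / 3.0

def toy_delta(x):
    """Hypothetical smooth horizon function, used only to exercise the integrator."""
    return 0.3 * math.tanh(x)

print(null_geodesic_angle(J=1.2, delta=toy_delta, x0=-2.0, x1=2.0))
```

The integrator deliberately stops at a turning point rather than integrating through it, because the integrand has an inverse-square-root singularity there and the two branches of the orbit must be joined by hand.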
20.18.8 Dimensional analysis


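The spatial conformal coordinates $\{r, \theta, \phi\}$ are by definition dimensionless. The tetrad metric $\gamma_{mn}$ is
dimensionless, while the coordinate metric $g_{\mu\nu}$ scales as $R^2$,


\[ \gamma_{mn} \propto R^0 , \quad g_{\mu\nu} \propto R^2 . \tag{20.158} \]

The vierbein $e^m{}_\mu$ and inverse vierbein $e_m{}^\mu$, equations (20.133), scale as

\[ e^m{}_\mu \propto R , \quad e_m{}^\mu \propto R^{-1} . \tag{20.159} \]

The tetrad connections $\Gamma_{kmn}$ and the tetrad-frame Riemann tensor $R_{klmn}$ scale as


\[ \Gamma_{kmn} \propto R^{-1} , \quad R_{klmn} \propto R^{-2} . \tag{20.160} \]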


20.18.9 Variety of self-similar solutions 


Self-similar solutions exist provided that the properties of the energy-momentum introduce no additional
dimensional parameters. Dimensional analysis shows that the proper density $\rho$ and radial and transverse
pressure $p$ and $p_\perp$ of any species must scale with conformal factor $R$ as


\[ \rho \propto p \propto p_\perp \propto R^{-2} . \tag{20.161} \]


The pressure-to-density ratio $w \equiv p/\rho$ of any species is dimensionless, and since the ratio can depend only
on the nature of the species itself, not for example on where it happens to be located in the spacetime, it
follows that the ratio $w$ must be a constant. It is legitimate for the pressure-to-density ratio to be different in
the radial and transverse directions (as it is for a radial electric field), but otherwise self-similarity requires
that


\[ w \equiv p/\rho , \quad w_\perp \equiv p_\perp/\rho , \tag{20.162} \]


be constants for each species. For example, $w = 1$ for an ultrahard fluid (which can mimic the behaviour of
a massless scalar field (Babichev et al., 2008)), $w = 1/3$ for a relativistic fluid, $w = 0$ for pressureless cold
dark matter, $w = -1$ for vacuum energy, and $w = -1$ with $w_\perp = 1$ for a radial electric field.

Self-similarity allows that the energy-momentum may consist of several distinct components, such as a
relativistic fluid, plus dark matter, plus an electric field. The components may interact with each other provided
that the properties of the interaction introduce no additional dimensional parameters. Dimensional analysis
shows that the flux $F^m$ of energy and momentum transferred between any two species, equation (20.57),
must scale as


\[ F^m \propto R^{-3} . \tag{20.163} \]


20.18.10 Electrical conductivity 


The principal reason to consider charged black holes is that stationary charged black holes have inner horizons 
like rotating black holes, and it is easier to model spherical charged black holes than rotating black holes. 
The big question, explored using spherical charged black holes in the next Chapter 21, is what happens near
their inner horizons? In exploring this question one should bear in mind that charge is really a surrogate for 
rotation. 

In self-similar models, a charged black hole acquires its electrical charge from accretion of charged fluid. A 
charged fluid will experience a Lorentz force from the electric field, and will therefore exchange momentum 
with the electric field. If the fluid is non-conducting, then there is no dissipation, and the interaction between 
the charged fluid and electric field automatically introduces no additional dimensional parameters. However, 
if the charged fluid is electrically conducting, then the electrical conductivity of the fluid could potentially 
introduce an additional dimensional parameter, and this must not be allowed if self-similarity is to be 
maintained. Dimensional analysis shows that the electric charge density $q \equiv j^0$, the radial electric current
$j \equiv j^1$, and the radial electric field $E \equiv Q/R^2$ scale as


\[ q \propto j \propto R^{-2} , \quad E \propto R^{-1} , \tag{20.164} \]


consistent with the requirement that the flux of energy and momentum on the right hand sides of
equations (20.70) scale as $F^m \propto R^{-3}$. In diffusive electrical conduction in a fluid of conductivity $\sigma$, an electric
field $E$ gives rise to a current in the fluid rest frame,


\[ j = \sigma E , \tag{20.165} \]


which is just Ohm’s law. Dimensional analysis then requires that the conductivity must scale as $\sigma \propto R^{-1}$.
The conductivity can depend only on the intrinsic properties of the conducting fluid, and the only intrinsic
property available is its density, which scales as $\rho \propto R^{-2}$. It follows that the conductivity must be proportional
to the square root of the density $\rho$ of the conducting fluid,


\[ \sigma = \kappa \rho^{1/2} , \tag{20.166} \]


where $\kappa$ is a dimensionless conductivity constant. The form (20.166) is required by self-similarity, and is
not necessarily realistic (although it is realistic that the conductivity increases with density). However, the 
conductivity (20.166) is adequate for the purpose of exploring the consequences of dissipation in simple 
models of black holes. 
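Written out as a chain of scalings, the argument leading to (20.166) uses only Ohm's law (20.165) together with the scalings (20.161) and (20.164):

```latex
\[
  \sigma = \frac{j}{E} \propto \frac{R^{-2}}{R^{-1}} = R^{-1} ,
  \qquad
  \rho \propto R^{-2}
  \;\Longrightarrow\;
  \rho^{1/2} \propto R^{-1}
  \;\Longrightarrow\;
  \frac{\sigma}{\rho^{1/2}} \propto R^{0} .
\]
```

The ratio $\sigma/\rho^{1/2}$ is therefore dimensionless and can be a constant, which is the coefficient called $\kappa$ in equation (20.166).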

A realistic value of the electrical conductivity of a baryonic plasma at a relativistic temperature $T$ is
(Arnold, Moore, and Yaffe, 2000)

\[ \sigma = \frac{C}{e^2 \ln (e^{-1})} \, \frac{kT}{\hbar} , \tag{20.167} \]

where $e$ is the dimensionless charge of the electron, the square root of the fine-structure constant, and the
factor $C \approx 15$ depends on the mix of particle species. This electrical conductivity is huge. A dimensionless
measure of the conductivity (which has units 1/time) is the conductivity $\sigma$ times the characteristic timescale
$t_{\rm BH} \equiv GM/c^3$ of the black hole, which is of order


\[ \sigma \, t_{\rm BH} \sim \frac{T}{T_{\rm BH}} , \tag{20.168} \]


where $k T_{\rm BH} \equiv \hbar / t_{\rm BH}$ is the characteristic temperature of the black hole (for a Schwarzschild black hole,
this characteristic temperature $T_{\rm BH}$ is $8\pi$ times the Hawking temperature). In the astronomical situation
considered here the temperature $T$ of the plasma is huge compared to the characteristic temperature $T_{\rm BH}$ of
the black hole. Indeed if this were not so, then mass loss by Hawking radiation would tend to compete with
mass gain by accretion, an entirely different situation from the one envisaged here.
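To get a feel for the size of the ratio in (20.168), the sketch below evaluates $t_{\rm BH} = GM/c^3$ and $T_{\rm BH} = \hbar/(k \, t_{\rm BH})$ in SI units for a black hole of one solar mass. The plasma temperature of $10^{10}$ K is an arbitrary illustrative choice, not a value taken from the text, and the physical constants are rounded.

```python
import math

# Rounded SI values of physical constants.
G = 6.674e-11        # m^3 kg^-1 s^-2
C_LIGHT = 2.998e8    # m s^-1
HBAR = 1.0546e-34    # J s
K_B = 1.3807e-23     # J K^-1
M_SUN = 1.989e30     # kg

def black_hole_timescale(mass_kg):
    """Characteristic timescale t_BH = G M / c^3, in seconds."""
    return G * mass_kg / C_LIGHT**3

def characteristic_temperature(mass_kg):
    """Characteristic temperature T_BH = hbar / (k t_BH), in kelvin."""
    return HBAR / (K_B * black_hole_timescale(mass_kg))

t_bh = black_hole_timescale(M_SUN)
T_bh = characteristic_temperature(M_SUN)
T_plasma = 1.0e10  # kelvin; hypothetical plasma temperature for illustration

print("t_BH [s]        =", t_bh)
print("T_BH [K]        =", T_bh)
print("T_Hawking [K]   =", T_bh / (8.0 * math.pi))
print("T / T_BH        =", T_plasma / T_bh)
```

For stellar-mass and larger black holes the ratio $T/T_{\rm BH}$ comes out many orders of magnitude above unity, which is the sense in which the realistic conductivity is huge.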

Charge is being envisaged here as a surrogate for rotation, and electrical conduction should be interpreted 
as a substitute for angular momentum transport. Angular momentum transport is a much weaker process 
than electrical conduction (if angular momentum transport were as strong as electrical conduction, then 
accretion disks would shed angular momentum as quickly as they shed charge, and accretion disks would not 
rotate). In the next Chapter 21, the conductivity is treated as a phenomenological free parameter, greatly 
suppressed compared to any realistic conductivity, but nevertheless possibly consistent with what might be 
a reasonable rate for the analogous angular momentum transport in a rotating black hole. 


20.18.11 Tetrad connections 


The expressions for the tetrad connections for the self-similar spacetime (20.132) are the same as those (20.23)
for a general spherically symmetric spacetime. Expressions (20.24) and (20.25) for the proper radial
acceleration $h_0$ and the radial Hubble parameter $h_1$ translate in the self-similar spacetime to


\[ h_0 \equiv \partial_1 \ln (R \xi^0) , \quad h_1 \equiv \partial_0 \ln (R \xi^1) . \tag{20.169} \]


Comparing equations (20.169) to equations (20.24) and (20.29) shows that the lapse $\alpha$ and scale factor $\lambda$
translate in the self-similar spacetime to


\[ \alpha \to R \xi^0 , \quad \lambda \to R \xi^1 . \tag{20.170} \]


20.18.12 Spherical equations carry over to the self-similar case 


The tetrad-frame Riemann, Weyl, and Einstein tensors in the self-similar spacetime take the same form as 
in the general spherical case, equations (20.30)—(20.35). 

Likewise, the equations for the interior mass in §20.9, for energy-momentum conservation in §20.10, for
the first law in §20.10.1, and the various equations for the electromagnetic field in §20.13, all carry through
unchanged.


20.18.13 From partial to ordinary differential equations 


The central simplifying feature of self-similar solutions is that they turn a system of partial differential 
equations into a system of ordinary differential equations. 

By definition, a dimensionless quantity $A(r)$ is independent of conformal time $t$. It follows that the partial
derivative of any dimensionless quantity $A(r)$ with respect to conformal time $t$ vanishes,


\[ 0 = \frac{1}{R} \frac{\partial A(r)}{\partial t} = \xi^m \partial_m A(r) = \left( \xi^0 \partial_0 + \xi^1 \partial_1 \right) A(r) . \tag{20.171} \]

Consequently the directed radial derivative $\partial_1 A$ of a dimensionless quantity $A(r)$ is related to its directed
time derivative $\partial_0 A$ by


\[ \partial_1 A(r) = - \frac{\xi^0}{\xi^1} \, \partial_0 A(r) . \tag{20.172} \]


Equation (20.172) allows radial derivatives to be converted to time derivatives.


20.18.14 Integration variable 


It is desirable to choose an integration variable that varies monotonically. A natural choice is the proper
time $\tau$ in some tetrad frame, since this is guaranteed to increase monotonically. The 4-velocity at rest in
the tetrad frame is by definition $u^m = \{1, 0, 0, 0\}$, so the proper time derivative is related to the directed
conformal time derivative in the tetrad frame by $d/d\tau = u^m \partial_m = \partial_0$.

However, there is another choice of integration variable, the ray-tracing variable $x$ defined by
equation (20.146), that is not specifically tied to any tetrad frame, and that has a desirable (tetrad and coordinate)
gauge-invariant meaning. The proper time derivative of any dimensionless function $A(r)$ in the tetrad frame
is related to its derivative $dA/dx$ with respect to the ray-tracing variable $x$ by


\[ \partial_0 A = u^m \partial_m A = (u^1 \partial_1)_{\rm sim} A = - \frac{\xi^1}{R} \frac{dA}{dx} . \tag{20.173} \]


In the third expression, $(u^1 \partial_1)_{\rm sim} A$ is $u^m \partial_m A$ expressed in the similarity frame (20.147), where the
directed time and radial derivatives are $(\partial_0)_{\rm sim} = (1/(R \sqrt{\Delta})) \, \partial/\partial t_s$ and $(\partial_1)_{\rm sim} = (\sqrt{\Delta}/R) \, \partial/\partial x$. The partial
time derivative $\partial/\partial t_s |_x = \partial/\partial t |_r$ vanishes acting on any dimensionless quantity $A(r)$. The last expression
of (20.173) comes from $u^1_{\rm sim} = - \xi^1 / \sqrt{\Delta}$ in view of equation (20.143), the minus sign coming from the fact
that $u^1_{\rm sim}$ is tetrad relative to similarity frame, while $u^1$ in equation (20.143) is similarity relative to tetrad
frame.

In summary, the chosen integration variable is the dimensionless ray-tracing variable $-x$ (with a minus
because $-x$ increases monotonically with proper time), the derivative with respect to which, acting on any
dimensionless function, is related to the proper time derivative $\partial_0$ in any tetrad frame by


\[ - \frac{d}{dx} = \frac{R}{\xi^1} \, \partial_0 . \tag{20.174} \]


Equation (20.174) involves $\xi^1$, which is proportional to the proper velocity of the tetrad frame through the
similarity frame, equation (20.145), and which therefore, being initially positive, must always remain positive
in any tetrad frame attached to a fluid, as long as the fluid does not turn back on itself, as must be true for
the self-similar solution to be consistent.


20.18.15 Integrals of motion 


As remarked above, equation (20.171), in self-similar solutions $\xi^m \partial_m A(r) = 0$ holds for any dimensionless
function $A(r)$. If both the directed derivatives $\partial_0 A(r)$ and $\partial_1 A(r)$ are known from the Einstein equations or
elsewhere, then the result will be an integral of motion.
The spherically symmetric, self-similar Einstein equations admit two integrals of motion,


\[ 0 = R \xi^m \partial_m \beta_0 = R \beta_1 ( \xi^0 h_0 + \xi^1 h_1 ) - \xi^0 \left( \frac{M}{R} + 4\pi R^2 p \right) + \xi^1 4\pi R^2 f , \tag{20.175a} \]


\[ 0 = R \xi^m \partial_m \beta_1 = R \beta_0 ( \xi^0 h_0 + \xi^1 h_1 ) + \xi^1 \left( \frac{M}{R} - 4\pi R^2 \rho \right) + \xi^0 4\pi R^2 f . \tag{20.175b} \]


Taking $\xi^1$ times (20.175a) plus $\xi^0$ times (20.175b), and then $\beta_0$ times (20.175a) minus $\beta_1$ times (20.175b),
gives


\[ 0 = v R ( \xi^0 h_0 + \xi^1 h_1 ) - 4\pi R^2 \left[ \xi^0 \xi^1 ( \rho + p ) - \left( (\xi^0)^2 + (\xi^1)^2 \right) f \right] , \tag{20.176a} \]
\[ 0 = R \xi^m \partial_m \frac{M}{R} = - v \frac{M}{R} + 4\pi R^2 \left[ \beta_1 \xi^1 \rho - \beta_0 \xi^0 p + ( \beta_0 \xi^1 - \beta_1 \xi^0 ) f \right] . \tag{20.176b} \]


The quantities in square brackets on the right hand sides of equations (20.176) are scalars for each species $x$,
so equations (20.176) can also be written


\[ v R ( \xi^0 h_0 + \xi^1 h_1 ) = 4\pi R^2 \sum_{\text{species } x} \xi_x^0 \xi_x^1 ( \rho_x + p_x ) , \tag{20.177a} \]
\[ v \frac{M}{R} = 4\pi R^2 \sum_{\text{species } x} \left( \beta_{x,1} \xi_x^1 \rho_x - \beta_{x,0} \xi_x^0 p_x \right) , \tag{20.177b} \]


where the sum is over all species $x$, and $\beta_{x,m}$ and $\xi_x^m$ are the 4-vectors $\beta_m$ and $\xi^m$ expressed in the rest
frame of species $x$. Equations (20.177) are scalar equations, valid in any frame of reference.
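The algebra that takes the pair (20.175) to the pair (20.176) can be checked numerically by evaluating both sides at random values. The sketch below treats every quantity as an independent number, uses $v = \xi^0 \beta_0 + \xi^1 \beta_1$, and abbreviates $4\pi R^2 p$, $4\pi R^2 \rho$, $4\pi R^2 f$ as `P`, `D`, `F`; these abbreviations and the function names are conveniences introduced here.

```python
import random

def integrals_175(xi0, xi1, b0, b1, Rh0, Rh1, MR, P, D, F):
    """Right-hand sides of (20.175a) and (20.175b).

    MR = M/R; Rh0, Rh1 = R*h_0, R*h_1; P, D, F = 4*pi*R^2 times p, rho, f.
    """
    accel = xi0 * Rh0 + xi1 * Rh1
    eq_a = b1 * accel - xi0 * (MR + P) + xi1 * F
    eq_b = b0 * accel + xi1 * (MR - D) + xi0 * F
    return eq_a, eq_b

def integrals_176(xi0, xi1, b0, b1, Rh0, Rh1, MR, P, D, F):
    """Right-hand sides of (20.176a) and (20.176b), with v = xi^m beta_m."""
    v = xi0 * b0 + xi1 * b1
    accel = xi0 * Rh0 + xi1 * Rh1
    eq_a = v * accel - (xi0 * xi1 * (D + P) - (xi0**2 + xi1**2) * F)
    eq_b = -v * MR + (b1 * xi1 * D - b0 * xi0 * P + (b0 * xi1 - b1 * xi0) * F)
    return eq_a, eq_b

def max_mismatch(trials=100, seed=0):
    """Largest discrepancy between the stated combinations of (20.175) and (20.176)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        args = [rng.uniform(-2.0, 2.0) for _ in range(10)]
        xi0, xi1, b0, b1 = args[:4]
        a175, b175 = integrals_175(*args)
        a176, b176 = integrals_176(*args)
        worst = max(worst,
                    abs(xi1 * a175 + xi0 * b175 - a176),
                    abs(b0 * a175 - b1 * b175 - b176))
    return worst

print(max_mismatch())
```

A mismatch at the level of floating-point rounding confirms that the two linear combinations quoted in the text reproduce (20.176) term by term.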
For any fluid with equation of state $p/\rho = w = \text{constant}$, a further integral comes from considering


\[ 0 = R \xi^m \partial_m (R^2 p) = R \left[ w \, \xi^0 \partial_0 (R^2 \rho) + \xi^1 \partial_1 (R^2 p) \right] , \tag{20.178} \]


and simplifying using the energy conservation equation for $\partial_0 \rho$ and the momentum conservation equation
for $\partial_1 p$.

In the particular case of the electromagnetic field, equation (20.178) reduces to


\[ 0 = R \xi^m \partial_m \frac{Q}{R} = - v \frac{Q}{R} + 4\pi R^2 \left( \xi^1 q - \xi^0 j \right) , \tag{20.179} \]


which is valid in any radial tetrad frame.
The energy-momentum conservation equations (20.55) with fluxes (20.57) are


\[ \partial_0 \rho + \frac{2 \beta_0}{R} ( \rho + p_\perp ) + h_1 ( \rho + p ) = F^0 , \tag{20.180a} \]
\[ \partial_1 p + \frac{2 \beta_1}{R} ( p - p_\perp ) + h_0 ( \rho + p ) = F^1 . \tag{20.180b} \]


If a species is charged, then the energy flux into the charged species from the electromagnetic field is,
equations (20.70),


\[ F^0 = j E , \quad F^1 = q E . \tag{20.181} \]
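There may be other contributions to the energy-momentum fluxes $F^m$ if the species exchanges
energy-momentum with another species, for example through collisions. Inserting equations (20.180) into
equation (20.178) yields, in the centre-of-mass frame of a species,


\[ (1 + w) R \left( \xi^1 h_0 + w \xi^0 h_1 \right) - 2 w_\perp \left( \xi^1 \beta_1 - w \xi^0 \beta_0 \right) = \frac{R}{\rho} \left( \xi^1 F^1 + w \xi^0 F^0 \right) . \tag{20.182} \]


Eliminating $h_1$ by means of equation (20.177a), equation (20.182) rearranges to


\[ R h_0 = \frac{ 2 w_\perp \xi^1 \left( \xi^1 \beta_1 - w \xi^0 \beta_0 \right) - w (1 + w) \xi^0 ( \epsilon / v ) + (R/\rho) \, \xi^1 \left( \xi^1 F^1 + w \xi^0 F^0 \right) }{ (1 + w) \left[ (\xi^1)^2 - w (\xi^0)^2 \right] } , \tag{20.183} \]


where


\[ \epsilon \equiv 4\pi R^2 \sum_{\text{species } x} \xi_x^0 \xi_x^1 (1 + w_x) \rho_x \tag{20.184} \]


summed over all species $x$ (including the one under consideration), where $\xi_x^m$ is in the rest frame of species $x$.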


20.18.16 Entropy 


Substituting the self-similar expression (20.170) for the scale factor $\lambda$ into the energy conservation
equation (20.59) for a species in its own centre-of-mass frame gives


\[ \partial_0 \ln \left[ \rho \, R^{2 (1 + w_\perp)} (R \xi^1)^{1 + w} \right] = \frac{F^0}{\rho} . \tag{20.185} \]


For a fluid with isotropic equation of state $w = w_\perp$, equation (20.185) becomes


\[ \partial_0 \ln S = \frac{F^0}{(1 + w) \rho} , \tag{20.186} \]


where $S$ is (up to an arbitrary constant) the entropy of a comoving volume element $V \propto R^3 \xi^1$ of the fluid,


\[ S \equiv R^3 \xi^1 \rho^{1/(1 + w)} . \tag{20.187} \]
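The step from (20.185) to (20.186) is a matter of collecting powers. For an isotropic equation of state, $w_\perp = w$, the argument of the logarithm in (20.185) is a pure power of the entropy (20.187):

```latex
\[
  \rho \, R^{2(1+w)} \, (R \xi^1)^{1+w}
  = \left[ \rho^{1/(1+w)} \, R^{3} \, \xi^1 \right]^{1+w}
  = S^{\,1+w} ,
  \qquad\text{so}\qquad
  (1 + w) \, \partial_0 \ln S = \frac{F^0}{\rho} .
\]
```

In particular, when no energy is exchanged with other species, $F^0 = 0$, the entropy $S$ of the comoving volume element is conserved along the fluid worldline.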


20.18.17 Summary of equations for accreting, self-similar, spherical, charged black holes


This section summarizes the equations used in Chapter 21 to compute the evolution of self-similar, spherical,
charged black holes accreting a variety of fluids. For brevity, the index $x$ labelling a fluid species is omitted.
Equations (20.190)-(20.195) and (20.199) are valid in any tetrad frame governed by the self-similar
line-element (20.132). Equations (20.188), (20.189), and (20.196)-(20.198) hold in the rest frame of the fluid
in question, the frame where the energy flux $f$ of the fluid is zero. For equations holding in the fluid rest
frame, the quantities $\xi^m$, $\beta_m$, and $h_m$ should be interpreted as evaluated in the fluid rest frame. Some
quantities, notably $v$, $M/R$, $Q/R$, and $\Delta$ are (dimensionless) scalars, taking the same value in any tetrad
frame. Equations (20.191)-(20.199) are dimensionless, factors of $R$ appearing so as to make them so; for
example $R h_m$, $R^2 \rho$, $R \sigma$ are dimensionless.
Self-similarity requires that each fluid have an equation of state with constant $w$ and $w_\perp$, equations (20.162),


\[ w \equiv p/\rho , \quad w_\perp \equiv p_\perp/\rho . \tag{20.188} \]


If the fluid is charged, then self-similarity requires that its conductivity $\sigma$ be proportional to the square root
of the proper energy density, equation (20.166),


\[ \sigma = \kappa \rho^{1/2} , \tag{20.189} \]


with constant dimensionless conductivity coefficient $\kappa$.
The proper time $\tau$ in any tetrad frame evolves as


\[ - \frac{d\tau}{dx} = \frac{R}{\xi^1} , \tag{20.190} \]


which follows from $d\tau/d\tau = \partial_0 \tau = 1$ and equation (20.174). The circumferential radius $R$ in any tetrad frame
evolves as

\[ - \frac{d \ln R}{dx} = \frac{\beta_0}{\xi^1} , \tag{20.191} \]

which follows from $dR/d\tau = \partial_0 R = \beta_0$ and equation (20.190).
The defining equations (20.169) for the proper acceleration $h_0$ and Hubble parameter $h_1$ yield equations
for the evolution of the time and radial components of the conformal Killing vector $\xi^m$ in any tetrad frame,


\[ - \frac{d\xi^0}{dx} = \beta_1 - R h_0 , \tag{20.192a} \]

\[ - \frac{d\xi^1}{dx} = - \beta_0 + R h_1 . \tag{20.192b} \]


In the evolution equation (20.192a) for $\xi^0$, equation (20.172) has been used to convert the conformal radial
derivative $\partial_1$ to the conformal time derivative $\partial_0$, and thence to $-d/dx$ by equation (20.174).

The Einstein equations (20.38) applied to the two expressions (20.35c) for $G^{01}$ yield evolution equations
for the time and radial components of the vierbein coefficients $\beta_m$ in any tetrad frame,


\[ - \frac{d\beta_0}{dx} = - \frac{1}{\xi^0} \left( \beta_1 R h_1 + 4\pi R^2 T^{01} \right) , \tag{20.193a} \]
\[ - \frac{d\beta_1}{dx} = \frac{1}{\xi^1} \left( \beta_0 R h_0 + 4\pi R^2 T^{01} \right) . \tag{20.193b} \]


Again, in the evolution equation (20.193a) for $\beta_0$, equation (20.172) has been used to convert the conformal
radial derivative $\partial_1$ to the conformal time derivative $\partial_0$. The energy flux $T^{01}$ in equations (20.193) is the
total energy flux summed over all species. The 4 evolution equations (20.192) and (20.193) for $\xi^m$ and $\beta_m$
are not independent: they are related by $\xi^m \beta_m = v$, a constant, equation (20.135b). To maintain numerical
precision, it is important to avoid expressing small quantities as differences of large quantities. In practice,
a suitable choice of variables to integrate proves to be $\xi^0 + \xi^1$, $\beta_0 - \beta_1$, and $\beta_1$, each of which can be tiny
in some circumstances. Starting from these variables, the following equations yield $\xi^0 - \xi^1$, along with the
interior mass $M$ and the horizon function $\Delta$, equations (20.135a) and (20.135c), in a fashion that ensures
numerical stability:


\[ \xi^0 - \xi^1 = \frac{2 v - (\xi^0 + \xi^1)(\beta_0 + \beta_1)}{\beta_0 - \beta_1} , \tag{20.194a} \]
\[ \frac{2M}{R} = 1 + (\beta_0 + \beta_1)(\beta_0 - \beta_1) , \tag{20.194b} \]
\[ \Delta = (\xi^0 + \xi^1)(\xi^0 - \xi^1) . \tag{20.194c} \]


Equation (20.194b) is numerically preferable to equation (20.177b), which can suffer loss of precision from
cancellation of large quantities; equation (20.177b) can be used as a check.
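The relations (20.194) translate directly into code. The following is a minimal sketch, assuming the integrated variables are stored as `xi_sum` for $\xi^0 + \xi^1$, `beta_diff` for $\beta_0 - \beta_1$, and `beta1` for $\beta_1$; these names, and the function `derived_quantities`, are hypothetical.

```python
def derived_quantities(xi_sum, beta_diff, beta1, v):
    """Recover xi^0 - xi^1, 2M/R and Delta from the integrated variables.

    xi_sum    = xi^0 + xi^1
    beta_diff = beta_0 - beta_1
    beta1     = beta_1
    v         = xi^m beta_m, the constant accretion rate
    Products of sums and differences are used throughout, so that no
    small quantity is formed by subtracting two squares.
    """
    beta_sum = beta_diff + 2.0 * beta1                      # beta_0 + beta_1
    xi_diff = (2.0 * v - xi_sum * beta_sum) / beta_diff     # (20.194a)
    two_m_over_r = 1.0 + beta_sum * beta_diff               # (20.194b)
    delta = xi_sum * xi_diff                                # (20.194c)
    return xi_diff, two_m_over_r, delta

# Example: xi^0 = 2.0, xi^1 = 0.5, beta_0 = -0.1, beta_1 = 0.9, so v = 0.25.
print(derived_quantities(xi_sum=2.5, beta_diff=-1.0, beta1=0.9, v=0.25))
```

The point of the design is that $\Delta$ and $1 - 2M/R$ are each obtained as a single product, which keeps their relative precision when one factor becomes tiny, as happens near horizons.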

The evolution equations (20.192) and (20.193) involve $h_0$ and $h_1$. The integrals of motion considered in
§20.18.15 yield explicit expressions for $h_0$ and $h_1$ not involving any derivatives. For the Hubble parameter
$h_1$, equation (20.177a) gives


\[ R h_1 = - \frac{\xi^0}{\xi^1} R h_0 + \frac{\epsilon}{v \, \xi^1} , \tag{20.195} \]


where $\epsilon$ is given by equation (20.184). For the proper acceleration $h_0$, a simple case is that of non-interacting
(collisionless), pressureless, neutral “dark matter,” for which the acceleration vanishes,


\[ h_0 = 0 \quad \text{dark matter} . \tag{20.196} \]


For a more general fluid, the integral of motion (20.183) yields an expression for $h_0$. If the fluid exchanges
energy-momentum only with the electromagnetic field, so that the fluxes $F^m$ are given by equations (20.181),
then the integral of motion (20.183), simplified using the integral of motion (20.179) for $Q$ and the
conductivity (20.189) in Ohm’s law (20.165), reduces to


\[ R h_0 = \frac{ \xi^1 \left\{ 8\pi w_\perp \left( \beta_1 \xi^1 - w \beta_0 \xi^0 \right) R^2 \rho + \left[ v + (1 + w) 4\pi R \sigma \xi^0 \right] Q^2/R^2 \right\} - w (1 + w) 4\pi R^2 \rho \, \xi^0 \epsilon / v }{ 4\pi R^2 \rho \, (1 + w) \left[ (\xi^1)^2 - w (\xi^0)^2 \right] } . \tag{20.197} \]


Finally, equations are needed governing the evolution of the energy densities $\rho$ of the fluids. If a fluid has
isotropic equation of state, $w = w_\perp$, then the energy conservation equation translates into a conservation
equation (20.186) for entropy (20.187). If the fluid exchanges energy-momentum only with the
electromagnetic field, so that the flux $F^0$ is given by equations (20.181), then the entropy conservation equation (20.186)
is


\[ - \frac{d \ln S}{dx} = \frac{\sigma Q^2}{\xi^1 R^3 (1 + w) \rho} . \tag{20.198} \]

The right hand side of equation (20.198) vanishes if the fluid is uncharged or non-conducting.
For the electromagnetic field, the energy conservation equation (20.70a) becomes

\[ - \frac{d \ln Q}{dx} = - \frac{4\pi R \sigma}{\xi^1} . \tag{20.199} \]

If there is more than one charged conducting fluid, then the right hand side of equation (20.199) should
be summed over the charged conducting fluids. Equation (20.199) says that (free) energy coming out of the
electromagnetic field is going into (heat) energy of dissipation of charged conducting fluids. Equation (20.199)
is numerically preferable to equation (20.179), which can suffer loss of precision from cancellation of large
quantities; equation (20.179) can be used as a check.
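One way to guard against sign errors when coding the evolution equations is to check that they preserve the constraint $\xi^m \beta_m = v$. The sketch below assumes the reading of (20.191)-(20.193) in which the derivatives with respect to $-x$ are $\beta_1 - R h_0$ for $\xi^0$, $-\beta_0 + R h_1$ for $\xi^1$, $-(\beta_1 R h_1 + 4\pi R^2 T^{01})/\xi^0$ for $\beta_0$, and $(\beta_0 R h_0 + 4\pi R^2 T^{01})/\xi^1$ for $\beta_1$. The function names are hypothetical, and $R h_0$, $R h_1$ and $4\pi R^2 T^{01}$ are passed in as plain numbers rather than computed from a fluid model.

```python
def evolution_rhs(xi0, xi1, b0, b1, Rh0, Rh1, T01):
    """Derivatives with respect to -x of (xi^0, xi^1, beta_0, beta_1, ln R).

    Rh0 and Rh1 denote R*h_0 and R*h_1; T01 denotes 4*pi*R^2*T^{01}.
    """
    d_xi0 = b1 - Rh0                    # (20.192a)
    d_xi1 = -b0 + Rh1                   # (20.192b)
    d_b0 = -(b1 * Rh1 + T01) / xi0      # (20.193a)
    d_b1 = (b0 * Rh0 + T01) / xi1       # (20.193b)
    d_ln_r = b0 / xi1                   # (20.191)
    return d_xi0, d_xi1, d_b0, d_b1, d_ln_r

def accretion_rate_drift(xi0, xi1, b0, b1, Rh0, Rh1, T01):
    """d(xi^0 beta_0 + xi^1 beta_1)/d(-x); zero if v is conserved."""
    d_xi0, d_xi1, d_b0, d_b1, _ = evolution_rhs(xi0, xi1, b0, b1, Rh0, Rh1, T01)
    return b0 * d_xi0 + xi0 * d_b0 + b1 * d_xi1 + xi1 * d_b1

print(accretion_rate_drift(2.0, 0.5, -0.1, 0.9, 0.3, -0.7, 0.05))
```

The drift vanishes identically for any values of $R h_0$, $R h_1$ and $T^{01}$, which is why only three of the four variables need to be integrated and the fourth can be recovered from (20.194a).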


20.19 Infinite thin planes 


The final problem considered in this Chapter is that of an infinite thin plane in vacuo, not because the 
problem is soluble, but rather because such a thing cannot exist in general relativity. 


20.19.1 Plane symmetric spacetimes 


The next section 20.19.2 considers the situation of a putative infinite thin wall. The assumed planar symmetry
of the wall implies that the line-element must take the form


\[ ds^2 = - \alpha^2 dt^2 + \frac{1}{b_1^2} \left( dz - \alpha b_0 \, dt \right)^2 + r^2 \left( dx^2 + x^2 d\phi^2 \right) , \tag{20.200} \]

in which the metric coefficients are functions of time $t$ and vertical position $z$. The planar line-element (20.200)
is similar but not identical to the spherical line-element (20.1). The radius $r(t, z)$ in the line-element (20.200)
is an arbitrary function of $t$ and $z$. The radius $r(t, z)$ can be thought of as a cylindrical cosmic scale factor,
and the coordinate $x$ as a comoving cylindrical coordinate. The coefficients $b_0(t, z)$ and $b_1(t, z)$ are likewise
arbitrary functions of $t$ and $z$; unlike the spherical case, they are not equal to $\beta_m \equiv \partial_m r$. Quantities $\beta_m$ are
defined to be directed derivatives of the radius $r$, the same as in the spherical line-element, equation (20.9),


\[ \beta_m \equiv \partial_m r = \left\{ \frac{1}{\alpha} \frac{\partial r}{\partial t} + b_0 \frac{\partial r}{\partial z} , \ b_1 \frac{\partial r}{\partial z} , \ 0 , \ 0 \right\} . \tag{20.201} \]


As in the spherical case, $\beta_m$ is a tetrad 4-vector, and its scalar product with itself is a scalar, which defines
the interior mass $M$,

\[ - \frac{2M}{r} \equiv \beta_m \beta^m . \tag{20.202} \]


The expression (20.202) for the mass $M$ interior to $z$ differs from the spherical case $2M/r = 1 + \beta_0^2 - \beta_1^2$,
equation (20.11), because the flat line-element $dx^2 + x^2 d\phi^2$ in (20.200) replaces the spherical line-element
$do^2 = d\theta^2 + \sin^2\!\theta \, d\phi^2$ in (20.1).
The tetrad connections are


\[ \Gamma_{100} = h_0 \equiv \partial_1 \ln \alpha = b_1 \frac{\partial \ln \alpha}{\partial z} , \tag{20.203a} \]
\[ \Gamma_{101} = h_1 \equiv b_0 \frac{\partial \ln (\alpha b_0)}{\partial z} - \partial_0 \ln b_1 , \tag{20.203b} \]
\[ \Gamma_{202} = \Gamma_{303} = \frac{\beta_0}{r} , \tag{20.203c} \]
\[ \Gamma_{212} = \Gamma_{313} = \frac{\beta_1}{r} , \tag{20.203d} \]
\[ \Gamma_{323} = \frac{1}{r x} , \tag{20.203e} \]


which differ from the spherical connections (20.23) in $h_0$, $h_1$, and $\Gamma_{323}$.

With the changes to the interior mass $M$ from equation (20.202), and to the connections $h_0$, $h_1$, and $\Gamma_{323}$
from equations (20.203), all the equations in §20.6 for the Riemann, Ricci, Einstein, and Weyl tensors in the
spherical case hold unchanged.


20.19.2 An infinite thin wall? 


In Newtonian gravity, an infinite uniform wall produces a uniform gravitational force towards the wall. If the
wall has mass per unit area of $\sigma$, then solving Poisson’s equation $\nabla^2 \phi = 4\pi\rho$ with a delta-function source
$\rho = \sigma \delta(z)$ implies that the gravitational force is the constant $g = - \partial\phi/\partial z = - 2\pi\sigma$ at any distance $z > 0$ from
the wall. This is not what happens in general relativity (Jones, 2008).
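The Newtonian statement can be illustrated by treating the wall as the limit of a uniform disk of radius $a$. On the axis at height $z$, summing the attraction of concentric rings gives a field of magnitude $2\pi\sigma \, (1 - z/\sqrt{z^2 + a^2})$ in units with $G = 1$, which tends to $2\pi\sigma$, independent of $z$, as $a \to \infty$. The sketch below compares a direct ring sum with that closed form; the function names are illustrative.

```python
import math

def disk_field_closed_form(sigma, z, a):
    """On-axis Newtonian field magnitude of a uniform disk of radius a (G = 1)."""
    return 2.0 * math.pi * sigma * (1.0 - z / math.hypot(z, a))

def disk_field_numeric(sigma, z, a, n=20000):
    """Midpoint-rule sum over rings of radius s.

    Each ring of mass 2*pi*sigma*s*ds contributes an axial field
    2*pi*sigma*z*s*ds / (s^2 + z^2)^(3/2).
    """
    ds = a / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        total += s / (s * s + z * z) ** 1.5
    return 2.0 * math.pi * sigma * z * total * ds

print(disk_field_numeric(1.0, 1.0, 10.0), disk_field_closed_form(1.0, 1.0, 10.0))
for height in (1.0, 5.0):
    print(height, disk_field_closed_form(1.0, height, 1.0e8), 2.0 * math.pi)
```

The last loop shows the field becoming independent of height once the disk is much larger than the distance to it, which is the uniform-field behaviour that has no counterpart in the general relativistic solution discussed next.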

Consider an infinite uniform thin wall in otherwise empty space. The symmetries of the situation imply
that the line-element must take the form (20.200), with $z$ the vertical coordinate. As remarked at the end
of §20.19.1, all the equations in §20.6 in the spherical case hold also for the planar line-element (20.200),
provided that $\beta_m$, $M$, and $h_m$ are interpreted as being given by equations (20.201), (20.202), and (20.203).
For the planar line-element (20.200), the mass equations (20.44) in the centre-of-mass frame become


\[ \frac{\partial_0 M}{\beta_0} = - 4\pi r^2 p , \tag{20.204a} \]
\[ \frac{\partial M / \partial z}{\partial r / \partial z} = 4\pi r^2 \rho . \tag{20.204b} \]

The density and pressure vanish in the vacuum region outside the wall, $\rho = p = 0$. The mass
equations (20.204) then imply that all derivatives of $M$ vanish, so the interior mass $M$ is constant everywhere
outside the wall.

The vacuum region outside the wall defines no preferred frame, so there is freedom to Lorentz boost
the timelike 4-vector $\beta_m$ in the $\gamma_0$-$\gamma_1$ plane (the $t$-$z$ plane), such that $\beta_1 = 0$. In accordance with the
definition (20.201) of $\beta_1$, the vanishing of $\beta_1$ requires $\partial r/\partial z = 0$, that is, $r$ is a function only of $t$, independent
of $z$. Solving Einstein’s equations in vacuo leads to the result that not only $r$ but all the metric coefficients
in the line-element (20.200) are functions only of $t$, independent of $z$. The resulting vacuum line-element is

\[ ds^2 = - \frac{t}{2M} \, dt^2 + \frac{2M}{t} \, dz^2 + t^2 \left( dx^2 + x^2 d\phi^2 \right) . \tag{20.205} \]

The spacetime described by the line-element (20.205) has vanishing energy-momentum tensor, but a Weyl
scalar $C$ of


\[ C = - \frac{M}{t^3} . \tag{20.206} \]


The line-element (20.205) is the Kasner (1921) spacetime (Exercise 17.4) with $q_a = \{-\tfrac{1}{3}, \tfrac{2}{3}, \tfrac{2}{3}\}$, which in
turn looks like the Schwarzschild geometry near its singular surface (Exercise 17.5).
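The identification with the Kasner spacetime can be checked by converting power laws in $t$ to power laws in proper time. The sketch below assumes the line-element (20.205) has $g_{tt} \propto t$, $g_{zz} \propto t^{-1}$, and transverse metric coefficients $\propto t^2$; the helper `kasner_exponents` is a hypothetical name.

```python
from fractions import Fraction

def kasner_exponents(lapse_power, spatial_powers):
    """Kasner exponents q_i for ds^2 = -t^a dt^2 + sum_i t^(p_i) dx_i^2.

    Proper time scales as T ~ t^((a + 2)/2), so each spatial coefficient
    t^(p_i) equals T^(2 q_i) with q_i = p_i / (a + 2).
    """
    return [Fraction(p, lapse_power + 2) for p in spatial_powers]

q = kasner_exponents(lapse_power=1, spatial_powers=[-1, 2, 2])
print(q)
print("sum of exponents =", sum(q))
print("sum of squares   =", sum(x * x for x in q))
```

Both the sum of the exponents and the sum of their squares equal one, which are the two conditions for a Kasner metric to be a vacuum solution.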

It is now apparent why there are difficulties in general relativity in finding a thin wall solution analogous to 
that in Newtonian gravity. The putative thin wall solution is actually the superluminally infalling region near 
the singular surface of the Schwarzschild geometry. Singularity theorems, Chapter 18, imply that, as long as 
the energy-momentum satisfies a positive energy condition, there are geodesics whose future terminates in 
such a geometry. 


21 


The interiors of accreting, spherical black holes


As discussed in Chapter 8, the Reissner-Nordström geometry for an ideal charged spherical black hole
contains mathematical wormhole and white hole extensions to other universes. In reality, these extensions 
are not expected to occur, thanks to the mass inflation instability discovered by Poisson and Israel (1990). 
This Chapter explores how accretion modifies the internal structure of a spherical black hole. A charged 
black hole is not astronomically realistic, but it has an inner horizon like a rotating black hole, and may be 
considered a surrogate for a rotating black hole. 


Two important lessons emerge from the investigations in this Chapter. The first is that the inner horizon of 
an accreting black hole is subject to the inflationary instability discovered by Poisson and Israel (1990). The 
instability is called inflation because it grows exponentially. The inflationary instability destroys the inner 
horizon, preventing the wormhole and white hole extensions to other universes that occur in the Reissner- 
Nordström geometry for an ideal charged spherical black hole. Poisson & Israel dubbed the instability “mass 
inflation,” but I tend to prefer the term “inflationary instability” since although the interior mass indeed 
increases exponentially during inflation, it is relativistic counter-streaming, not mass, that drives inflation 
(Hamilton and Avelino, 2010). 


The second important lesson of this Chapter is that dissipation inside a black hole can create a large amount of
entropy, causing a problem with the second law of thermodynamics. Normally, the quantum field theory postulate
of locality (the statement that spacelike-separated quantum operators commute) justifies adding entropy along
spacelike surfaces. Locality implies that all field operators can be set independently along any spacelike
surface. Locality is what justifies calculating the entropy of, for example, the air in the room you are sitting
in by chopping up the volume of the room into small pieces and adding up the entropies of each piece. But inside
a (conformally) stationary black hole, surfaces of constant (conformal) stationary time are spacelike, and the
volume of a spacelike 3-surface over the age $t$ of a black hole since it first collapsed is of order
$t R_\bullet^2$, which for black holes that collapsed long ago is vastly larger than a naive estimate
$R_\bullet^3$ of the volume of a sphere of horizon radius $R_\bullet$. As shown in §21.10, if entropy is
accumulated over this vast volume $t R_\bullet^2$, the cumulative entropy can vastly exceed the Bekenstein-Hawking
(Bekenstein, 1973; Hawking, 1974) entropy, which is 1/4 the area of the horizon in Planck units. This
would imply a gross violation of the second law of thermodynamics if the black hole subsequently evaporated
radiating only a Hawking amount of entropy. Where did all that accumulated entropy generated inside the
black hole disappear to?
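To get a feel for how large the volume enhancement is, the following minimal sketch evaluates the ratio of the accumulated volume (of order age times horizon radius squared) to the naive volume (horizon radius cubed), which is just the age divided by the light-crossing time of the horizon. The black hole mass and age used here are illustrative assumptions (a Milky-Way-like hole as old as the Universe), the Schwarzschild value 2GM/c^2 is used for the horizon radius, and the function names are hypothetical.

```python
# Rough estimate of the volume enhancement (t R^2) / R^3 = c t / R.
# All inputs are illustrative assumptions, not values taken from a specific model.
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg
YEAR = 3.156e7       # one year, s


def horizon_radius_m(mass_kg):
    """Schwarzschild horizon radius R = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def volume_enhancement(mass_kg, age_s):
    """Ratio of accumulated volume ~ (c t) R^2 to naive volume ~ R^3."""
    return C * age_s / horizon_radius_m(mass_kg)


mass = 4e6 * M_SUN    # assumed: mass of the Milky Way's central black hole
age = 1e10 * YEAR     # assumed: age comparable to the age of the Universe
ratio = volume_enhancement(mass, age)
print(f"horizon radius = {horizon_radius_m(mass):.2e} m, volume ratio = {ratio:.1e}")
```

Under these assumptions the ratio comes out at roughly sixteen orders of magnitude, which is the sense in which the accumulated volume is "vastly larger" than the naive one.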

The problem, and its solution (Polhemus, Hamilton, and Wallace, 2009), are intimately related to the 
Information Paradox introduced in a seminal paper by Hawking (1976). The Information Paradox is that 
black hole evaporation must violate one of two fundamental postulates of quantum field theory, which are 

1. Locality: the proposition that spacelike-separated operators commute; 


2. Unitarity: the proposition that quantum mechanical evolution is deterministic. 

Locality is what enforces causality in quantum field theory. Locality ensures that, although quantum mechan- 
ics allows what appears to be instantaneous communication between spacelike-separated points in Einstein- 
Podolsky-Rosen (EPR) experiments (Einstein, Podolsky, and Rosen, 1935), no actual information can be 
transmitted in such an experiment. The classic EPR experiment is to prepare a pair of particles of non- 
zero spin such that their combined spin is 0, then observe the particles at two spacelike-separated receivers. 
Quantum mechanics predicts, and experiment confirms (Yin et al., 2017), that the particles will always be 
observed to have spin opposite to each other regardless of the direction along which the particles are observed, 
even when that direction is changed at the last moment. It is as if there were some kind of instantaneous 
communication between the pair. Yet no actual information is transmitted in the experiment, because each 
observation leads to spin up or down with equal probability, and neither side can influence which of those 
two choices actually occurs. 
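The absence of signalling in the EPR experiment can be checked with a few lines of arithmetic. The sketch below assumes the standard quantum mechanical result for a spin-0 pair of spin-1/2 particles: if the two receivers measure along axes separated by an angle theta, the outcomes are opposite with probability cos^2(theta/2) and the same with probability sin^2(theta/2). The function names are illustrative.

```python
import math


def singlet_joint_probabilities(theta):
    """Joint outcome probabilities for a spin-0 pair of spin-1/2 particles
    measured along two axes separated by angle theta (radians)."""
    same = 0.5 * math.sin(theta / 2.0) ** 2       # P(up, up) = P(down, down)
    opposite = 0.5 * math.cos(theta / 2.0) ** 2   # P(up, down) = P(down, up)
    return {
        ("up", "up"): same,
        ("down", "down"): same,
        ("up", "down"): opposite,
        ("down", "up"): opposite,
    }


def marginal_up_first(theta):
    """Probability that the first receiver sees spin up, whatever the second sees."""
    probs = singlet_joint_probabilities(theta)
    return probs[("up", "up")] + probs[("up", "down")]


for theta in (0.0, 0.5, 1.0, math.pi):
    print(f"theta = {theta:.2f}: P(first up) = {marginal_up_first(theta):.3f}")
```

When the axes coincide (theta = 0) the outcomes are always opposite, yet the probability seen by either receiver alone is 1/2 for every choice of the other receiver's axis, so changing the axis at the last moment conveys nothing.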

Applied to black hole interiors, the problem with locality is that information inside a black hole must 
exceed the speed of light to escape, which locality prohibits. Hawking (1976) originally argued that this 
would cause a breakdown of unitarity, since the Hawking radiation emitted by the black hole would be 
causally disconnected from the interior states of the black hole. Hawking argued that Hawking radiation, 
being precisely thermal, carries no information. The response to Hawking’s conclusion was not immediate, 
but in due course a growing number of physicists, including Gerard 't Hooft, Leonard Susskind, Don Page, 
John Preskill, and others started arguing that it was more likely that locality, not unitarity, broke down. After 
all, when a black hole radiates Hawking radiation, its mass and area decrease, and the amount of entropy 
in the Hawking radiation is approximately equal to (actually slightly larger than) the Bekenstein-Hawking 
entropy lost by the black hole. How could the two not be causally related, as unitarity insists? This led to 
conjectures that the black hole horizon is a “hologram” that somehow encodes the interior quantum degrees 
of freedom of a black hole. The idea of holography was boosted greatly by Maldacena’s (1998) discovery of 
AdS-CFT, a string-theory duality between an anti-de Sitter spacetime and a conformal field theory living on 
the boundary of that spacetime. Proponents of holography declared victory (Susskind, 2008). However, it is 
fair to say that holography remains incompletely understood, especially in application to real astronomical 
black holes. 

Anyway, the relevance to the present Chapter is that a breakdown of locality would also save the second 
law of thermodynamics from excessive entropy production inside black holes. When two observers fall into 
a black hole at two different times or angular positions, they lose causal contact with each other, Concept 
Question 7.4, and classically they observe distinct volumes of space. But if locality breaks down, then the 
observers can be seeing the same quantum degrees of freedom even though the volumes are distinct. In effect, 
there is only one quantum black hole interior, not many. It is not legitimate to accumulate entropy across 
many black hole interiors, even though they are spacelike separated from each other. 

All the models presented in this Chapter are spherical and self-similar. See Hamilton and Pollack (2005), 
Hamilton and Pollack (2005), Wallace, Hamilton, and Polhemus (2008), and Hamilton and Avelino (2010) 
for more detail. 


21.1 Boundary conditions and equation of state 


The previous Chapter 20 set forward the equations governing spherical spacetimes. This section sets out 
the boundary conditions and equation of state adopted for the accreting spherical black hole models in the 
remainder of the Chapter. 


21.1.1 Boundary conditions at an outer sonic point 


Because information can propagate only inward inside the horizon of a black hole, it is natural to set 
boundary conditions outside the horizon of an accreting black hole. The policy adopted here is to set boundary 
conditions at a sonic point, where the infalling baryonic (subscripted b) fluid accelerates from subsonic to 
supersonic. The proper 3-velocity of the baryons through the self-similar frame is $\xi^1/\xi^0$, equation (20.145)
(the velocity $\xi^1/\xi^0$ is positive falling inward), and the sound speed is

$$ \text{sound speed} = \sqrt{\frac{p_b}{\rho_b}} = \sqrt{w_b} , \tag{21.1} $$

and sonic points occur where the velocity equals the sound speed

$$ \frac{\xi^1}{\xi^0} = \pm \sqrt{w_b} \quad \text{at sonic points} . \tag{21.2} $$

The denominator of the expression (20.197) for the proper acceleration $h_{b,0}$ of the baryonic fluid is zero at
sonic points, indicating that the acceleration will diverge unless the numerator is also zero. Generically, what
happens at a sonic point depends on whether the fluid transitions from subsonic upstream to supersonic
downstream (as here) or vice versa. If (as here) the fluid transitions from subsonic to supersonic, then sound
waves generated by discontinuities near the sonic point can propagate upstream, plausibly modifying the
flow so as to ensure a smooth transition through the sonic point, effectively forcing the numerator, like the
denominator, of the expression (20.197) to pass through zero at the sonic point. Conversely, if the fluid
transitions from supersonic to subsonic, then sound waves cannot propagate upstream to warn the incoming
fluid that a divergent acceleration is coming, and the result is a shock wave, where the fluid accelerates
discontinuously, is heated, and thereby passes from supersonic to subsonic.

The solutions considered here assume that the acceleration $h_{b,0}$ at the sonic point is not only continuous
(so the numerator of (20.197) is zero) but also differentiable. Such a sonic point is said to be regular, and
the assumption imposes two boundary conditions at the sonic point.
The accretion in real black holes is likely to be much more complicated, but the assumption of a regular
sonic point is the simplest physically reasonable one.


21.1.2 Mass and charge of the black hole 
The mass $M_\bullet$ and charge $Q_\bullet$ of the black hole at any instant are defined here to be those that would be
measured by a distant observer if there were no mass or charge outside the sonic point,

$$ M_\bullet = M + \frac{Q^2}{2r} , \quad Q_\bullet = Q \quad \text{at the sonic point} . \tag{21.3} $$

The mass $M_\bullet$ in equation (21.3) includes the mass-energy $Q^2/2r$ that would be in the electric field outside
the sonic point if there were no charge outside the sonic point, but it does not include mass-energy from any
additional mass or charge that might be outside the sonic point.

In self-similar evolution, the black hole mass $M_\bullet$ increases linearly with proper time at rest far from the
black hole. The proper time is recorded on dark matter clocks that free-fall radially from rest far away.
In the approximation that there is vanishing energy-momentum outside the sonic point other than that
in the electric field, the solution outside the sonic point is Gullstrand-Painlevé. The Gullstrand-Painlevé
line-element for dark matter that free falls radially from rest at infinity is equation (20.139) with

$$ \beta_{d,1} = 1 \tag{21.4} $$

and unit lapse, the latter implying, from equation (20.140) with the time replaced by the dark matter time
$\tau_d$,

$$ 1 = \alpha_d = \frac{a \, \xi_d^0}{v \, \tau_d} . \tag{21.5} $$

The sonic point is at fixed conformal radius, and equation (21.5) shows that the dark matter time
$\tau_d = a \xi_d^0 / v$ at that point increases in proportion to the conformal factor $a$. The mass accretion rate
$\dot{M}_\bullet$ is

$$ \dot{M}_\bullet \equiv \frac{dM_\bullet}{d\tau_d} = \frac{M_\bullet}{\tau_d} = \frac{v M_\bullet}{a \, \xi_d^0} \quad \text{at the sonic point} . \tag{21.6} $$

As remarked following equation (20.135), the residual gauge freedom in the global rescaling of conformal
time allows the expansion rate $v$ to be adjusted at will. One choice suggested by equation (21.6) is to set

$$ \dot{M}_\bullet = v , \tag{21.7} $$

which is equivalent to scaling $v$ such that

$$ \xi_d^0 = \frac{M_\bullet}{a} \quad \text{at the sonic point} . \tag{21.8} $$

Equation (21.8) and the boundary condition (21.4) coupled with the scalar relations (20.135a) and (20.135b)
fully determine the dark matter 4-vectors $\beta_{d,m}$ and $\xi_d^m$ at the sonic point.
21.1.3 Equation of state


The density $\rho_b$ and temperature $T_b$ of an ideal relativistic baryonic fluid in thermodynamic equilibrium are
related by

$$ \rho_b = \frac{\pi^2 g_b}{30} T_b^4 , \tag{21.9} $$

where

$$ g_b = g_B + \frac{7}{8} g_F \tag{21.10} $$

is the effective number of relativistic particle species, with $g_B$ and $g_F$ being the number of bosonic and
fermionic species. If the expected increase in $g_b$ with temperature $T_b$ is modelled (so as not to spoil
self-similarity) as a weak power law $g_b/g_p = T_b^{\varepsilon}$, with $g_p$ the effective number of relativistic
species at the Planck temperature, then the relation between density $\rho_b$ and temperature $T_b$ is

$$ \rho_b = \frac{\pi^2 g_p}{30} T_b^{(1+w_b)/w_b} , \tag{21.11} $$

with equation of state parameter $w_b = 1/(3 + \varepsilon)$ slightly less than the standard relativistic value $w = 1/3$.
In the models considered here, the baryonic equation of state is taken to be

$$ w_b = 0.32 . \tag{21.12} $$

The effective number $g_p$ is fixed by setting the number of relativistic particle species to $g_b = 5.5$ at
$T_b = 10$ MeV, corresponding to a plasma of relativistic photons, electrons, and positrons. This corresponds to
choosing the effective number of relativistic species at the Planck temperature to be $g_p \approx 2{,}400$, which is
perhaps not unreasonable. The precise choices of $g_p$ and $w_b$ are not crucial.
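The quoted number of species at the Planck temperature can be reproduced from the power-law model. The sketch below assumes a Planck temperature of about 1.22 x 10^22 MeV (Boltzmann constant set to one) and uses hypothetical function names; it inverts the relation between the equation of state parameter and the power-law index, then extrapolates from 5.5 species at 10 MeV up to the Planck temperature.

```python
PLANCK_TEMPERATURE_MEV = 1.22e22   # assumed: Planck energy ~1.22e19 GeV, in MeV


def epsilon_from_w(w_b):
    """Power-law index eps in g_b/g_p = T^eps, from w_b = 1/(3 + eps)."""
    return 1.0 / w_b - 3.0


def g_at_planck_temperature(g_b, temperature_mev, w_b):
    """Effective number of species g_p at the Planck temperature, given g_b at
    some lower temperature, assuming g_b/g_p = T^eps with T in Planck units."""
    eps = epsilon_from_w(w_b)
    t_planck_units = temperature_mev / PLANCK_TEMPERATURE_MEV
    return g_b / t_planck_units**eps


eps = epsilon_from_w(0.32)
g_p = g_at_planck_temperature(g_b=5.5, temperature_mev=10.0, w_b=0.32)
print(f"epsilon = {eps:.3f}, g_p = {g_p:.0f}")
```

With the photon contributing 2 and the electron-positron pairs contributing (7/8) x 4 = 3.5, the starting value 5.5 is recovered, and the extrapolated value lands close to the 2,400 quoted in the text.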

The chemical potential of the relativistic baryonic fluid is likely to be close to zero, corresponding to equal
numbers of particles and anti-particles. The entropy $S_b$ of a proper Lagrangian volume element $V$ of the
fluid is then

$$ S_b = \frac{(\rho_b + p_b) V}{T_b} , \tag{21.13} $$


which agrees with the earlier expression (20.187), but now has the correct normalization. 


21.2 Black hole accreting a neutral relativistic plasma 


Perhaps the simplest model of an accreting black hole that one could think of is that of a spherical black 
hole accreting a neutral relativistic “baryonic” plasma. In self-similar solutions, the charge of the black hole 
is produced self-consistently by the accreted charge of the baryonic fluid, so a neutral fluid produces an 
uncharged black hole. 
Figure 21.1 An uncharged baryonic plasma falls into an uncharged spherical black hole. The left panel shows in Planck
units, as a function of circumferential radius, the plasma density $\rho_b$, the Weyl curvature scalar $C$ (which is negative),
and the rate $dS_b/dS_{\rm BH}$ of increase of the plasma entropy per unit increase in the Bekenstein-Hawking entropy of the
black hole, equation (21.44). The mass is $M_\bullet = 4 \times 10^6 \, M_\odot$, the accretion rate is $\dot{M}_\bullet = 10^{-16}$, and the equation of
state is $w_b = 0.32$. The right panel shows a Penrose diagram of the model.


Figure 21.1 shows the baryonic density $\rho_b$ and Weyl curvature $C$ inside the uncharged black hole. The
mass and accretion rate have been taken to be

$$ M_\bullet = 4 \times 10^6 \, M_\odot , \quad \dot{M}_\bullet = 10^{-16} , \tag{21.14} $$

which are motivated by the fact that the mass of the supermassive black hole at the centre of the Milky Way
is $4 \times 10^6 \, M_\odot$, and its accretion rate is of order (Planck units are $c = G = \hbar = 1$)

$$ \frac{\text{Mass of MW black hole}}{\text{age of Universe}} \approx \frac{4 \times 10^6 \, M_\odot}{10^{10} \, \text{yr}} \approx \frac{4 \times 10^{44} \ \text{Planck units}}{6 \times 10^{60} \ \text{Planck units}} \approx 10^{-16} . \tag{21.15} $$
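The order-of-magnitude estimate of the accretion rate is easy to check by converting both the mass and the age to Planck units. The following is a minimal sketch; the constants are standard rounded values and the variable names are illustrative.

```python
# Convert the Milky Way black hole mass and the age of the Universe to Planck units.
M_SUN_KG = 1.989e30        # solar mass, kg
PLANCK_MASS_KG = 2.176e-8  # Planck mass, kg
PLANCK_TIME_S = 5.391e-44  # Planck time, s
YEAR_S = 3.156e7           # one year, s

mass_planck = 4e6 * M_SUN_KG / PLANCK_MASS_KG   # black hole mass in Planck masses
age_planck = 1e10 * YEAR_S / PLANCK_TIME_S      # age of the Universe in Planck times
mdot = mass_planck / age_planck                 # dimensionless accretion rate

print(f"mass = {mass_planck:.1e}, age = {age_planck:.1e}, accretion rate = {mdot:.1e}")
```

The mass comes out near 4 x 10^44 Planck masses and the age near 6 x 10^60 Planck times, so the mean accretion rate is a little under 10^-16 in Planck units.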


Figure 21.1 shows that the baryonic plasma plunges uneventfully to a central singularity, just as in the
Schwarzschild solution. The Weyl curvature scalar hits the Planck scale, $|C| = 1$, while the baryonic proper
density $\rho_b$ is still well below the Planck density, so this singularity is curvature-dominated.

Figure 21.1 also shows the rate $dS_b/dS_{\rm BH}$ of increase of the plasma entropy per unit increase in the
Bekenstein-Hawking entropy of the black hole, equation (21.44). The relevance of this quantity is discussed
in §21.10. The constancy of $dS_b/dS_{\rm BH}$ in Figure 21.1 reflects the fact that there is no dissipation in this
model, so no additional entropy is created inside the black hole.
Figure 21.2 A charged, non-conducting, baryonic plasma falls into a charged black hole. The black hole has an inner
horizon like the Reissner-Nordström geometry. The self-similar solution terminates at an irregular sonic point just
beneath the inner horizon. The mass is $M_\bullet = 4 \times 10^6 \, M_\odot$, accretion rate $\dot{M}_\bullet = 10^{-16}$, equation of state $w_b = 0.32$,
and black hole charge-to-mass $Q_\bullet/M_\bullet = 10^{-5}$. The right panel shows a Penrose diagram. The inner horizon is a
Cauchy horizon: what happens in the spacetime to the future of the Cauchy horizon is unpredictable.


21.3 Black hole accreting a charged relativistic plasma 


The next simplest model one can think of is that of a black hole accreting a charged relativistic plasma. 
Because the plasma is charged, the resulting black hole is also charged. 

Figure 21.2 shows a black hole with charge-to-mass $Q_\bullet/M_\bullet = 10^{-5}$, but otherwise the same parameters
as in the uncharged black hole of §21.2: $M_\bullet = 4 \times 10^6 \, M_\odot$, $\dot{M}_\bullet = 10^{-16}$, and $w_b = 0.32$. Inside the outer
horizon, the baryonic plasma, repelled by the electric charge of the black hole self-consistently generated by
the accretion of the charged baryons, becomes outgoing. Like the Reissner-Nordström geometry, the black
hole has an (outgoing) inner horizon. The baryons drop through the inner horizon, shortly after which the 
self-similar solution terminates at an irregular sonic point, where the proper acceleration diverges. Normally 
this is a signal that a shock must form, but even if a shock is introduced, the plasma still terminates at an 
irregular sonic point shortly downstream of the shock. The failure of the self-similar solution to continue 
does not invalidate the solution to the past of the inner horizon, because the failure is hidden beneath the 
inner horizon, and cannot be communicated to infalling matter above it. 

The inner horizon is a Cauchy horizon, meaning that the spacetime to the future of the inner horizon 
cannot be predicted uniquely from the past. The ambiguity in the possible presence and location of a shock on
the other side of the Cauchy horizon is a symptom of this unpredictability. Hamilton and Pollack (2005) give
further details. 

This solution, in which baryonic matter falls through an outgoing inner horizon, is nevertheless not realistic, 
because it assumes that there is no ingoing matter whatsoever, whereas even the tiniest amount of ingoing 
energy-momentum, in gravitational waves if nothing else, would suffice to trigger the inflationary instability. 
Such ingoing energy-momentum would appear infinitely blueshifted to the outgoing baryons falling through 
the inner horizon, which would produce inflation, as in §21.4. 


21.4 Black hole accreting charged baryons and dark matter 


One way to allow mass inflation in simple models is to admit not one but two fluids that can counter-stream 
relativistically through each other. A natural possibility is to feed the black hole not only with a charged 
relativistic fluid of baryons but also with neutral pressureless dark matter that streams freely through the 
baryons. The charged baryons, being repelled by the electric charge of the black hole, become outgoing, while 
the neutral dark matter remains ingoing. 
Figure 21.3 Not only charged baryonic plasma but also neutral pressureless dark matter fall into a black hole. The
dark matter streams freely through the baryonic plasma. The relativistic counter-streaming produces mass inflation
just above the erstwhile inner horizon, where the centre-of-mass density $\rho$ (thick black line) and curvature $C$ inflate
rapidly to the Planck scale and beyond. The mass is $M_\bullet = 4 \times 10^6 \, M_\odot$, the accretion rate $\dot{M}_\bullet = 10^{-16}$, the baryonic
equation of state $w_b = 0.32$, the charge-to-mass $Q_\bullet/M_\bullet = 10^{-5}$, the conductivity is zero, and the ratio of dark matter
to baryonic density at the outer sonic point is $\rho_d/\rho_b = 0.1$.
Figure 21.4 (Left panel) The centre-of-mass density $\rho$ and Weyl curvature $|C|$, and (right panel) interior mass $M$,
inside three black holes accreting baryons and dark matter at three different rates $\dot{M}_\bullet = 0.03$, 0.01, and 0.003. In all
three cases the dark-matter-to-baryon ratio at the sonic point is $\rho_d/\rho_b = 0.1$. The smaller the accretion rate, the faster
the centre-of-mass density $\rho$, curvature $C$, and interior mass $M$ inflate; note that the centre-of-mass energy $\rho$ (thick
black line) and the curvature $|C|$ almost coincide here. For the middle accretion rate $\dot{M}_\bullet = 0.01$ (to avoid confusion,
only this case is plotted), the graph also shows the individual proper densities $\rho_b$ of baryons, $\rho_d$ of dark matter, and
$\rho_e$ of electromagnetic energy. During mass inflation, almost all the centre-of-mass energy $\rho$ is in the streaming energy:
the proper densities of individual components remain small. The black hole mass is $M_\bullet = 4 \times 10^6 \, M_\odot$, the baryonic
equation of state is $w_b = 0.32$, the charge-to-mass is $Q_\bullet/M_\bullet = 0.8$, and the conductivity is zero. The position where
the inner horizon would be for a Reissner-Nordström black hole of $Q_\bullet/M_\bullet = 0.8$ is marked, but in fact the inner
horizon is destroyed by the inflationary instability.


Figure 21.3 shows that relativistic counter-streaming between the baryons and the dark matter causes
the centre-of-mass density $\rho$ and the Weyl curvature scalar $C$ to inflate quickly up to the Planck scale and
beyond. The ratio of dark matter to baryonic density at the sonic point is $\rho_d/\rho_b = 0.1$, but otherwise the
parameters are the generic parameters of the previous two sections: $M_\bullet = 4 \times 10^6 \, M_\odot$, $\dot{M}_\bullet = 10^{-16}$, $w_b = 0.32$,
$Q_\bullet/M_\bullet = 10^{-5}$, and zero conductivity. Almost all the centre-of-mass energy $\rho$ is in the counter-streaming
energy between the outgoing baryonic and ingoing dark matter. The individual densities $\rho_b$ of baryons and
$\rho_d$ of dark matter (and $\rho_e$ of electromagnetic energy) increase only modestly.

A striking feature of mass inflation is that the smaller the accretion rate, the shorter the length scale 
of inflation. Not only that, but the smaller one of the outgoing or ingoing streams is relative to the other, 
the shorter the length scale of inflation. Figure 21.4 shows black holes with three different accretion rates
$\dot{M}_\bullet = 0.03$, 0.01, and 0.003, all with the same dark-matter-to-baryon density ratio $\rho_d/\rho_b = 0.1$
at the sonic point. The smaller the accretion rate, the faster is inflation. The accretion rates $\dot{M}_\bullet$ have been
chosen to be relatively large so that the inflationary growth rate is easily discernible on the graph. The
centre-of-mass density $\rho$ and Weyl scalar $C$ exponentiate along with, and in proportion to, the interior mass
$M$, which increases as the radius $r$ decreases approximately as (see Hamilton and Avelino (2010) for more
precise estimates)

$$ \rho \propto C \propto M \propto \exp(-\ln r / \dot{M}_\bullet) . \tag{21.16} $$

Physically, the length scale of inflation is set by how close to the inner horizon infalling material approaches
before mass inflation begins. The smaller the accretion rate $\dot{M}_\bullet$, the closer the approach, and consequently
the shorter the length scale of inflation.
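The dependence of the inflationary length scale on accretion rate can be read directly off the approximate scaling (21.16), which is a power law in radius with exponent equal to minus the reciprocal of the accretion rate. The sketch below, with hypothetical function names, treats (21.16) as exact (the text stresses it is only approximate) and tabulates the growth produced by a 1 per cent decrease in radius, together with the fractional decrease in radius needed for one e-fold.

```python
import math


def growth_factor(r_ratio, mdot):
    """Factor by which rho, C and M grow when the radius changes by the factor
    r_ratio = r_new / r_old, taking the scaling exp(-ln r / mdot) at face value."""
    return math.exp(-math.log(r_ratio) / mdot)


def fractional_radius_per_efold(mdot):
    """Fractional decrease in radius needed for one e-fold of growth."""
    return 1.0 - math.exp(-mdot)


for mdot in (0.03, 0.01, 0.003):
    print(
        f"mdot = {mdot}: growth for 1% decrease in r = {growth_factor(0.99, mdot):.3g}, "
        f"fractional radius per e-fold = {fractional_radius_per_efold(mdot):.4f}"
    )
```

For small accretion rates the fractional change in radius per e-fold is essentially the accretion rate itself, which is the quantitative content of the statement that smaller accretion rates give shorter inflationary length scales.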

Figure 21.4 shows that, as in Figure 21.3, almost all the centre-of-mass energy density $\rho$ is in the streaming
energy between the baryons and the dark matter. For one case, $\dot{M}_\bullet = 0.01$, Figure 21.4 shows the individual
densities $\rho_b$ of baryons and $\rho_d$ of dark matter in their own frames, and $\rho_e$ of electromagnetic energy, all of
which remain tiny compared to the streaming energy.

Figure 21.4 also shows that inflation in due course comes to an end, whereupon the spacetime collapses to
a spacelike singularity at zero radius. Hamilton and Avelino (2010) show that the maximum interior mass
attained is approximately the exponential of the reciprocal of the mass accretion rate,

$$ M_{\max} \sim \exp(1/\dot{M}_\bullet) . \tag{21.17} $$

For small accretion rates, this interior mass is absurdly huge. For example, for the “realistic” accretion rate
of $\dot{M}_\bullet = 10^{-16}$ adopted in the model of Figure 21.3, the maximum interior mass attained is $M_{\max} \sim e^{10^{16}}$,
and the maximum proper streaming density $\rho$ and curvature $C$ are similarly ridiculously vast. The density
and curvature vastly exceed the Planck scale.
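Because the maximum interior mass overflows any floating-point type for small accretion rates, it is more practical to work with its logarithm. The sketch below, with a hypothetical function name, evaluates the base-10 logarithm of the estimate (21.17) for the accretion rates used in the figures.

```python
import math


def log10_max_interior_mass(mdot):
    """Base-10 logarithm of the estimate M_max ~ exp(1 / mdot)."""
    return math.log10(math.e) / mdot


for mdot in (0.03, 0.01, 0.003, 1e-16):
    print(f"mdot = {mdot:g}: log10(M_max) ~ {log10_max_interior_mass(mdot):.3g}")
```

Even for the relatively large rate 0.01 the interior mass inflates by more than forty orders of magnitude, and for the realistic rate of 10^-16 the exponent itself is of order 10^15.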

Curvature is synonymous with tidal force. It seems entirely likely that the tidal force will result in pair
creation once the curvature exceeds the Planck scale. Frolov, Kristjansson, and Thorlacius (2006) show that
in the case of a charged black hole in 2 spacetime dimensions, such pair creation does in fact occur. However,
there have been no studies of what happens in the realistic case of 4 spacetime dimensions.


21.5 The black hole collider 


The previous section, §21.4, showed that almost all the centre-of-mass energy during mass inflation is in 
the energy of counter-streaming. Thus the black hole acts like an extravagantly powerful particle accelerator 
(Hamilton and Avelino, 2010). 

Figure 21.5 Collision rate of the black hole collider per e-fold of velocity $u$ (meaning $\gamma v$), expressed in units of the
inverse black hole accretion time $\dot{M}_\bullet/M_\bullet$. The curves are labelled with their mass accretion rates: $\dot{M}_\bullet = 0.03$, 0.01,
0.003, and $10^{-16}$ (the three models with the larger accretion rates are the same as those in Figure 21.4). Stars mark
where the centre-of-mass energy of colliding baryons and dark matter particles exceeds the Planck energy, while disks
show where the Weyl curvature scalar $C$ exceeds the Planck scale.


Each baryon in the black hole collider sees a flux $n_d u^1$ of dark matter particles per unit area per unit
time, where $n_d = \rho_d/m_d$ is the proper number density of dark matter particles in their own frame, and $u^1$ is
the radial component of the proper 4-velocity, the $\gamma v$, of the dark matter through the baryons. The $\gamma$ factor
in $u^1$ is the relativistic beaming factor: all frequencies, including the collision frequency, are speeded up by
the relativistic beaming factor $\gamma$. As the baryons accelerate through the collider, they spend a proper time
interval $d\tau/d\ln u^1$ in each e-fold of Lorentz factor $u^1$. The number of collisions per baryon per e-fold of $u^1$ is
the dark matter flux $(\rho_d/m_d) u^1$, multiplied by the time $d\tau/d\ln u^1$, multiplied by the collision cross-section
$\sigma$. The total cumulative number of collisions that have happened in the black hole particle collider equals
this multiplied by the total number of baryons that have fallen into the black hole, which is approximately
equal to the black hole mass $M_\bullet$ divided by the mass $m_b$ per baryon. Thus the total cumulative number of
collisions in the black hole collider is

$$ \frac{\text{number of collisions}}{\text{e-fold of } u^1} = \frac{M_\bullet}{m_b} \frac{\rho_d}{m_d} \, \sigma u^1 \frac{d\tau}{d\ln u^1} . \tag{21.18} $$

Figure 21.5 shows, for several different accretion rates $\dot{M}_\bullet$, the collision rate $M_\bullet \rho_d u^1 d\tau/d\ln u^1$ of the black
hole collider, expressed in units of the black hole accretion rate $\dot{M}_\bullet$. This collision rate, multiplied by
$\dot{M}_\bullet \sigma/(m_d m_b)$, gives the number of collisions (21.18) in the black hole. In the units $c = G = 1$ being
used here, the mass of a baryon (proton) is 1 GeV $\approx 10^{-54}$ m. If the cross-section $\sigma$ is expressed in canonical
accelerator units of femtobarns (1 fb $= 10^{-43}$ m$^2$) then the number of collisions (21.18) is

$$ \frac{\text{number of collisions}}{\text{e-fold of } u^1} \approx 10^{45} \left( \frac{\sigma}{1 \, \text{fb}} \right) \left( \frac{300 \, \text{GeV}^2}{m_b m_d} \right) \left( \frac{\dot{M}_\bullet}{10^{-16}} \right) \left( \frac{\rho_d u^1 \, d\tau/d\ln u^1}{0.03 \, \dot{M}_\bullet/M_\bullet} \right) . \tag{21.19} $$


Particle accelerators measure their cumulative luminosities in inverse femtobarns. Equation (21.19) shows
that the black hole accelerator delivers about $10^{45}$ femtobarns$^{-1}$ ($= 10^{88}$ m$^{-2}$), and it does so in each e-fold of
collision energy up to the Planck energy and beyond.
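The numerical prefactor in the collision count can be checked by converting the cross-section and the particle masses to geometric units of length. The sketch below is illustrative only: the function name is hypothetical, the fiducial values (a 1 fb cross-section, a mass product of 300 GeV^2, an accretion rate of 10^-16, and a dimensionless collision rate of 0.03) are the normalizations quoted in the text, and the constants are standard rounded values.

```python
G = 6.674e-11         # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8           # speed of light, m/s
GEV_IN_KG = 1.783e-27 # mass equivalent of 1 GeV, kg

GEV_IN_M = G * GEV_IN_KG / C**2   # 1 GeV as a length in units c = G = 1
FB_IN_M2 = 1e-43                  # 1 femtobarn in square metres


def collisions_per_efold(sigma_fb, mass_product_gev2, mdot, rate_in_units_of_mdot):
    """Number of collisions per e-fold of u^1: sigma / (m_b m_d) times the
    accretion rate times the collision rate expressed in units of the accretion rate."""
    sigma = sigma_fb * FB_IN_M2
    mass_product = mass_product_gev2 * GEV_IN_M**2
    return sigma / mass_product * mdot * rate_in_units_of_mdot


n = collisions_per_efold(1.0, 300.0, 1e-16, 0.03)
luminosity_per_m2 = n / FB_IN_M2   # collisions per unit cross-section, in m^-2
print(f"1 GeV = {GEV_IN_M:.2e} m, collisions = {n:.1e}, luminosity = {luminosity_per_m2:.1e} m^-2")
```

With these fiducial values the count is a few times 10^44, consistent with the round figure of about 10^45 collisions per e-fold per femtobarn of cross-section.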

To quote the final sentences of Hamilton (2011): “It appears inescapable that Nature is conducting vast 
numbers of collision experiments over a broad range of peri- and super-Planckian energies in large numbers 
of black holes throughout our Universe. Does Nature do anything interesting with this extravagance — such 
as create baby universes — or is it merely a final hurrah en route to nothingness?” 


21.6 The mechanism of mass inflation 


This section explains why mass inflation occurs, and why it is inevitable as long as even the tiniest streams 
of outgoing and ingoing energy-momentum impinge on the inner horizon. The arguments are from Hamilton 
and Avelino (2010), which gives more detail. For a taste of how this works out mathematically, Exercise 21.1 
takes you through the case of equal pressureless streams. 


21.6.1 Reissner-Nordström phase


Figure 21.6 illustrates how the two Einstein equations (20.62) produce the three phases of mass inflation
inside a charged spherical black hole.

During the initial phase, illustrated in the top panel of Figure 21.6, the spacetime geometry is well-approximated
by the vacuum, Reissner-Nordström geometry. During this phase the radial energy flux $f$ is
effectively zero, so $\beta_1$ remains constant, according to equation (20.62b). The change in the radial velocity $\beta_0$,
equation (20.62a), depends on the competition between the Newtonian gravitational force $-M/r^2$, which is
always attractive (tending to make the radial velocity $\beta_0$ more negative), and the gravitational force $-4\pi r p$
sourced by the radial pressure $p$. In the Reissner-Nordström geometry, the static electric field produces a
negative radial pressure, or tension, $p = -Q^2/(8\pi r^4)$, which produces a gravitational repulsion
$-4\pi r p = Q^2/(2r^3)$. At some point (depending on the charge-to-mass ratio) inside the outer horizon, the gravitational
repulsion produced by the tension of the electric field exceeds the attraction produced by the interior mass
$M$, so that the radial velocity $\beta_0$ slows down. This regime, where the (negative) radial velocity $\beta_0$ is slowing
down (becoming less negative), while $\beta_1$ remains constant, is illustrated in the top panel of Figure 21.6.
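The radius at which repulsion overtakes attraction can be located explicitly in the vacuum geometry. The sketch below assumes that the interior mass in the Reissner-Nordström geometry is the black hole mass minus the electric field energy outside radius r, which is the content of equation (21.3) rearranged; the function names are hypothetical and units are c = G = 1.

```python
import math


def rn_horizons(mass, charge):
    """Outer and inner horizon radii of a Reissner-Nordstrom black hole."""
    root = math.sqrt(mass**2 - charge**2)
    return mass + root, mass - root


def interior_mass(r, mass, charge):
    """Interior mass M(r) = M_bh - Q^2 / (2 r), assumed from equation (21.3)."""
    return mass - charge**2 / (2.0 * r)


def radial_force(r, mass, charge):
    """Sum of the attraction -M/r^2 and the term -4 pi r p, with the
    electric tension p = -Q^2 / (8 pi r^4)."""
    pressure = -charge**2 / (8.0 * math.pi * r**4)
    return -interior_mass(r, mass, charge) / r**2 - 4.0 * math.pi * r * pressure


def turnaround_radius(mass, charge):
    """Radius where attraction and repulsion balance: r = Q^2 / M_bh."""
    return charge**2 / mass


r_outer, r_inner = rn_horizons(1.0, 0.8)
r_turn = turnaround_radius(1.0, 0.8)
print(f"outer horizon {r_outer:.2f}, turnaround {r_turn:.2f}, inner horizon {r_inner:.2f}")
```

For a charge-to-mass ratio of 0.8 the balance point lies between the two horizons, in line with the statement that the slowing of the radial velocity sets in somewhere inside the outer horizon.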

If the initial Reissner-Nordström phase were to continue, then the radial 4-gradient $\beta_m$ would become
lightlike. In the Reissner-Nordström geometry this does in fact happen, and where it happens defines the
inner horizon. The problem with this is that the lightlike 4-vector $\beta_m$ points in one direction for outgoing
frames, and in the opposite direction for ingoing frames. If $\beta_m$ becomes lightlike, then outgoing and ingoing
frames are streaming through each other at the speed of light. This is the infinite blueshift at the inner
horizon first pointed out by Penrose (1968).

If there were no matter present, or if there were only one stream of matter, either outgoing or ingoing but
not both, then $\beta_m$ could indeed become lightlike. But if both outgoing and ingoing matter are present, even
in the tiniest amount, then it is physically impossible for the outgoing and ingoing frames to stream through
each other at the speed of light.
Figure 21.6 Spacetime diagrams of the tetrad-frame 4-vector $\beta_m$, equation (20.9), illustrating qualitatively the three
successive phases of mass inflation: 1. (top) the Reissner-Nordström phase, where inflation ignites; 2. (middle) the
inflationary phase itself; and 3. (bottom) the collapse phase, where inflation comes to an end. In each diagram, the
arrowed lines labelled outgoing and ingoing illustrate two representative examples of the 4-vector $\{\beta_0, \beta_1\}$, while the
double-arrowed lines illustrate the rate of change of these 4-vectors implied by Einstein’s equations (20.62). Inside
the horizon of a black hole, all locally inertial frames necessarily fall inward, so the radial velocity $\beta_0 \equiv \partial_0 r$ is always
negative. A locally inertial frame is outgoing or ingoing depending on whether the proper radial gradient $\beta_1 \equiv \partial_1 r$
measured in that frame is negative or positive.


If both outgoing and ingoing streams are present, then as they race through each other ever faster, they 
generate a radial pressure p, and an energy flux f, which begin to take over as the main source on the right 
hand side of the Einstein equations (20.62). This is how mass inflation is ignited. 
21.6.2 Inflationary phase 


The infalling matter now enters the second, mass inflationary phase, illustrated in the middle panel of 
Figure 21.6. 

During this phase, the gravitational force on the right hand side of the Einstein equation (20.62a) is
dominated by the pressure $p$ produced by the counter-streaming outgoing and ingoing matter. The mass $M$
is completely sub-dominant during this phase (in this respect, the designation “mass inflation” is misleading,
since although the mass inflates, it does not drive inflation). The counter-streaming pressure $p$ is positive,
and so accelerates the radial velocity $\beta_0$ (makes it more negative). At the same time, the radial gradient
$\beta_1$ is being driven by the energy flux $f$, equation (20.62b). For typically low accretion rates, the streams
are cold, in the sense that the streaming energy density greatly exceeds the thermal energy density, even
if the accreted material is at relativistic temperatures. This follows from the fact that for mass inflation to
begin, the gravitational force produced by the counter-streaming pressure $p$ must become comparable to that
produced by the mass $M$, which for streams of low proper density requires a hyper-relativistic streaming
velocity. For a cold stream of proper density $\rho$ moving at 4-velocity $u^m = \{u^0, u^1, 0, 0\}$, the streaming energy
flux would be $f \sim \rho u^0 u^1$, while the streaming pressure would be $p \sim \rho (u^1)^2$. Thus their ratio $f/p \sim u^0/u^1$
is slightly greater than one. It follows that, as illustrated in the middle panel of Figure 21.6, the change in
$\beta_1$ slightly exceeds the change in $\beta_0$, which drives the 4-vector $\beta_m$, already nearly lightlike, to be even more
nearly lightlike. This is mass inflation.
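How close to one the flux-to-pressure ratio sits for a hyper-relativistic cold stream follows from the normalization of the 4-velocity, which fixes the time component once the radial component is given. A minimal sketch, with a hypothetical function name:

```python
import math


def flux_to_pressure_ratio(u1):
    """Ratio f/p ~ u^0/u^1 for a cold stream with radial 4-velocity component u1,
    using the normalization (u^0)^2 - (u^1)^2 = 1."""
    u0 = math.sqrt(1.0 + u1**2)
    return u0 / u1


for u1 in (1.0, 10.0, 1e3):
    excess = flux_to_pressure_ratio(u1) - 1.0
    print(f"u1 = {u1:g}: f/p - 1 = {excess:.3g}")
```

The excess over one falls off as 1/(2 (u^1)^2) at large streaming velocity, so the flux always exceeds the pressure, but only marginally once the streams are hyper-relativistic.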

Inflation feeds on itself. The radial pressure p and energy flux f generated by the counter-streaming 
outgoing and ingoing streams increase the gravitational force. But, as illustrated in the middle panel of 
Figure 21.6, the gravitational force acts in opposite directions for outgoing and ingoing streams, tending to 
accelerate the streams faster through each other. An intuitive way to understand this is that the gravitational 
force is always inwards, meaning in the direction of smaller radius, but the inward direction is towards the 
black hole for ingoing streams, and away from the black hole for outgoing streams. 

The feedback loop in which the streaming pressure and flux increase the gravitational force, which ac- 
celerates the streams faster through each other, which increases the streaming pressure and flux, is what 
drives mass inflation. Inflation produces an exponential growth in the streaming energy, and along with it 
the interior mass, and the Weyl curvature. 


21.6.3 Collapse phase 


It might seem that inflation is locked into an exponential growth from which there is no exit. But the Einstein 
equations (20.62) have one more trick up their sleeve.

For the counter-streaming velocity to continue to increase requires that the change in $\beta_1$ from
equation (20.62b) continues to exceed the change in $\beta_0$ from equation (20.62a). This remains true as long as the
counter-streaming pressure p and energy flux f continue to dominate the source on the right hand side of
the equations. But the mass term $-M/r^2$ also makes a contribution to the change in $\beta_0$, equation (20.62a).
It turns out (Hamilton and Avelino, 2010) that, at least in the case of collisionless streams, the mass term
exponentiates slightly faster than the pressure term (in Exercise 21.1, for example, this occurs because in
equation (21.28) there is a $+\beta^2$ term in the numerator and a $-\beta^2$ term in the denominator). At a
certain point, the additional acceleration produced by the mass means that the combined gravitational force
$M/r^2 + 4\pi r p$ exceeds $4\pi r f$. Once this happens, the 4-vector $\beta_m$, instead of being driven to become more
lightlike, starts to become less lightlike. That is, the counter-streaming velocity starts to slow. At that point
inflation ceases, and the streams quickly collapse to zero radius.

It is ironic that it is the increase of mass that brings mass inflation to an end. Not only does mass not 
drive mass inflation, but as soon as mass begins to contribute significantly to the gravitational force, it brings 
mass inflation to an end. 


21.7 The far future? 


The Penrose diagram of a Reissner-Nordström, Figure 8.7, or Kerr-Newman black hole indicates that an
observer who passes through the outgoing inner horizon sees the entire future of the outside universe go by. 
In a sense, this is “why” the outside universe appears infinitely blueshifted. 

This raises the question of whether what happens at the outgoing inner horizon of a real black hole indeed 
depends on what happens in the far future. If it did, then the conclusions of §21.6, which are based in part 
on the proposition that the accretion rate is approximately constant, would be suspect. A lot can happen 
in the far future, such as black hole mergers, the Universe ending in a big crunch, Hawking evaporation, or 
something else beyond our current ken. 

Outgoing and ingoing observers both see each other highly blueshifted near the inner horizon. An outgoing 
observer sees ingoing observers from the future, while an ingoing observer sees outgoing observers from the 
past. Each stream sees approximately one black hole crossing time elapse on the opposing stream for each 
e-fold increase in blueshift (Hamilton and Avelino, 2010). 

For astronomically realistic black holes, exponentiating the Weyl curvature up to the Planck scale will take 
typically a few hundred e-folds of blueshift, as illustrated for example in Figure 21.4. Thus what happens at 
the inner horizon of a realistic black hole before quantum gravity intervenes depends only on the immediate 
past and future of the black hole — a few hundred black hole crossing times — not on the distant future 
or past. This conclusion holds even if the accretion rate of one of the outgoing or ingoing streams is tiny 
compared to the other. 

From a stream’s own point of view on the other hand, the entire inflationary episode goes by in a flash. 


21.8 Weak null singularity on the Cauchy horizon? 


It is commonly stated in the literature that the generic outcome of inflation is a “weak null singularity on the 
Cauchy horizon.” Weak means that the tidal force, the Weyl curvature, exponentiates to infinity in a finite 
amount of proper time. Null refers to the fact that the streaming velocity between outgoing and ingoing 
streams reaches the speed of light. 

In my view this conclusion is incorrect. The conclusion is an artefact of assuming that after collapsing, 
a black hole remains isolated for ever, whereas real astronomical black holes accrete, cosmic microwave
background photons if nothing else. Moreover the conclusion of a weak null singularity ignores the fact that 
the diverging tidal force is likely to result in diverging pair creation, and such pairs would surely act as an 
effective source of accretion, again precipitating collapse. 

The fact that a volume element remains little distorted during inflation even though the tidal force, as 
measured by the Weyl curvature scalar, exponentiates to huge values was first pointed out by Ori (1991). 
The physical reason for the small tidal distortion despite the huge tidal force is that the proper time over 
which the force operates is tiny. 

Dafermos (2005) has proved a number of mathematical theorems that establish that a null singularity 
forms on the Cauchy horizon of a charged spherical black hole accreting a massless scalar field. The situation 
envisaged by the theorems is that of a black hole that collapses and thereafter remains isolated. The collapse 
generates an outgoing Price tail of radiation. The theorems assume that the outgoing Price radiation falls off 
sufficiently rapidly along outgoing null geodesics, and Dafermos and Rodnianski (2005) have proved that the 
required condition on the Price radiation holds for an isolated spherical black hole accreting a massless scalar 
field. The theorems confirm the several analytic and numerical studies that have found a null singularity on 
the Cauchy horizon (Ori, 1991; Bonanno et al., 1994b; Brady and Smith, 1995; Burko, 1997; Burko and Ori, 
1998; Hod and Piran, 1998a; Hod and Piran, 1998b; Ori, 1999; Hansen, Khokhlov, and Novikov, 2005). 

Burko (2002; 2003) finds numerically that a null singularity forms only if the scalar field set up outside 
the horizon falls off sufficiently rapidly, the required degree of rapidity depending on the parameters of the 
problem, such as the charge-to-mass ratio of the black hole. If too much scalar field continues to be accreted, 
then no null singularity forms, and the field collapses to a central singularity. 

All the results are consistent with the estimate (21.16) that the interior mass inflates exponentially with
an exponent inversely proportional to the mass accretion rate $\dot{M}_\bullet$. If the accretion rate goes to zero, $\dot{M}_\bullet \to 0$,
then the exponential growth rate becomes infinite, leading to a weak null singularity.

Frolov, Kristjansson, and Thorlacius (2006) have shown that in the simplified case of a 1+1-dimensional 
charged black hole, if the effects of pair creation of charged particles are taken into account, then the result 
is collapse to a spacelike singularity rather than a null singularity on the Cauchy horizon. The result is 
consistent with the argument of the present chapter that as long as there is any source that continues to
replenish outgoing and ingoing streams near the inner horizon, the ultimate result will be collapse to a 
spacelike singularity. The results of Frolov, Kristjansson, and Thorlacius (2006) suggest that even without 
any direct accretion, pair creation provides a sufficient source of outgoing and ingoing streams. 

Exercise 21.1. A collisionless two-stream model of inflation. This problem is from Hamilton and
Avelino (2010). Some of the equations below repeat equations elsewhere in this book, but they are left as is 
so that the problem remains self-contained. 

Einstein’s equations in a spherically symmetric spacetime imply that the covariant rate of change of the
radial 4-gradient $\beta_m \equiv \partial_m r = \{\partial_0 r, \partial_1 r, 0, 0\}$ in the frame of any radially moving orthonormal tetrad is
(these are equations (20.62))

$$D_0 \beta_0 = - \frac{M}{r^2} - 4 \pi r p , \tag{21.20a}$$

$$D_0 \beta_1 = 4 \pi r f , \tag{21.20b}$$

where $D_0$ is the tetrad-frame covariant time derivative, p is the radial pressure, f is the radial energy flux,
and M is the interior mass defined by (this is equation (20.11))

$$1 - \frac{2M}{r} \equiv - \beta^2 \equiv \beta_m \beta^m = - \beta_0^2 + \beta_1^2 . \tag{21.21}$$


1. Freely-falling stream. Consider a stream of matter that is freely falling radially inside the horizon of
a spherically symmetric black hole. Let u be the radial component of the tetrad-frame 4-velocity $u^m$ of
the stream relative to the “no-going” frame where $\beta_1 = 0$ (the frame of reference that divides outgoing
frames $\beta_1 < 0$ from ingoing frames $\beta_1 > 0$):

$$u^m \equiv \{ -\beta_0/\beta , \, -\beta_1/\beta , \, 0 , \, 0 \} = \{ \sqrt{1 + u^2} , \, u , \, 0 , \, 0 \} . \tag{21.22}$$

Note that $\beta_0$ is negative inside the horizon for both outgoing and ingoing frames. The time component
$u^0 = -\beta_0/\beta = \sqrt{1 + u^2}$ of the tetrad-frame 4-velocity is positive (as it should be for a proper 4-velocity),
while the radial component $u \equiv u^1 = -\beta_1/\beta$ of the tetrad-frame 4-velocity is positive outgoing, negative
ingoing. Show that along the worldline of the stream

$$\frac{d \ln \beta}{d \ln r} = - \frac{1}{\beta^2} \left[ \frac{M}{r} + 4 \pi r^2 \left( p + \frac{\beta_1}{\beta_0} f \right) \right] , \tag{21.23a}$$

$$\frac{d \ln u}{d \ln r} = \frac{1}{\beta^2} \left[ \frac{M}{r} + 4 \pi r^2 \left( p + \frac{\beta_0}{\beta_1} f \right) \right] . \tag{21.23b}$$

[Hint: If the stream is freely falling, then the proper time derivative $\partial_0$ in the tetrad frame of the stream
equals the covariant time derivative $D_0$. Thus the proper rates of change of $\ln \beta$ and $\ln u$ with respect
to $\ln r$ along the worldline of the stream are

$$\frac{d \ln \beta}{d \ln r} = \frac{\partial_0 \ln \beta}{\partial_0 \ln r} , \quad \frac{d \ln u}{d \ln r} = \frac{\partial_0 \ln u}{\partial_0 \ln r} . \tag{21.24}$$

These can be evaluated through

$$\partial_0 \ln \beta = D_0 \ln \beta = \frac{1}{2 \beta^2} D_0 \beta^2 = \frac{1}{2 \beta^2} D_0 ( \beta_0^2 - \beta_1^2 ) = \frac{1}{\beta^2} ( \beta_0 D_0 \beta_0 - \beta_1 D_0 \beta_1 ) , \tag{21.25a}$$

$$\partial_0 \ln u = D_0 \ln u = D_0 \ln \beta_1 - D_0 \ln \beta = \frac{1}{\beta_1} D_0 \beta_1 - D_0 \ln \beta , \tag{21.25b}$$

$$\partial_0 \ln r = \frac{1}{r} \partial_0 r = \frac{\beta_0}{r} , \tag{21.25c}$$

with Einstein’s equations (21.20) substituted into equations (21.25a) and (21.25b).]

2. Equal outgoing and ingoing streams. Consider the symmetrical case of two equal streams of radially
outgoing ($\beta_1 < 0$) and ingoing ($\beta_1 > 0$) neutral, pressureless, non-interacting matter (“dust”), each of
proper density $\rho$ in their own frames, freely-falling into a charged black hole. Show that

$$\frac{d \ln \beta}{d \ln r} = - \frac{1}{2 \beta^2} \left( - \lambda + \beta^2 + \mu u^2 \right) , \tag{21.26a}$$

$$\frac{d \ln u}{d \ln r} = - \frac{1}{2 \beta^2} \left( \lambda - \beta^2 + \mu + \mu u^2 \right) , \tag{21.26b}$$

where

$$\lambda \equiv Q^2 / r^2 - 1 , \quad \mu \equiv 16 \pi r^2 \rho . \tag{21.27}$$

Hence conclude that

$$\frac{d \ln \beta}{d \ln u} = \frac{- \lambda + \beta^2 + \mu u^2}{\lambda - \beta^2 + \mu + \mu u^2} . \tag{21.28}$$

[Hint: The assumption that the streams are neutral, pressureless, and non-interacting is needed to make
the streams freely-falling, so that equations (21.23) are valid. The pressure p in the tetrad frame of each
stream is the sum of the electromagnetic pressure $p_e$ and the streaming pressure $p_s$

$$p = p_e + p_s . \tag{21.29}$$

The electromagnetic pressure $p_e$ is

$$p_e = - \frac{Q^2}{8 \pi r^4} \tag{21.30}$$

with Q the charge of the black hole, which is constant because the infalling streams are neutral. The
streaming pressure $p_s$ that each stream sees is

$$p_s = \rho ( u_s^1 )^2 , \tag{21.31}$$

where the streaming 4-velocity $u_s^m$ between the two streams is the 4-velocity of the observed stream
Lorentz-boosted by the 4-velocity of the observing stream (the radial velocities $u^1$ of the observed and
observing streams have opposite signs)

$$u_s^0 = (u^0)^2 + (u^1)^2 = 1 + 2 u^2 , \tag{21.32a}$$

$$u_s^1 = - 2 u^0 u^1 = - 2 u \sqrt{1 + u^2} . \tag{21.32b}$$

The energy flux f in the tetrad frame of each stream is the streaming flux $f_s$

$$f = f_s = \rho u_s^0 u_s^1 . \tag{21.33}$$

You should find that the combinations of streaming pressure and flux that go into equations (21.23) are

$$p_s + \frac{\beta_1}{\beta_0} f_s = 2 \rho u^2 , \tag{21.34a}$$

$$p_s + \frac{\beta_0}{\beta_1} f_s = - 2 \rho ( 1 + u^2 ) . \tag{21.34b}$$

]


3. Reissner-Nordström phase. If the accretion rate is small, then initially the stream density $\rho$ is small,
and consequently $\mu$ is small. Argue that in this regime equation (21.28) simplifies to

$$\frac{d \ln \beta}{d \ln u} = \frac{- \lambda + \beta^2}{\lambda - \beta^2} . \tag{21.35}$$

Hence conclude that

$$\beta = \frac{C}{u} , \tag{21.36}$$

where C is some constant set by initial conditions (generically, C will be of order unity).


4. Transition to mass inflation. Argue that in the Reissner-Nordström phase, $\beta$ becomes small, and u
grows large, as the streams fall to smaller radius r. Argue that in due course equation (21.28) becomes
well-approximated by

$$\frac{d \ln \beta}{d \ln u} = \frac{- \lambda + \mu u^2}{\lambda + \mu u^2} . \tag{21.37}$$

Treating $\lambda$ and $\mu$ as constants (which is a good approximation), show that the solution to equation (21.37)
subject to the initial condition set by equation (21.36) is

$$\beta = \frac{C ( \lambda + \mu u^2 )}{\lambda u} . \tag{21.38}$$

[Hint: $\lambda$ is positive. In the Reissner-Nordström solution, $\beta$ would go to zero at the inner horizon.]


5. Sketch. Sketch the solution (21.38), plotting u against $\beta$ on logarithmic axes. Mark the regime where
mass inflation is occurring.

6. Inflationary growth rate. Argue that during mass inflation, where $\mu u^2 \gg \lambda$, the inflationary growth rate $d \ln \beta / d \ln r$
is

$$\frac{d \ln \beta}{d \ln r} \approx - \frac{\lambda^2}{2 C^2 \mu} . \tag{21.39}$$

Comment on how the inflationary growth rate depends on accretion rate (on $\rho$).
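
The algebra in parts 2 and 4 of the exercise can be spot-checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the original problem: it evaluates the streaming combinations of equations (21.34) directly from the definitions (21.31) to (21.33), and it integrates equation (21.37) in $\ln u$ starting from the Reissner-Nordström relation (21.36), for comparison with the closed form (21.38). The function names, the parameter values (C = 1, λ = 1, μ = 10⁻⁶), and the choice of Simpson quadrature are all arbitrary choices made here.

```python
import math

def streaming_combinations(rho, u):
    """Return the left-hand sides of (21.34a) and (21.34b) for density rho, velocity u."""
    u0 = math.sqrt(1.0 + u * u)      # time component of the stream 4-velocity, (21.22)
    us0 = 1.0 + 2.0 * u * u          # (21.32a): boosted time component
    us1 = -2.0 * u * u0              # (21.32b): boosted radial component
    p_s = rho * us1 ** 2             # (21.31): streaming pressure
    f_s = rho * us0 * us1            # (21.33): streaming energy flux
    b1_over_b0 = u / u0              # beta_1/beta_0 = u^1/u^0, from (21.22)
    return p_s + b1_over_b0 * f_s, p_s + f_s / b1_over_b0

def beta_closed_form(u, C, lam, mu):
    """Closed-form solution (21.38)."""
    return C * (lam + mu * u * u) / (lam * u)

def beta_integrated(u_start, u_end, C, lam, mu, steps=20000):
    """Integrate (21.37) in x = ln u, starting from beta = C/u as in (21.36)."""
    def slope(x):
        m = mu * math.exp(2.0 * x)   # mu u^2
        return (m - lam) / (lam + m)
    x = math.log(u_start)
    h = (math.log(u_end) - x) / steps
    ln_beta = math.log(C / u_start)
    for _ in range(steps):
        # The slope depends on x only, so each step is a Simpson's-rule panel.
        ln_beta += h * (slope(x) + 4.0 * slope(x + 0.5 * h) + slope(x + h)) / 6.0
        x += h
    return math.exp(ln_beta)

# Illustrative values (assumptions): rho = 1, u = 3; C = 1, lambda = 1, mu = 1e-6.
print(streaming_combinations(1.0, 3.0))
print(beta_integrated(1e-3, 1e5, 1.0, 1.0, 1e-6), beta_closed_form(1e5, 1.0, 1.0, 1e-6))
```

The starting velocity is chosen small enough that $\mu u^2 \ll \lambda$, so that the initial condition (21.36) is an accurate approximation there.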


21.9 Black hole accreting a fluid with an ultrahard equation of state 


Poisson & Israel’s (1990) original proposal was that mass inflation would be driven by a “Price tail” (Price,
1972) of gravitational radiation generated during the initial collapse of a black hole. But gravitational
radiation is spin 2, which cannot be accommodated by a spherically symmetric spacetime. There are no spherical
gravitational waves; the lowest order harmonic of gravitational waves is quadrupole ($\ell = 2$).

This has motivated the most common approach in the literature to modeling inflation in spherical
spacetimes, which is to allow the black hole to accrete a massless scalar (spin 0) field, which does admit spherical
($\ell = 0$) waves moving at the speed of light (Christodoulou, 1986; Goldwirth and Tsvi, 1987; Gnedin and
Gnedin, 1993; Bonanno et al., 1994a; Brady, 1995; Brady and Smith, 1995; Burko, 1997; Burko and Ori,
1998; Burko, 1999; Husain and Olivier, 2001; Burko, 2002; Burko, 2003; Martin-Garcia and Gundlach, 2003;
Dafermos, 2005; Hansen, Khokhlov, and Novikov, 2005; Hod and Piran, 1997; Hod and Piran, 1998a; Hod
and Piran, 1998b; Sorkin and Piran, 2001; Oren and Piran, 2003; Dafermos, 2005; Dafermos and Rodnianski,
2005).

Figure 21.7 Similar to Figure 21.2, but instead of a relativistic fluid, the black hole accretes a charged fluid $\phi$ with
an ultrahard equation of state w = 1, which means that the speed of sound equals the speed of light. The fluid
therefore supports relativistic counter-streaming, as a result of which mass inflation occurs just above the erstwhile
inner horizon. The mass is $M_\bullet = 4 \times 10^6 \, M_\odot$, the accretion rate $\dot{M}_\bullet = 10^{-16}$, the charge-to-mass $Q_\bullet/M_\bullet = 10^{-5}$,
and the conductivity is zero.

No massless scalar field has been observed in nature, although a massive scalar field, the Higgs boson, has 
been observed by the Large Hadron Collider with a mass of ~ 125 GeV (ATLAS Collaboration, 2012), and 
it is likely that cosmological inflation was driven by a massive scalar field possibly with a mass around a 
GUT mass. 

An alternative way to model inflation with a single fluid is with a perfect fluid with sound speed equal 
to the speed of light, w = 1. This kind of fluid is called ultrahard. An ultrahard fluid is not the same as
a scalar field, but shares some of its properties (Babichev et al., 2008), notably that it supports spherical 
waves moving at the speed of light. 

Figure 21.7 shows a black hole that accretes a charged, non-conducting fluid with this ultrahard equation
of state. The parameters are otherwise the same as in Figure 21.2: a mass of $M_\bullet = 4 \times 10^6 \, M_\odot$, an accretion
rate of $\dot{M}_\bullet = 10^{-16}$, and a black hole charge-to-mass of $Q_\bullet/M_\bullet = 10^{-5}$. As the Figure shows, mass inflation
takes place just above the place where the inner horizon would be. During mass inflation, the density $\rho_\phi$ and
the Weyl scalar C exponentiate rapidly up to the Planck scale and beyond. The outcome is quite similar to
that of the two-fluid accretion model of Figure 21.3.


21.10 Black hole accreting a conducting charged plasma 


As discussed in the introduction to this Chapter, the question of how much entropy might be created inside 
the horizon of a black hole has fundamental implications for the Black Hole Information Paradox. This 
section illustrates the problem with a toy model in which a spherical black hole accretes a plasma that not 
only is charged but also has a finite conductivity, so that dissipation can occur, creating entropy inside the 
horizon. The model is not realistic, but the problem it illustrates is a real one. 


21.10.1 Entropy creation 


Bekenstein (1973) first argued that a black hole should have a quantum entropy proportional to its horizon
area A, and Hawking (1974) supplied the constant of proportionality 1/4 in Planck units. The
Bekenstein-Hawking entropy $S_{\rm BH}$ is, in Planck units $c = G = \hbar = 1$,

$$S_{\rm BH} = \frac{A}{4} . \tag{21.40}$$

For a spherical black hole of horizon radius $R_h$, the area is $A = 4 \pi R_h^2$. Hawking showed that a black hole
has a temperature $T_H$ equal to $1/(2\pi)$ times the surface gravity $\kappa$ at its horizon, again in Planck units,

$$T_H = \frac{\kappa}{2 \pi} . \tag{21.41}$$

For a spherical black hole, the surface gravity is $\kappa = - D_0 \beta_0 = M/r^2 + 4 \pi r p$ evaluated at the horizon,
equation (20.62a).
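
To give a feel for the magnitudes involved, the following Python sketch evaluates the Bekenstein-Hawking entropy (21.40) and Hawking temperature (21.41) for a black hole of 4 × 10⁶ solar masses. It is a minimal illustration under assumptions that are not the accreting, charged models of this section: the black hole is taken to be Schwarzschild, so that the horizon radius is $R_h = 2M$ and, with p = 0, the surface gravity is $\kappa = M/R_h^2 = 1/(4M)$. The function names are hypothetical, and the unit-conversion constants are rounded standard values supplied here, not taken from the text.

```python
import math

# Assumed conversion constants (rounded standard values; not from the text).
SOLAR_MASS_KG = 1.98847e30
PLANCK_MASS_KG = 2.176434e-8
PLANCK_TEMPERATURE_K = 1.416784e32

def schwarzschild_entropy(mass):
    """S_BH = A/4 with A = 4 pi R_h^2 and R_h = 2M, all in Planck units."""
    horizon_radius = 2.0 * mass
    return math.pi * horizon_radius ** 2

def schwarzschild_temperature(mass):
    """T_H = kappa/(2 pi) with kappa = M/R_h^2 = 1/(4M), in Planck units (p = 0)."""
    kappa = mass / (2.0 * mass) ** 2
    return kappa / (2.0 * math.pi)

# Mass of the black hole in Planck masses.
mass = 4.0e6 * SOLAR_MASS_KG / PLANCK_MASS_KG
print("S_BH (Planck units) =", schwarzschild_entropy(mass))
print("T_H (kelvin)        =", schwarzschild_temperature(mass) * PLANCK_TEMPERATURE_K)
```

In this Schwarzschild limit the product $S_{\rm BH} T_H$ equals M/2, which provides a quick consistency check on the two functions.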

The proper velocity of the baryonic fluid through the similarity frame equals $\xi^1/\xi^0$, equation (20.145).
Thus the entropy $S_b$, equation (21.13), accreted through the horizon, at conformal radius $r_h$, per unit proper
time of the fluid is

$$\frac{d S_b}{d \tau_b} = \left. \frac{4 \pi R^2 \xi^1 ( 1 + w_b ) \rho_b}{\xi^0 T_b} \right|_{r = r_h} . \tag{21.42}$$

Meanwhile the horizon radius $R_h$ expands in proportion to the conformal factor, $R_h \propto e^{vt}$, and
$dt/d\tau_b = \partial_0 t = 1/(R \xi^0)$, so the Bekenstein-Hawking entropy $S_{\rm BH} = \pi R_h^2$ increases as

$$\frac{d S_{\rm BH}}{d \tau_b} = \left. \frac{2 \pi R_h^2 v}{R \xi^0} \right|_{r = r_h} . \tag{21.43}$$

Putting (21.42) and (21.43) together implies that the entropy $S_b$ accreted through the horizon per unit
increase of the Bekenstein-Hawking entropy $S_{\rm BH}$ is

$$\frac{d S_b}{d S_{\rm BH}} = \left. \frac{2 R^3 \xi^1 ( 1 + w_b ) \rho_b}{R_h^2 v T_b} \right|_{r = r_h} . \tag{21.44}$$

Inside the sonic point, dissipation increases the entropy according to equation (20.198). The entropy varies
as $S_b \propto R^3 \xi^1 ( 1 + w_b ) \rho_b / T_b$, equation (21.13) with volume $V \propto R^3 \xi^1$, so the rate of increase of the entropy
of the black hole, evaluated down to any radius, per unit increase of its Bekenstein-Hawking entropy, is

$$\frac{d S_b}{d S_{\rm BH}} = \frac{2 R^3 \xi^1 ( 1 + w_b ) \rho_b}{R_h^2 v T_b} , \tag{21.45}$$

which looks the same as equation (21.44) but now evaluated at any radius.

21.10.2 Black hole accreting a conducting relativistic plasma 


If the electrical conductivity of the plasma is small, then the solutions resemble the non-conducting solutions 
of §21.3. But if the conductivity is large enough effectively to neutralize the plasma as it approaches the 
centre, then the plasma can plunge all the way to the central singularity, as in the uncharged case in §21.2. 
The most entropy is created inside the black hole when the conductivity is tuned to equal, within numerical 
accuracy, the critical conductivity above which the plasma collapses to a central singularity. 

Figure 21.8 shows the case where the conductivity equals the critical conductivity, here $\kappa_b = 1.24$. The
parameters are otherwise the same as in §21.3, a mass of $M_\bullet = 4 \times 10^6 \, M_\odot$, an accretion rate $\dot{M}_\bullet = 10^{-16}$, an
equation of state $w_b = 0.32$, and a black hole charge-to-mass of $Q_\bullet/M_\bullet = 10^{-5}$. The model is from Wallace,
Hamilton, and Polhemus (2008).

Figure 21.8 Here the baryonic plasma falling into the black hole is charged, and electrically conducting. The
conductivity is set equal (within numerical accuracy) to the critical conductivity above which the plasma plunges to a central
singularity, since this leads to maximum entropy production inside the horizon. The mass is $M_\bullet = 4 \times 10^6 \, M_\odot$, the
accretion rate $\dot{M}_\bullet = 10^{-16}$, the equation of state $w_b = 0.32$, the charge-to-mass $Q_\bullet/M_\bullet = 10^{-5}$, and the conductivity
parameter $\kappa_b = 1.24$. Arrows show how quantities vary a factor of 10 into the past and future.

The solution at the critical conductivity exhibits the periodic self-similar behaviour first discovered in 
numerical simulations by Choptuik (1993), and known as “critical collapse” because it happens at the bor- 
derline between solutions that do and do not collapse to a black hole. The ringing of curves in Figure 21.8 
is a manifestation of the self-similar periodicity, not a numerical error. 

These solutions are not subject to the mass inflation instability, and they could potentially be prototypical 
of the behaviour inside realistic rotating black holes. For this to work, the outward transport of angular 
momentum inside a rotating black hole must be large enough effectively to produce zero angular momentum 
at the centre. Given that angular momentum transport is a rather weak process (Balbus and Hawley, 1998), 
it seems likely that real rotating black holes do not dissipate all their spin, and that inflation does occur in 
reality. 

Figure 21.8 shows that the entropy produced by Ohmic dissipation inside the black hole can potentially
exceed the Bekenstein-Hawking entropy of the black hole by a large factor. The Figure shows the rate
$dS_b/dS_{\rm BH}$ of increase of entropy per unit increase in its Bekenstein-Hawking entropy. The rate includes
entropy generated down to radius R; the entropy increases inward because of dissipation. The rate hits
unity, $dS_b/dS_{\rm BH} \sim 1$, at a radius of about $10^{-10}$ of the horizon radius. If the increase of entropy is followed
to where the curvature hits the Planck scale, $|C| \sim 1$, then the entropy relative to Bekenstein-Hawking is
$dS_b/dS_{\rm BH} \sim 10^{10}$.

Figure 21.9 This black hole creates a lot of entropy by having a large charge-to-mass $Q_\bullet/M_\bullet = 0.8$ and a low accretion
rate $\dot{M}_\bullet = 10^{-28}$, but otherwise the same parameters as in Figure 21.8. The conductivity parameter $\kappa_b = 1.24$ is
again at the critical value above which the plasma plunges to a central singularity.

Since the model is self-similar, the shape of the curves in Figure 21.8 is fixed with respect to conformal
units, but the conversion to proper (in this case Planck) units varies; the arrows show how the curves vary a
factor of 10 into the past and future. If the entropy accumulates additively, then the instantaneous rate $dS_b/dS_{\rm BH}$
shown in the Figure can be interpreted as approximately the cumulative entropy created inside the black
hole relative to the Bekenstein-Hawking entropy.

If the entropy created inside a black hole exceeds the Bekenstein-Hawking entropy (here by a factor of
$\sim 10^{10}$) and the black hole later evaporates radiating only the Bekenstein-Hawking entropy, then entropy
is destroyed, violating the second law of thermodynamics.

This startling conclusion is premised on the assumption that entropy created inside a black hole accumu- 
lates additively, which in turn derives from the assumption that the Hilbert space of states is multiplicative 
over spacelike-separated regions. This assumption, called locality, derives from the fundamental proposition 
of quantum field theory in flat space that field operators at spacelike-separated points commute. This rea- 
soning is essentially the same as originally led Hawking (1976) to conclude that black holes must destroy 
information. 

Generally, the smaller the accretion rate $\dot{M}_\bullet$, the more entropy is produced. If moreover the
charge-to-mass $Q_\bullet/M_\bullet$ is large, then the entropy can be produced closer to the outer horizon. Figure 21.9 shows a
model with a relatively large charge-to-mass $Q_\bullet/M_\bullet = 0.8$, and a low accretion rate $\dot{M}_\bullet = 10^{-28}$. The large
charge-to-mass ratio in spite of the relatively high conductivity requires force-feeding the black hole: the
sonic point must be pushed to just above the horizon. The large charge and high conductivity lead to a burst
of entropy production just beneath the horizon.

Figure 21.10 Penrose diagram of the accreting, dissipating black hole of Figures 21.8 or 21.9. The entropy passing
through the spacelike slice before the black hole evaporates ($S \gg S_{\rm BH}$) exceeds that passing through the spacelike
slice after the black hole evaporates ($S \sim S_{\rm BH}$), apparently violating the second law of thermodynamics. However, the
entropy passing through any null slice respects the second law ($S < S_{\rm BH}$), consistent with Bousso’s (2002) covariant
entropy bound. Near the singularity there is a proliferation of spacelike-separated patches of spacetime that cease to
be in causal contact because their future lightcones cease to intersect. To preserve the second law of thermodynamics,
locality must break down across these spacelike-separated patches.


21.10.3 Holography 


The idea that the entropy of a black hole cannot exceed its Bekenstein-Hawking entropy has motivated 
holographic conjectures that the degrees of freedom of a volume are somehow encoded on its boundary, and 
consequently that the entropy of a volume is bounded by those degrees of freedom. Various counter-examples 
dispose of most simple-minded versions of holographic entropy bounds. The most successful entropy bound, 
with no known counter-examples, is Bousso’s (2002) covariant entropy bound. The covariant entropy 
bound concerns not just any old 3-dimensional volume, but rather the 3-dimensional volume formed by a 
null hypersurface, a lightsheet. For example, the horizon of a black hole is a null hypersurface, a lightsheet. 
The covariant entropy bound asserts that the entropy that passes (inward or outward) through a lightsheet 
that is everywhere converging cannot exceed 1/4 of the 2-dimensional area of the boundary of the lightsheet. 

In the self-similar black holes under consideration, the horizon is expanding, and outgoing lightrays that 
sit on the horizon do not constitute a converging lightsheet. However, a spherical shell of ingoing lightrays
that starts on the horizon falls inwards and therefore does form a converging lightsheet, and a spherical shell
of outgoing lightrays that starts just slightly inside the horizon also falls inward and forms a converging
lightsheet. The rate at which entropy $S_b$ passes through such outgoing or ingoing spherical lightsheets per
unit decrease in the area $S_{\rm cov} \equiv \pi R^2$ of the lightsheet is

$$\left| \frac{d S_b}{d S_{\rm cov}} \right| = \frac{d S_b}{d S_{\rm BH}} \, \frac{R_h^2 v}{R^2 \xi^1 | \beta_{b,0} \pm \beta_{b,1} |} = \frac{2 R ( 1 + w_b ) \rho_b}{| \beta_{b,0} \pm \beta_{b,1} | T_b} , \tag{21.46}$$

in which the sign is + for outgoing, − for ingoing lightsheets. A sufficient condition for Bousso’s covariant
entropy bound to be satisfied is

$$| d S_b / d S_{\rm cov} | \leq 1 . \tag{21.47}$$


The same ideas that motivate holography also rescue the second law. If the future lightcones of spacelike- 
separated points do not intersect, then the points are permanently out of communication, and can behave 
like alternate quantum realities, like Schrödinger’s dead-and-alive quantum cat. Just as it is not legitimate
to add the entropies of the dead cat and the live cat, so also it is apparently not legitimate to add the
entropies of regions inside a black hole whose future lightcones do not intersect. The states of such separated 
regions, instead of being distinct, are quantum entangled with each other. 

Figures 21.8 and 21.9 show that the rate |dSb/dScov| at which entropy passes through outgoing or ingoing 
spherical lightsheets is less than one at all scales below the Planck scale. This shows not only that the black 
holes obey Bousso’s covariant entropy bound, but also that no individual observer inside the black hole sees 
more than the Bekenstein-Hawking entropy on their lightcone. No observer actually witnesses a violation of 
the second law. 

The Penrose diagram 21.10 illustrates the proliferation of spacetime patches near the singularity that 
become causally disconnected because their future lightcones cease to intersect. Holography requires that 
patches are quantum entangled with each other so that the quantum degrees of freedom of volumes inside 
the black hole are the same Bekenstein-Hawking degrees of freedom regardless of who is observing them. 


21.11 Weird stuff at the outer horizon? 


A number of papers have suggested that a magical phase transition at, or just outside, the outer horizon 
prevents any horizon from forming. Is it true? 

For example, could there be a mass inflation instability at the outer horizon? If there were a White
Hole on the other side of the outer horizon, then indeed an object entering the outer horizon would encounter 
an inflationary instability. But in real astronomical black holes formed from the collapse of matter, there is 
no White Hole, and no inflationary instability at the outer horizon. 

Some have argued that quantum field theory may somehow blow up at the horizon. Invariably these 
arguments confuse the true (event) horizon with the illusory horizon, §7.27. General relativity is unambiguous 
about what happens at horizons. At least in the macroscopic black holes that exist in our Universe, free-fall 
frames at the horizon of a black hole are locally inertial, and quantum field theory should remain well-behaved
there. 

Others have argued that it takes an infinite time for an infalling observer to reach the horizon, and the 
black hole evaporates before the observer reaches the horizon, so in effect no horizon ever forms. Again this 
is incorrect. The reason an outsider sees an infaller take an infinite time to reach the horizon is a light-travel- 
time effect: light emitted at the horizon remains at the horizon for ever, so it takes an infinite time for light 
to lift off the horizon, §7.27. In their own frame, an infaller falls through the horizon and reaches the singular 
surface in a finite proper time. If a second infaller falls in some time after the first infaller, the second infaller 
does not catch up with the first infaller at the horizon. Rather the second infaller sees the first infaller frozen 
on the illusory horizon still ahead, still dimming and redshifting away. 


22 


Ideal rotating black holes 


Among the remarkable mathematical properties of the Kerr-Newman line-element is the fact that, as first 
shown by Carter (1968), the equations of motion of test particles, massive or massless, neutral or charged, 
are Hamilton-Jacobi separable. The trajectories of test particles are thus described by a complete set of 
four integrals of motion. Line-elements with this property are called separable. The physically interesting 
separable spacetimes are Λ-Kerr-Newman black holes, which are ideal charged rotating black holes in a
background with a cosmological constant Λ.

The proposition of separability imposes certain conditions on the line-element, §22.3, that would be difficult 
to guess a priori. In this Chapter, the Kerr solution and its electrovac cousins are derived by separating 
systematically the Einstein and Maxwell equations. Although conceptually simple, separating the Einstein 
and Maxwell equations is laborious. 

Mathematically, the properties of the Kerr-Newman geometry can be traced to symmetries expressed by 
the existence of two Killing vectors, associated with stationarity and axisymmetry, and a Killing tensor, 
associated with separability, §23.3. It is extraordinary that so simple a set of propositions should lead to so 
intricate a web of implications. 

There are other ingenious mathematical ways to arrive at the Kerr solution (Stephani et al., 2003). I like 
the separable approach not only because of its conceptual simplicity, but also because a generalization of 
separability to conformal separability yields solutions for rotating black holes that undergo inflation at their 
inner horizon, Chapter 24, as astronomically realistic black holes must. 


22.1 Separable geometries 


22.1.1 Separable line-element 


The Kerr geometry is stationary, axisymmetric, and separable. Choose coordinates $x^\mu = \{t, x, y, \phi\}$ in which 
$t$ is the time with respect to which the spacetime is stationary, $\phi$ is the azimuthal angle with respect to which 
the spacetime is axisymmetric, and $x$ and $y$ are radial and angular coordinates. In §22.3 it is shown that the 
line-element may be taken to be 

$$ ds^2 = \rho^2 \left[ - \frac{\Delta_x}{(1 - \omega_x \omega_y)^2} \left( dt - \omega_y \, d\phi \right)^2 + \frac{dx^2}{\Delta_x} + \frac{dy^2}{\Delta_y} + \frac{\Delta_y}{(1 - \omega_x \omega_y)^2} \left( d\phi - \omega_x \, dt \right)^2 \right] , \tag{22.1} $$


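where the conditions of stationarity, axisymmetry, and separability imply that the conformal factor $\rho$ is 
separable, 

$$ \rho = \sqrt{\rho_x^2 + \rho_y^2} , \tag{22.2} $$

and that 

$$ \rho_x , \ \omega_x , \ \Delta_x \ \text{are functions of } x \text{ only} , \qquad \rho_y , \ \omega_y , \ \Delta_y \ \text{are functions of } y \text{ only} . \tag{22.3} $$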


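Thanks to the invariant character of the coordinates $t$ and $\phi$, the metric coefficients $g_{tt}$, $g_{t\phi}$, and $g_{\phi\phi}$ all 
have a gauge-invariant significance, 

$$ g_{tt} = \frac{\rho^2}{(1 - \omega_x \omega_y)^2} \left( - \Delta_x + \omega_x^2 \Delta_y \right) , \tag{22.4a} $$
$$ g_{t\phi} = \frac{\rho^2}{(1 - \omega_x \omega_y)^2} \left( \omega_y \Delta_x - \omega_x \Delta_y \right) , \tag{22.4b} $$
$$ g_{\phi\phi} = \frac{\rho^2}{(1 - \omega_x \omega_y)^2} \left( - \omega_y^2 \Delta_x + \Delta_y \right) . \tag{22.4c} $$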


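The condition $g_{tt} = 0$ defines the boundary of ergospheres, $g_{t\phi} = 0$ defines the turnaround radius, and 
$g_{\phi\phi} = 0$ defines the boundary of the sisytube. The determinant of the $2 \times 2$ submatrix of $t$-$\phi$ coefficients is 

$$ g_{tt} \, g_{\phi\phi} - g_{t\phi}^2 = - \frac{\rho^4}{(1 - \omega_x \omega_y)^2} \, \Delta_x \Delta_y . \tag{22.5} $$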
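The determinant identity (22.5) follows algebraically from the coefficients (22.4). The following is a minimal numerical sketch of that check, not part of the original derivation; the function names and the random sampling ranges are illustrative assumptions.

```python
import random

def metric_tphi(rho, wx, wy, dx, dy):
    """Return (g_tt, g_tphi, g_phiphi) as given by equations (22.4)."""
    pref = rho**2 / (1.0 - wx * wy)**2
    g_tt = pref * (-dx + wx**2 * dy)
    g_tp = pref * (wy * dx - wx * dy)
    g_pp = pref * (-wy**2 * dx + dy)
    return g_tt, g_tp, g_pp

def determinant_rhs(rho, wx, wy, dx, dy):
    """Right hand side of equation (22.5)."""
    return -rho**4 * dx * dy / (1.0 - wx * wy)**2

# Sample arbitrary (hypothetical) values of rho, omega_x, omega_y, Delta_x, Delta_y.
random.seed(0)
max_err = 0.0
for _ in range(1000):
    rho = random.uniform(0.5, 2.0)
    wx = random.uniform(-0.5, 0.5)
    wy = random.uniform(-0.5, 0.5)
    dx = random.uniform(-1.0, 1.0)   # Delta_x may have either sign
    dy = random.uniform(0.1, 1.0)    # Delta_y taken positive
    g_tt, g_tp, g_pp = metric_tphi(rho, wx, wy, dx, dy)
    lhs = g_tt * g_pp - g_tp**2
    max_err = max(max_err, abs(lhs - determinant_rhs(rho, wx, wy, dx, dy)))

print("largest deviation from (22.5):", max_err)
```

Because the determinant is proportional to the product of the horizon and polar functions, the $t$-$\phi$ submatrix degenerates exactly at horizons and at the poles.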


The quantity $\Delta_x$ is the horizon function. Horizons occur where the horizon function $\Delta_x$ vanishes. 
The quantity $\Delta_y$ is the polar function, whose vanishing defines not a horizon, but rather the location of the 
(north and south) poles of the geometry. As shown in §23.4, whereas trajectories can pass through a horizon 
into a region where $\Delta_x$ has opposite sign, trajectories cannot pass through $\Delta_y = 0$ into a region where $\Delta_y$ 
has opposite sign. Without loss of generality, the polar function $\Delta_y$ can be taken to be positive, since the 
line-element (22.1) with both $\Delta_x$ and $\Delta_y$ flipped in sign describes the same geometry with flipped signature. 


22.1.2 Λ-Kerr-Newman 


As shown in §22.6, the Λ-Kerr-Newman line-element is obtained by imposing boundary conditions that, at 
least for vanishing cosmological constant, are asymptotically flat far from the black hole, and are non-singular 
at the north and south poles, $\theta = 0$ and $\pi$. For Λ-Kerr-Newman, the radial and angular parts $\rho_x$ and $\rho_y$ of 
the separable conformal factor are 

$$ \rho_x = r = a \cot(a x) , \qquad \rho_y = a \cos\theta = - a y , \tag{22.6} $$
where $r$ is the ellipsoidal radial coordinate and $\theta$ the polar angle, as conventionally defined, and $a$ is the 
spin parameter of the black hole. Why use coordinates $x$ and $y$ in place of $r$ and $\theta$? Because the coordinate 
derivatives that arise when separating the Einstein and Maxwell equations, §22.4 and §22.5, are simplest 
when expressed with respect to $x$ and $y$. The derivative of $x$ is related to that of $r$ by 

$$ \frac{\partial}{\partial x} = - R^2 \frac{\partial}{\partial r} , \qquad R \equiv \sqrt{r^2 + a^2} = \frac{a}{\sin(a x)} . \tag{22.7} $$
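The relation (22.7) between the $x$ and $r$ derivatives follows from $r = a \cot(ax)$ in equation (22.6). A minimal finite-difference sketch of that statement is given below; the numerical values of $a$, $x$ and the step $h$ are arbitrary illustrative choices.

```python
import math

def r_of_x(x, a):
    """Ellipsoidal radius r = a cot(a x), equation (22.6)."""
    return a / math.tan(a * x)

def R_of_x(x, a):
    """R = a / sin(a x), equation (22.7)."""
    return a / math.sin(a * x)

a, x, h = 0.7, 0.9, 1e-6          # illustrative values only
r = r_of_x(x, a)
R = R_of_x(x, a)

# Central difference approximation to dr/dx.
drdx = (r_of_x(x + h, a) - r_of_x(x - h, a)) / (2.0 * h)

print("dr/dx =", drdx, " -R^2 =", -R**2)
print("R^2 - (r^2 + a^2) =", R**2 - (r**2 + a**2))
```

The negative sign of $dr/dx$ is the reason the radius $r$ increases as $x$ decreases.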


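For Λ-Kerr-Newman, the coefficients $\omega_x$ and $\omega_y$ in the line-element (22.1) are 

$$ \omega_x = \frac{a}{R^2} , \qquad \omega_y = a \sin^2\theta , \tag{22.8} $$

and the horizon and polar functions $\Delta_x$ and $\Delta_y$ are (the horizon function $\Delta_x$ here is related to the earlier 
horizon function $\Delta$, equation (9.3), by $\Delta_x = R^{-2} \Delta$) 

$$ \Delta_x = \frac{1}{R^2} \left( 1 - \frac{2 M_\bullet r}{R^2} + \frac{Q_\bullet^2 + \mathcal{Q}_\bullet^2}{R^2} - \frac{\Lambda r^2}{3} \right) , \tag{22.9a} $$
$$ \Delta_y = \sin^2\theta \left( 1 + \frac{\Lambda a^2 \cos^2\theta}{3} \right) , \tag{22.9b} $$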


where $M_\bullet$ is the black hole’s mass, $Q_\bullet$ and $\mathcal{Q}_\bullet$ are its electric and magnetic charge, and $\Lambda$ is the cosmological 
constant. By themselves, Maxwell’s equations preclude magnetic charge, in which case $\mathcal{Q}_\bullet = 0$. However, any 
grand unified theory large enough to predict the quantization of charge (as observed) necessarily contains 
magnetic charges (magnetic monopoles) as topological defects. In any case, magnetic charge is retained here 
to bring out the symmetry between electric and magnetic charge. The electromagnetic field is purely radial. 
The covariant tetrad-frame electromagnetic potential $A_k$ is 


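$$ A_k = \frac{1}{\rho} \left\{ - \frac{Q_\bullet r}{R^2 \sqrt{\Delta_x}} , \ 0 , \ 0 , \ - \frac{\mathcal{Q}_\bullet \cos\theta}{\sqrt{\Delta_y}} \right\} , \tag{22.10} $$

and the radial electric and magnetic fields $E$ and $B$ are given by 

$$ E + I B = F_{10} + I F_{23} = \frac{Q_\bullet + I \mathcal{Q}_\bullet}{(\rho_x - I \rho_y)^2} , \tag{22.11} $$

where $I$ is the pseudoscalar of the spacetime algebra, satisfying $I^2 = -1$. The Weyl tensor (12.27) has only 
a (complex) spin 0 component, and is 

$$ C = - \frac{1}{(\rho_x - I \rho_y)^3} \left( M_\bullet - \frac{Q_\bullet^2 + \mathcal{Q}_\bullet^2}{\rho_x + I \rho_y} \right) . \tag{22.12} $$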


The spacetime is singular at 

$$ \rho_x = \rho_y = 0 , \tag{22.13} $$

which is a ring at $r = 0$ and $\theta = \pi/2$. 


22.2 Horizons 


Horizons occur where the horizon function $\Delta_x$ vanishes, 

$$ \Delta_x = 0 . \tag{22.14} $$


For Kerr-Newman with vanishing cosmological constant, there are outer and inner horizons $r_\pm$ at 

$$ r_\pm = M_\bullet \pm \sqrt{M_\bullet^2 - a^2 - Q_\bullet^2 - \mathcal{Q}_\bullet^2} . \tag{22.15} $$
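As a quick consistency check, the horizon radii (22.15) can be substituted back into the horizon function (22.9a) with $\Lambda = 0$. The sketch below does this numerically; the function names and the sample parameters (in units of the black hole mass) are illustrative assumptions.

```python
import math

def horizon_function(r, M, a, Qe, Qm, Lam=0.0):
    """Horizon function Delta_x of equation (22.9a)."""
    R2 = r * r + a * a
    return (1.0 - 2.0 * M * r / R2 + (Qe**2 + Qm**2) / R2 - Lam * r * r / 3.0) / R2

def kerr_newman_horizons(M, a, Qe=0.0, Qm=0.0):
    """Outer and inner horizons (r_plus, r_minus) of equation (22.15).

    Returns None when the square root is imaginary, in which case
    there is no horizon.
    """
    disc = M * M - a * a - Qe * Qe - Qm * Qm
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return M + root, M - root

# Sample parameters (hypothetical): M = 1, a = 0.6, Qe = 0.3, Qm = 0.2.
r_plus, r_minus = kerr_newman_horizons(1.0, 0.6, 0.3, 0.2)
print("r+ =", r_plus, " Delta_x(r+) =", horizon_function(r_plus, 1.0, 0.6, 0.3, 0.2))
print("r- =", r_minus, " Delta_x(r-) =", horizon_function(r_minus, 1.0, 0.6, 0.3, 0.2))
```

The two horizons merge when $M_\bullet^2 = a^2 + Q_\bullet^2 + \mathcal{Q}_\bullet^2$, and disappear when the right hand side exceeds the left.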


If there is a non-zero cosmological constant $\Lambda$, then the horizon condition (22.14) is a quartic in $r$, and there 
may be as many as 4 horizons. If there is a small positive cosmological constant, then in addition to the usual 
outer and inner black hole horizons, there are cosmological horizons at large positive and negative radii. If 
the cosmological constant is larger and positive, then there are cosmological horizons with no black hole. If 
the cosmological constant is zero or negative, then there are no cosmological horizons. If the cosmological 
constant is sufficiently negative, then there is no black hole. 
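The counting of horizons can be illustrated numerically. Multiplying the horizon function (22.9a) by $R^4$ gives a quartic polynomial in $r$ whose real roots are the horizons. The sketch below counts sign changes of that quartic on a fine grid; the grid parameters and sample values are illustrative assumptions, and a grid scan is a simple stand-in for a proper root finder.

```python
def horizon_quartic(r, M, a, Q2, Lam):
    """R^4 Delta_x from equation (22.9a); Q2 is the sum of squared charges."""
    R2 = r * r + a * a
    return R2 * (1.0 - Lam * r * r / 3.0) - 2.0 * M * r + Q2

def count_horizons(M, a, Q2, Lam, r_max=100.0, n=40001):
    """Count real roots of the quartic by scanning for sign changes
    on a uniform grid over [-r_max, r_max] (negative r included)."""
    count = 0
    prev = horizon_quartic(-r_max, M, a, Q2, Lam)
    for i in range(1, n):
        r = -r_max + 2.0 * r_max * i / (n - 1)
        cur = horizon_quartic(r, M, a, Q2, Lam)
        if prev * cur < 0.0:
            count += 1
        prev = cur
    return count

# Sample black hole (hypothetical): M = 1, a = 0.5, uncharged.
n_small_positive = count_horizons(1.0, 0.5, 0.0, 0.01)
n_zero = count_horizons(1.0, 0.5, 0.0, 0.0)
n_negative = count_horizons(1.0, 0.5, 0.0, -0.01)
print(n_small_positive, n_zero, n_negative)
```

For the small positive value of $\Lambda$ the scan finds the inner and outer black hole horizons together with cosmological horizons at large positive and negative radii, while for zero or negative $\Lambda$ only the two black hole horizons remain.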


22.3 Conditions from Hamilton-Jacobi separability 


This section derives the form (22.1) of the separable line-element from the condition of the separability of the 
Hamilton-Jacobi equation, coupled with the assumptions of stationarity and axisymmetry. The Hamilton- 
Jacobi equation is solved in Chapter 23 to obtain the trajectories of neutral or charged particles in rotating 
charged black holes. 

With respect to an orthonormal tetrad, the Hamilton-Jacobi equation for a test particle of mass $m$ and electric 
charge $q$ moving in a spacetime with vierbein $e_m{}^\mu$ and electromagnetic potential $A_m$ is, equation (4.110) 
or (4.111), 

$$ \eta^{mn} \left( e_m{}^\mu \frac{\partial S}{\partial x^\mu} - q A_m \right) \left( e_n{}^\nu \frac{\partial S}{\partial x^\nu} - q A_n \right) = - m^2 . \tag{22.16} $$


The Hamilton-Jacobi equation (22.16) is a partial differential equation in the particle action $S$, equation (4.36). 
Let $\hat{e}_m{}^\mu$ and $\hat{A}_m$ denote the inverse vierbein coefficients and tetrad-frame electromagnetic potential 
with an overall conformal factor $\rho$ factored out: 

$$ \hat{e}_m{}^\mu \equiv \rho \, e_m{}^\mu , \qquad \hat{A}_m \equiv \rho A_m . \tag{22.17} $$

With respect to the scaled inverse vierbein $\hat{e}_m{}^\mu$ and electromagnetic potential $\hat{A}_m$, the Hamilton-Jacobi 
equation (22.16) can be rewritten 

$$ \eta^{mn} \left( \hat{e}_m{}^\mu \frac{\partial S}{\partial x^\mu} - q \hat{A}_m \right) \left( \hat{e}_n{}^\nu \frac{\partial S}{\partial x^\nu} - q \hat{A}_n \right) = - m^2 \rho^2 . \tag{22.18} $$

To separate the Hamilton-Jacobi equation (22.18), one demands that the left and right hand sides of the 


equation be sums of terms each of which depends only on a single coordinate. The “simplest possible way” 
(Carter, 1968b) to separate the left hand side of the Hamilton-Jacobi equation (22.18) is to impose that each 
of the individual factors, comprising the scaled inverse vierbein coefficients $\hat{e}_m{}^\mu$, the derivatives $\partial S / \partial x^\mu$ of 
the action, and the scaled potentials $\hat{A}_m$, is a function of a single coordinate, and that products of factors 
are non-vanishing only when all factors are functions of the same coordinate. The derivatives $\partial S / \partial x^\mu$ of the 
action are each functions of a single coordinate provided that the action $S$ is itself a sum of terms $S_\mu$, each 
depending on a single coordinate $x^\mu$, 

$$ S = \sum_\mu S_\mu(x^\mu) . \tag{22.19} $$


Canonical momenta are equal to derivatives of the action, equation (4.105), so the condition (22.19) imposes 
that each canonical momentum $\pi_\mu$ be a function only of the corresponding coordinate $x^\mu$, 

$$ \pi_\mu = \frac{\partial S_\mu}{\partial x^\mu} = \text{function of } x^\mu . \tag{22.20} $$

A special case of the condition (22.20) occurs when a canonical momentum $\pi_\mu$ is a constant, which occurs 
when the metric is independent of the coordinate $x^\mu$, equation (4.50). In the case of the Kerr geometry and its 
cousins, the spacetime is stationary and axisymmetric. Stationarity means that the geometry is invariant with 
respect to some time coordinate $t$, while axisymmetry means that the geometry is invariant with respect 
to some azimuthal angular coordinate $\phi$. The corresponding canonical momenta $\pi_t$ and $\pi_\phi$ are constants 
of motion, defining respectively the constant energy $E$ and the azimuthal angular momentum $L$ of the 
trajectory, 

$$ \pi_t = \frac{\partial S}{\partial t} = - E , \qquad \pi_\phi = \frac{\partial S}{\partial \phi} = L . \tag{22.21} $$

If the two remaining coordinates are denoted $x$ and $y$, then the particle action $S$, equation (22.19), is the 
sum 

$$ S = - E t + L \phi + S_x(x) + S_y(y) , \tag{22.22} $$

where $S_x(x) = \int \pi_x \, dx$ and $S_y(y) = \int \pi_y \, dy$, equation (22.20), are respectively functions only of $x$ and $y$. 

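Given that $\pi_t$ and $\pi_\phi$ are constants, while $\pi_x$ and $\pi_y$ are respectively functions of $x$ and $y$, the left hand 
side of the Hamilton-Jacobi equation (22.18) separates as a sum of terms each of which depends only on $x$ 
or only on $y$ provided that 

$$ \text{for each } m , \quad \left\{ \begin{array}{l} \text{either } \hat{e}_m{}^\mu \text{ for all } \mu , \text{ and } \hat{A}_m , \text{ are functions of } x \text{ only, and } \hat{e}_m{}^y = 0 , \\ \text{or } \hat{e}_m{}^\mu \text{ for all } \mu , \text{ and } \hat{A}_m , \text{ are functions of } y \text{ only, and } \hat{e}_m{}^x = 0 . \end{array} \right. \tag{22.23} $$

The case that matches the Kerr and related geometries is the 2+2 choice 

$$ \text{the } \left\{ \begin{array}{c} \text{first} \\ \text{second} \end{array} \right\} \text{ condition of (22.23) holds for } \left\{ \begin{array}{c} m = 0, 1 \\ m = 2, 3 \end{array} \right\} . \tag{22.24} $$

Thus separability consistent with Kerr requires that 

$$ \hat{e}_0{}^y = \hat{e}_1{}^y = \hat{e}_2{}^x = \hat{e}_3{}^x = 0 . \tag{22.25} $$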


Given the separability conditions (22.25), the inverse vierbein coefficients $\hat{e}_0{}^x$ and $\hat{e}_3{}^y$ can be transformed 
to zero by a tetrad gauge transformation consisting of a Lorentz boost by velocity $\hat{e}_0{}^x / \hat{e}_1{}^x$ between tetrad 
axes $\gamma_0$ and $\gamma_1$, and a (commuting) spatial rotation by angle $\operatorname{atan}(\hat{e}_3{}^y / \hat{e}_2{}^y)$ between tetrad axes $\gamma_2$ and $\gamma_3$. 
Thus without loss of generality, 

$$ \hat{e}_0{}^x = \hat{e}_3{}^y = 0 . \tag{22.26} $$

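The gauge conditions (22.26) having been effected, the inverse vierbein coefficients $\hat{e}_1{}^t$, $\hat{e}_1{}^\phi$, $\hat{e}_2{}^t$, and $\hat{e}_2{}^\phi$ can 
be eliminated by coordinate gauge transformations $t \rightarrow t'$ and $\phi \rightarrow \phi'$ defined by 

$$ dt = dt' + \frac{\hat{e}_1{}^t}{\hat{e}_1{}^x} \, dx + \frac{\hat{e}_2{}^t}{\hat{e}_2{}^y} \, dy , \qquad d\phi = d\phi' + \frac{\hat{e}_1{}^\phi}{\hat{e}_1{}^x} \, dx + \frac{\hat{e}_2{}^\phi}{\hat{e}_2{}^y} \, dy . \tag{22.27} $$

Equations (22.27) are integrable because $\hat{e}_1{}^\mu$ and $\hat{e}_2{}^\mu$ are respectively functions of $x$ and $y$ only. The 
transformations (22.27) of $t$ and $\phi$ are admissible because they preserve the Killing vectors $\partial / \partial t$ and $\partial / \partial \phi$, 

$$ \left. \frac{\partial}{\partial t} \right|_{x, y, \phi} = \left. \frac{\partial}{\partial t'} \right|_{x, y, \phi'} , \qquad \left. \frac{\partial}{\partial \phi} \right|_{t, x, y} = \left. \frac{\partial}{\partial \phi'} \right|_{t', x, y} . \tag{22.28} $$

Thus without loss of generality 

$$ \hat{e}_1{}^t = \hat{e}_1{}^\phi = \hat{e}_2{}^t = \hat{e}_2{}^\phi = 0 . \tag{22.29} $$
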
Finally, coordinate transformations of the $x$ and $y$ coordinates 

$$ x \rightarrow x' , \qquad y \rightarrow y' , \tag{22.30} $$

can be chosen such that $\hat{e}_1{}^x$ is any function of $x$, and $\hat{e}_2{}^y$ is any function of $y$. A choice that proves 
advantageous in separating the Einstein and Maxwell equations is 

$$ \hat{e}_0{}^t \, \hat{e}_1{}^x = \hat{e}_2{}^y \, \hat{e}_3{}^\phi = \pm 1 . \tag{22.31} $$


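The separability conditions (22.23) with the 2+2 choice (22.24), which imply conditions (22.25), coupled 
with the gauge conditions (22.26), (22.29), and (22.31), bring the inverse vierbein $e_m{}^\mu$ to the form 

$$ e_m{}^\mu = \frac{1}{\rho} \left( \begin{array}{cccc} \dfrac{1}{\sqrt{\Delta_x}} & 0 & 0 & \dfrac{\omega_x}{\sqrt{\Delta_x}} \\ 0 & - \sqrt{\Delta_x} & 0 & 0 \\ 0 & 0 & \sqrt{\Delta_y} & 0 \\ \dfrac{\omega_y}{\sqrt{\Delta_y}} & 0 & 0 & \dfrac{1}{\sqrt{\Delta_y}} \end{array} \right) , \tag{22.32} $$

where $\omega_x$ and $\Delta_x$ are some functions of $x$, and $\omega_y$ and $\Delta_y$ are some functions of $y$. The minus sign in $e_1{}^x$ 
is chosen so that, for Λ-Kerr-Newman, the radial tetrad basis vector $\gamma_1$ points outward, the direction of 
increasing radius $r$ but decreasing $x$. The corresponding vierbein $e^m{}_\mu$ is 

$$ e^m{}_\mu = \rho \left( \begin{array}{cccc} \dfrac{\sqrt{\Delta_x}}{1 - \omega_x \omega_y} & 0 & 0 & - \dfrac{\omega_y \sqrt{\Delta_x}}{1 - \omega_x \omega_y} \\ 0 & - \dfrac{1}{\sqrt{\Delta_x}} & 0 & 0 \\ 0 & 0 & \dfrac{1}{\sqrt{\Delta_y}} & 0 \\ - \dfrac{\omega_x \sqrt{\Delta_y}}{1 - \omega_x \omega_y} & 0 & 0 & \dfrac{\sqrt{\Delta_y}}{1 - \omega_x \omega_y} \end{array} \right) , \tag{22.33} $$

which implies the line-element (22.1). 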
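The statement that the vierbein (22.33) is the inverse of (22.32) and reproduces the line-element (22.1) can be checked numerically. The following is a minimal sketch under stated assumptions: rows are tetrad indices $m = 0..3$, columns are coordinate indices $\{t, x, y, \phi\}$, the horizon function is taken positive (outside the horizon), and all function names are illustrative.

```python
import math
import random

ETA = [-1.0, 1.0, 1.0, 1.0]  # Minkowski metric eta_mn, signature -+++

def inverse_vierbein(rho, wx, wy, dx, dy):
    """e_m^mu of equation (22.32)."""
    sx, sy = math.sqrt(dx), math.sqrt(dy)
    return [
        [1.0 / (rho * sx), 0.0, 0.0, wx / (rho * sx)],
        [0.0, -sx / rho, 0.0, 0.0],
        [0.0, 0.0, sy / rho, 0.0],
        [wy / (rho * sy), 0.0, 0.0, 1.0 / (rho * sy)],
    ]

def vierbein(rho, wx, wy, dx, dy):
    """e^m_mu of equation (22.33)."""
    sx, sy = math.sqrt(dx), math.sqrt(dy)
    w = 1.0 - wx * wy
    return [
        [rho * sx / w, 0.0, 0.0, -rho * wy * sx / w],
        [0.0, -rho / sx, 0.0, 0.0],
        [0.0, 0.0, rho / sy, 0.0],
        [-rho * wx * sy / w, 0.0, 0.0, rho * sy / w],
    ]

def expected_metric(rho, wx, wy, dx, dy):
    """Metric coefficients read off from the line-element (22.1)."""
    pref = rho**2 / (1.0 - wx * wy)**2
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = pref * (-dx + wx**2 * dy)
    g[0][3] = g[3][0] = pref * (wy * dx - wx * dy)
    g[3][3] = pref * (-wy**2 * dx + dy)
    g[1][1] = rho**2 / dx
    g[2][2] = rho**2 / dy
    return g

random.seed(1)
max_inverse_err = 0.0
max_metric_err = 0.0
for _ in range(200):
    args = (random.uniform(0.5, 2.0), random.uniform(-0.5, 0.5),
            random.uniform(-0.5, 0.5), random.uniform(0.1, 1.0),
            random.uniform(0.1, 1.0))
    e_inv, e, g_exp = inverse_vierbein(*args), vierbein(*args), expected_metric(*args)
    for mu in range(4):
        for nu in range(4):
            # e_m^mu e^m_nu should equal the Kronecker delta.
            delta = sum(e_inv[m][mu] * e[m][nu] for m in range(4))
            max_inverse_err = max(max_inverse_err, abs(delta - (1.0 if mu == nu else 0.0)))
            # g_mu_nu = eta_mn e^m_mu e^n_nu.
            g = sum(ETA[m] * e[m][mu] * e[m][nu] for m in range(4))
            max_metric_err = max(max_metric_err, abs(g - g_exp[mu][nu]))

print(max_inverse_err, max_metric_err)
```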

The above form (22.32) of the inverse vierbein was derived from the condition that the left hand side of 
the Hamilton-Jacobi equation (22.18) be separable. Given stationarity and axisymmetry, the left hand side 
is a sum of two terms, one depending on the radial coordinate x, the other on the angular coordinate y. If 
the mass $m$ is non-zero, then the squared conformal factor $\rho^2$ on the right hand side of the Hamilton-Jacobi 
equation (22.18) must also separate as a sum of terms depending on x and y. This is the condition (22.2). 

If Hamilton-Jacobi separability is demanded only for massless particles, m = 0, then a more general class 
of conformally separable solutions can be found, which are explored in Chapter 24. 


Exercise 22.1. Explore other separable solutions. The above derivation of the form of the line-element 
assumed not only separability, but also stationarity and axisymmetry, and the 2+2 choice (22.24) that 
matches Kerr. Explore other possible choices (Carter, 1968b). 


Exercise 22.2. Explore separable solutions in an arbitrary number N of spacetime dimensions. 


22.4 Electrovac solutions from separation of Einstein’s equations 


As shown in §22.3, the assumptions of stationarity, axisymmetry, and separability, coupled with some 
other auxiliary assumptions (separability “in the simplest possible way” (Carter, 1968b), and the 2+2 
choice (22.24)), impose the form (22.1) of the line-element and the conditions (22.2) and (22.3). Given 
this form of the line-element, the Kerr solution and its electrovac cousins can be derived by separating the 
Einstein equations systematically. 


22.4.1 Electrovac energy-momenta 


The energy-momentum of a static radial electromagnetic field is 

$$ 8 \pi T^{e}_{mn} = \frac{Q_\bullet^2 + \mathcal{Q}_\bullet^2}{\rho^4} \, \operatorname{diag}(1, -1, 1, 1) . \tag{22.34} $$

The energy-momentum of a cosmological constant $\Lambda$ is 

$$ 8 \pi T^{\Lambda}_{mn} = - \Lambda \eta_{mn} . \tag{22.35} $$


22.4.2 Separation of 8 Einstein equations with zero source 


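Given the form (22.1) of the line-element and the conditions (22.2) and (22.3), 4 of the 10 tetrad-frame 
Einstein components $G_{mn}$ vanish identically: 

$$ G_{01} = G_{02} = G_{13} = G_{23} = 0 . \tag{22.36} $$

Of the remaining 6 Einstein components $G_{mn}$, the following 4 have zero electrovac source: 

$$ \rho^2 G_{12} = - 2 \sqrt{\Delta_x \Delta_y} \left[ \rho \, \frac{\partial^2 (1/\rho)}{\partial x \, \partial y} - \frac{3}{4 (1 - \omega_x \omega_y)^2} \frac{d\omega_x}{dx} \frac{d\omega_y}{dy} \right] , \tag{22.37a} $$
$$ \rho^2 G_{03} = \frac{\sqrt{\Delta_x \Delta_y}}{2 \rho^2} \left[ \frac{\partial}{\partial x} \left( \frac{\rho^2}{1 - \omega_x \omega_y} \frac{d\omega_x}{dx} \right) - \frac{\partial}{\partial y} \left( \frac{\rho^2}{1 - \omega_x \omega_y} \frac{d\omega_y}{dy} \right) \right] , \tag{22.37b} $$
$$ \rho^2 \left( G_{00} + G_{11} \right) = \frac{2 \rho \Delta_x}{1 - \omega_x \omega_y} \left[ \frac{\partial}{\partial x} \left( (1 - \omega_x \omega_y) \frac{\partial (1/\rho)}{\partial x} \right) + \frac{1}{4 \rho (1 - \omega_x \omega_y)} \left( \frac{d\omega_y}{dy} \right)^2 \right] , \tag{22.37c} $$
$$ \rho^2 \left( G_{22} - G_{33} \right) = \frac{2 \rho \Delta_y}{1 - \omega_x \omega_y} \left[ \frac{\partial}{\partial y} \left( (1 - \omega_x \omega_y) \frac{\partial (1/\rho)}{\partial y} \right) + \frac{1}{4 \rho (1 - \omega_x \omega_y)} \left( \frac{d\omega_x}{dx} \right)^2 \right] . \tag{22.37d} $$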


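If the conformal factor $\rho^2$ is supposed to separate as a sum of radial and angular parts, equation (22.2), then 
the homogeneous version of equation (22.37a) reduces to 

$$ \frac{(d\rho_x^2 / dx)(d\rho_y^2 / dy)}{(\rho_x^2 + \rho_y^2)^2} - \frac{(d\omega_x / dx)(d\omega_y / dy)}{(1 - \omega_x \omega_y)^2} = 0 . \tag{22.38} $$

Series expansion of (22.38) leads to the result that 

$$ \rho^2 = \rho_x^2 + \rho_y^2 = \frac{1 - \omega_x \omega_y}{(f_0 + f_1 \omega_x)(f_1 + f_0 \omega_y)} , \tag{22.39a} $$
$$ \rho_x = \sqrt{\frac{g_0 - g_1 \omega_x}{(f_0 g_1 + f_1 g_0)(f_0 + f_1 \omega_x)}} , \qquad \rho_y = \sqrt{\frac{g_1 - g_0 \omega_y}{(f_0 g_1 + f_1 g_0)(f_1 + f_0 \omega_y)}} , \tag{22.39b} $$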
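That the radial and angular parts (22.39b) add up to the conformal factor (22.39a) is a short algebraic identity. The sketch below checks it numerically for randomly chosen constants; the sampling ranges are arbitrary illustrative choices that keep all denominators positive.

```python
import random

def rho_x_squared(wx, f0, f1, g0, g1):
    """Square of the radial part rho_x from equation (22.39b)."""
    return (g0 - g1 * wx) / ((f0 * g1 + f1 * g0) * (f0 + f1 * wx))

def rho_y_squared(wy, f0, f1, g0, g1):
    """Square of the angular part rho_y from equation (22.39b)."""
    return (g1 - g0 * wy) / ((f0 * g1 + f1 * g0) * (f1 + f0 * wy))

def rho_squared(wx, wy, f0, f1):
    """Conformal factor squared from equation (22.39a)."""
    return (1.0 - wx * wy) / ((f0 + f1 * wx) * (f1 + f0 * wy))

random.seed(2)
max_err = 0.0
for _ in range(1000):
    f0, f1, g0, g1 = (random.uniform(0.5, 2.0) for _ in range(4))
    wx, wy = random.uniform(0.0, 0.4), random.uniform(0.0, 0.4)
    total = rho_x_squared(wx, f0, f1, g0, g1) + rho_y_squared(wy, f0, f1, g0, g1)
    max_err = max(max_err, abs(total - rho_squared(wx, wy, f0, f1)))

print("largest deviation:", max_err)
```

Note that the constants $g_0$ and $g_1$ drop out of the sum, consistent with the remark that they can be adjusted without affecting $\rho$.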


where $f_0$, $f_1$, $g_0$, and $g_1$ are constants. At this point the constants $g_0$ and $g_1$ can be adjusted arbitrarily without 
affecting $\rho$: the overall normalization of $g_0$ and $g_1$ is cancelled by the normalizing factor of $1/\sqrt{f_0 g_1 + f_1 g_0}$ in 
$\rho_x$ and $\rho_y$, and the relative sizes of $g_0$ and $g_1$ can be changed by adjusting an arbitrary constant in the split 
between $\rho_x^2$ and $\rho_y^2$. Given the expression (22.39) for the conformal factor $\rho$, the Einstein component $G_{03}$, 
equation (22.37b), reduces to 

$$ \rho^2 G_{03} = \frac{\sqrt{\Delta_x \Delta_y}}{2 (1 - \omega_x \omega_y)} \left[ \frac{d\omega_x}{dx} \frac{d}{dx} \ln \left( \frac{d\omega_x / dx}{f_0 + f_1 \omega_x} \right) - \frac{d\omega_y}{dy} \frac{d}{dy} \ln \left( \frac{d\omega_y / dy}{f_1 + f_0 \omega_y} \right) \right] . \tag{22.40} $$

Homogeneous solution of this equation can be accomplished by separation of variables, setting each of the 
two terms inside square brackets, the first of which is a function only of $x$, while the second is a function 
only of $y$, to the same separation constant $2 f_2$. The result is 

$$ \frac{d\omega_x}{dx} = 2 \sqrt{ (f_0 + f_1 \omega_x) \left[ g_0 + (f_1 g_0 + f_2) \frac{\omega_x}{f_0} \right] } , \tag{22.41a} $$
$$ \frac{d\omega_y}{dy} = 2 \sqrt{ (f_1 + f_0 \omega_y) \left[ g_1 + (f_0 g_1 + f_2) \frac{\omega_y}{f_1} \right] } , \tag{22.41b} $$

for some constants $g_0$ and $g_1$, which can be taken without loss of generality to equal those in the conformal 
factor (22.39). With the conformal factor $\rho$ given by equation (22.39) and $d\omega_x/dx$ and $d\omega_y/dy$ given by 
equations (22.41), the Einstein components $G_{00} + G_{11}$ and $G_{22} - G_{33}$ reduce to 

$$ \rho^2 \left( G_{00} + G_{11} \right) = 2 \Delta_x \frac{(f_2 + f_0 g_1 + f_1 g_0)(f_1 + f_0 \omega_y)^2}{f_0 f_1 (1 - \omega_x \omega_y)^2} , \tag{22.42a} $$
$$ \rho^2 \left( G_{22} - G_{33} \right) = 2 \Delta_y \frac{(f_2 + f_0 g_1 + f_1 g_0)(f_0 + f_1 \omega_x)^2}{f_0 f_1 (1 - \omega_x \omega_y)^2} . \tag{22.42b} $$

These vanish provided that the constant $f_2$ satisfies 

$$ f_2 = - (f_0 g_1 + f_1 g_0) . \tag{22.43} $$

Inserting this value into equations (22.41) implies 

$$ \frac{d\omega_x}{dx} = 2 \sqrt{(f_0 + f_1 \omega_x)(g_0 - g_1 \omega_x)} , \tag{22.44a} $$
$$ \frac{d\omega_y}{dy} = 2 \sqrt{(f_1 + f_0 \omega_y)(g_1 - g_0 \omega_y)} . \tag{22.44b} $$

The sign of the square root for $d\omega_x/dx$ is the same as that for $\rho_x$, while the sign of the square root for $d\omega_y/dy$ 
is the same as that for $\rho_y$. 


22.4.3 Separation of the remaining 2 Einstein equations 


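Define $Y_x$ and $Y_y$ by 

$$ Y_x \equiv \frac{d\Delta_x}{dx} - \Delta_x \frac{d}{dx} \ln \left[ (f_0 + f_1 \omega_x) \frac{d\omega_x}{dx} \right] , \tag{22.45a} $$
$$ Y_y \equiv \frac{d\Delta_y}{dy} - \Delta_y \frac{d}{dy} \ln \left[ (f_1 + f_0 \omega_y) \frac{d\omega_y}{dy} \right] . \tag{22.45b} $$

In terms of $Y_x$ and $Y_y$, the Einstein components $G_{00} - G_{11}$ and $G_{22} + G_{33}$ are 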


py (Goo = Gi) (22. 46a) 
1 dln wg dlnw d fotfiwe OY, d wy (fit fowy) 
= Ys Y, 2) +Ys—l Y Y= hn] v 
1 — wWyWy dx Y dy ) ae dz ( We Oy a ” dy 5 dwy/dy 
pP? (G22 + G33) (22.46b) 


E 1 y Uwe y dln wy y d In fit fowy : OY, d in wael fot fiwe) 
Ox dx 


= Ya 
1— WyWy dx ” dy ” dy Wy dw, /dx 


Homogeneous solutions of these equations can be found by supposing that $Y_x$ is a function only of the radial 
coordinate $x$, while $Y_y$ is a function only of the angular coordinate $y$, and by separating each of the 
equations as 


=0, (22.47) 


(1 — wrwy) Wy Wy Wy Wy 


1 (ee Attentato) fohoth3we fihithgwy 


for some constants $h_0$, $h_1$, $h_2$, and $h_3$. Separating each of equations (22.46) according to the pattern of 
equation (22.47) leads to the homogeneous solutions 

$$ Y_x = \frac{(f_0 + f_1 \omega_x)(h_0 + h_1 \omega_x)}{d\omega_x / dx} , \qquad Y_y = \frac{(f_1 + f_0 \omega_y)(h_1 + h_0 \omega_y)}{d\omega_y / dy} . \tag{22.48} $$

Solutions including the energy-momentum of a static electromagnetic field fall out with little extra work. 
With appropriate boundary conditions, this is the Kerr-Newman solution. Solutions with $G_{00} = - G_{11} = 
G_{22} = G_{33}$, as is true for a static radial electromagnetic field, are found by taking the difference of 
equations (22.46) and separating that difference in the pattern of equation (22.47). The solution is a sum of a 
homogeneous solution (22.48) and a particular solution 

$$ Y_x = \frac{2 (Q_\bullet^2 + \mathcal{Q}_\bullet^2)(f_0 + f_1 \omega_x)^2}{d\omega_x / dx} , \qquad Y_y = 0 . \tag{22.49} $$


Inserting equations (22.49) into the Einstein expressions (22.46) yields Einstein components that have 
precisely the form (22.34) of the tetrad-frame energy-momentum tensor of a static radial electromagnetic field. 

Similarly, solutions including vacuum energy, which has $- G_{00} = G_{11} = G_{22} = G_{33}$, can be found by 
separating the sum of equations (22.46) in the pattern of equation (22.47). A particular solution is 

$$ Y_x = \frac{2 \Lambda}{f_1^2 \, d\omega_x / dx} , \qquad Y_y = \frac{2 \Lambda \omega_y^2}{f_1^2 \, d\omega_y / dy} . \tag{22.50} $$

Inserting equations (22.50) into the Einstein expressions (22.46) yields Einstein components that have 
precisely the form of a cosmological constant, $G_{mn} = - \Lambda \eta_{mn}$. 

Solving equations (22.45a) and (22.45b) with $Y_x$ and $Y_y$ given by a sum of the homogeneous, 
electromagnetic, and vacuum contributions, equations (22.48), (22.49), and (22.50), yields the general electrovac 
solution for the horizon and polar functions $\Delta_x$ and $\Delta_y$, 

$$ \Delta_x = (f_0 + f_1 \omega_x) \left[ (k_0 + k_1 \omega_x) - \frac{2 M_\bullet \sqrt{(f_0 + f_1 \omega_x)(g_0 - g_1 \omega_x)}}{(f_0 g_1 + f_1 g_0)^{3/2}} + \frac{(Q_\bullet^2 + \mathcal{Q}_\bullet^2)(f_0 + f_1 \omega_x)}{f_0 g_1 + f_1 g_0} \right] - \frac{\Lambda (g_0 - g_1 \omega_x)}{3 f_1 (f_0 g_1 + f_1 g_0)^2} , \tag{22.51a} $$
$$ \Delta_y = (f_1 + f_0 \omega_y) \left[ (k_1 + k_0 \omega_y) + \frac{2 N_\bullet \sqrt{(f_1 + f_0 \omega_y)(g_1 - g_0 \omega_y)}}{(f_0 g_1 + f_1 g_0)^{3/2}} \right] + \frac{\Lambda \omega_y (g_1 - g_0 \omega_y)}{3 f_1 (f_0 g_1 + f_1 g_0)^2} , \tag{22.51b} $$

where $k_0$ and $k_1$ are arbitrarily adjustable constants arising from the freedom of choice in the constants $h_0$ 
and $h_1$ of the homogeneous solution. The constant $M_\bullet$ in the expression (22.51a) for $\Delta_x$ is the black hole’s 
mass. The constant $N_\bullet$ in the expression (22.51b) for $\Delta_y$ is the NUT parameter (Taub, 1951; Newman, 
Tamburino, and Unti, 1963; Stephani et al., 2003; Kagramanova et al., 2010), which is to the mass $M_\bullet$ as 
magnetic charge $\mathcal{Q}_\bullet$ is to electric charge $Q_\bullet$. 
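A useful sanity check on the general solution (22.51) is that it reduces to the Λ-Kerr-Newman functions (22.9) when the constants take the values derived later in §22.6. The sketch below is hedged: it assumes the constants $f_0 = 0$, $f_1 = a^{-1/2}$, $g_0 = a^{3/2}$, $g_1 = a^{5/2}$, $k_0 = a^{-1/2}$, $k_1 = 0$, sets the NUT parameter to zero, takes the positive square root (appropriate for $r > 0$), and uses illustrative function names and sample values.

```python
import math

def delta_x_general(wx, M, Q2, Lam, f0, f1, g0, g1, k0, k1):
    """Horizon function from (22.51a); Q2 is the sum of squared charges."""
    c = f0 * g1 + f1 * g0
    F = f0 + f1 * wx
    L = g0 - g1 * wx
    return (F * ((k0 + k1 * wx) - 2.0 * M * math.sqrt(F * L) / c**1.5 + Q2 * F / c)
            - Lam * L / (3.0 * f1 * c * c))

def delta_y_general(wy, Lam, f0, f1, g0, g1, k0, k1):
    """Polar function from (22.51b) with the NUT parameter set to zero."""
    c = f0 * g1 + f1 * g0
    return (f1 + f0 * wy) * (k1 + k0 * wy) + Lam * wy * (g1 - g0 * wy) / (3.0 * f1 * c * c)

def delta_x_kerr(r, a, M, Q2, Lam):
    """Horizon function (22.9a)."""
    R2 = r * r + a * a
    return (1.0 - 2.0 * M * r / R2 + Q2 / R2 - Lam * r * r / 3.0) / R2

def delta_y_kerr(theta, a, Lam):
    """Polar function (22.9b)."""
    return math.sin(theta)**2 * (1.0 + Lam * a * a * math.cos(theta)**2 / 3.0)

a, M, Q2, Lam = 0.8, 1.0, 0.13, 0.02     # illustrative values
consts = (0.0, a**-0.5, a**1.5, a**2.5, a**-0.5, 0.0)   # f0, f1, g0, g1, k0, k1

max_err = 0.0
for r in (0.3, 1.0, 2.5, 10.0):
    wx = a / (r * r + a * a)                 # omega_x, equation (22.8)
    max_err = max(max_err, abs(delta_x_general(wx, M, Q2, Lam, *consts)
                               - delta_x_kerr(r, a, M, Q2, Lam)))
for theta in (0.2, 0.7, 1.3):
    wy = a * math.sin(theta)**2              # omega_y, equation (22.8)
    max_err = max(max_err, abs(delta_y_general(wy, Lam, *consts)
                               - delta_y_kerr(theta, a, Lam)))

print("largest deviation from (22.9):", max_err)
```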


22.5 Electrovac solutions of Maxwell’s equations 


22.5.1 Solution of Maxwell’s equations 


Write the electromagnetic potential $A_k$ in terms of a scaled electromagnetic potential $\hat{A}_k$, equation (23.2). 
Separability of the Hamilton-Jacobi equations requires, equations (22.23) and (22.24), that 

$$ \hat{A}_t , \ \hat{A}_x \ \text{are functions of } x \text{ only} , \qquad \hat{A}_y , \ \hat{A}_\phi \ \text{are functions of } y \text{ only} . \tag{22.52} $$


For the line-element (22.1), and with the conditions (22.52), the non-vanishing components of the tetrad- 
frame electromagnetic field Fmn are the radial electric E and magnetic B fields 


E=Fyo= 5 (= + a =) l (22.53a) 
B= Fy = 5 (5 ! el Ze) . (22.53b) 

The remaining components of the electromagnetic field vanish identically, 

$$ F_{02} = F_{03} = F_{12} = F_{13} = 0 . \tag{22.54} $$

Since the electromagnetic field $F_{mn}$ does not depend on either $\hat{A}_x$ or $\hat{A}_y$, these components are pure gauge, 
and can be set to zero, 

$$ \hat{A}_x = \hat{A}_y = 0 . \tag{22.55} $$

Since the electromagnetic field given by equations (22.53) and (22.54) is the curl of the potential, the field 
automatically satisfies the source-free Maxwell’s equations. The sourced Maxwell’s equations are 

$$ D^m F_{mn} = 4 \pi j_n . \tag{22.56} $$

Stationary solutions require vanishing current, $j_n = 0$. Two of the sourced Maxwell equations vanish 
identically, and the corresponding currents vanish automatically: 

$$ j_1 = j_2 = 0 . \tag{22.57} $$


Given the expressions (22.44) for $d\omega_x/dx$ and $d\omega_y/dy$, the remaining two sourced Maxwell’s equations can 
be written 


VAx | OZ; o 1 dwy z 1 dwy 
p> | Ox 


VAy E F 


EZ, l 
p? Oy * Oy m G — WrwWy dy 


where $Z_x$ and $Z_y$ are defined to be 


_ dw, ð A: 

A= dx Ox (r) , G 
_ dwy O Ag 

AS (z) . (22.59b) 


The homogeneous solutions of equations (22.58) are 

$$ Z_x = Z_y = 0 . \tag{22.60} $$


Homogeneous solution of equations (22.59) yields 


At Q. 
dw,/dx ~~ —-2(fogi + figo) ` eel 
As Qe 
= .61b 
dwy/dy 2(fogı + figo) ` eel) 


where $Q_\bullet$ and $\mathcal{Q}_\bullet$ are constants of integration, which can be interpreted as respectively the enclosed electric 
charge within radius $x$, and the enclosed magnetic charge above latitude $y$. Inserting the solutions (22.61) 
for $\hat{A}_k$ into the expressions (22.53) yields the electric and magnetic fields (22.11). 


22.5.2 Separation of Maxwell’s equations 


The form (22.58) of the Maxwell equations for $j_0$ and $j_3$ assumed that $d\omega_x/dx$ and $d\omega_y/dy$ satisfy the 
equations (22.44) obtained by separating Einstein’s equations. However, the Maxwell equations can also be 
separated directly, and the conditions (22.66) that result are consistent with the Einstein conditions (22.44). 

If one provisionally supposes that $\hat{A}_\phi = 0$, then the Maxwell equation for the angular current $j_3$ is 


JA 
At y {= ja ig Eaa rs | P dwy fja a in (22) | Ta) 


P -— wrwy)? | dx dx dz dy dy dy 
= 4rj3 . (22.62) 
Conversely, if one provisionally supposes that $\hat{A}_t = 0$, then the Maxwell equation for the radial current $j_0$ is 
VA l l d dl dl 
Ag V Ar dwy a ee n(Ag/wy) | dln wy | Wyr E, d In NW, $ N Wy 
p3(1 — wrwy) dy dy dy dx dx dx dx 
= 4r jo . (22.63) 


The homogeneous solutions of equations (22.62) and (22.63) prove to be the homogeneous solutions of the 
full equations without any restriction on $\hat{A}_t$ or $\hat{A}_\phi$. Equation (22.62) separates with 4 separation constants 
$q_i$ as 


— qo + qaw — 2q1We — Qw? sga Oa q2 + 2qiwy — qow? 
(1 = way) ( qo + 93 z) p 2s s o wsos) ( q2 % r) P y vo, 
Wey Wey, Wy Wy 


(22.64) 
and equation (22.63) separates in a similar fashion, the vanishing of the second term inside braces in either 
of equations (22.62) or (22.63) requiring that 

$$ q_3 = q_1 . \tag{22.65} $$

The separated solutions for $\omega_x$ and $\omega_y$ are 

$$ \frac{d\omega_x}{dx} = 2 \sqrt{q_0 - 2 q_1 \omega_x - q_2 \omega_x^2} , \tag{22.66a} $$
$$ \frac{d\omega_y}{dy} = 2 \sqrt{q_2 + 2 q_1 \omega_y - q_0 \omega_y^2} . \tag{22.66b} $$

These are consistent with the separated solution (22.44) found for Einstein’s equations provided that 

$$ q_0 = f_0 g_0 , \qquad 2 q_1 = f_0 g_1 - f_1 g_0 , \qquad q_2 = f_1 g_1 . \tag{22.67} $$
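The identifications (22.67) can be read off by expanding the products under the square roots in equations (22.44) and comparing with equations (22.66), power by power. A short worked expansion, using only symbols already defined above:

```latex
\begin{align*}
(f_0 + f_1 \omega_x)(g_0 - g_1 \omega_x)
  &= f_0 g_0 - (f_0 g_1 - f_1 g_0)\, \omega_x - f_1 g_1\, \omega_x^2
   = q_0 - 2 q_1 \omega_x - q_2 \omega_x^2 , \\
(f_1 + f_0 \omega_y)(g_1 - g_0 \omega_y)
  &= f_1 g_1 + (f_0 g_1 - f_1 g_0)\, \omega_y - f_0 g_0\, \omega_y^2
   = q_2 + 2 q_1 \omega_y - q_0 \omega_y^2 .
\end{align*}
```

Both lines yield the same three identifications, so a single set of constants $q_0$, $q_1$, $q_2$ serves the radial and the angular equation at once.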


22.6 Λ-Kerr-Newman boundary conditions 


The electrovac solutions of physical interest are those that go over to asymptotically flat space far from the 
black hole, at least in the absence of a cosmological constant. The condition of being far from the black 
hole can be interpreted as meaning where the influence of the mass and charge of the black hole becomes 
negligible. Inspection of expression (22.51a) for the radial horizon function $\Delta_x$ shows that the effect of mass 
and charge becomes negligible where $f_0 + f_1 \omega_x \rightarrow 0$. Expression (22.39) for the separable conformal factor 
$\rho$ shows that the conformal factor diverges where $f_0 + f_1 \omega_x \rightarrow 0$, confirming that this location is indeed “at 
infinity.” 

The quantity $\omega_x$ is the angular velocity at which the tetrad frame (which has been chosen to align with the 
principal frame) moves through the coordinates. This follows from the fact that the tetrad-frame 4-velocity 
relative to itself is by definition $u^m = \{1, 0, 0, 0\}$, so the coordinate frame velocity of the tetrad frame is 
$u^\mu = e_m{}^\mu u^m = e_0{}^\mu$, so the angular velocity of the tetrad frame is $d\phi/dt = e_0{}^\phi / e_0{}^t = \omega_x$. If the tetrad 
frame is not rotating through the coordinates at infinity, then the angular velocity $\omega_x$ vanishes at infinity. 
Since infinity is where $f_0 + f_1 \omega_x$ vanishes, a tetrad frame that is corotating with the coordinates at infinity 
corresponds to $f_0 = 0$. Below, equations (22.76), it is shown that the situation where $f_0$ is non-zero differs 
by a coordinate transformation from the case where $f_0$ is zero. Thus $f_0$ may be set equal to zero without 
loss of generality. 

Further conditions follow from requiring that the metric coefficients $g_{t\phi}$ and $g_{\phi\phi}$, equations (22.4), vanish 
at the poles of the rotation axis, $\theta = 0$ and $\pi$, to avoid singular behaviour at the poles. The vanishing of $g_{t\phi}$ 
and $g_{\phi\phi}$ at the poles requires that both $\omega_y$ and $\Delta_y$ must vanish at the poles. 

Connection with familiar polar coordinates $\{r, \theta, \phi\}$ may be established by requiring that the metric 
coefficients (22.4) go over to their asymptotic expressions in the absence of a cosmological constant or NUT 
parameter, 

$$ g_{tt} \rightarrow -1 , \qquad g_{t\phi} \rightarrow 0 , \qquad g_{\phi\phi} \rightarrow r^2 \sin^2\theta \qquad \text{as } r \rightarrow \infty . \tag{22.68} $$

The expressions (22.51) for the horizon functions $\Delta_x$ and $\Delta_y$ then imply that, in the absence of a cosmological 
constant or NUT parameter, 

$$ \Delta_y = \frac{\omega_y}{a} = \sin^2\theta , \qquad \Delta_x \rightarrow \frac{\omega_x}{a} \rightarrow \frac{1}{r^2} \qquad \text{as } r \rightarrow \infty , \tag{22.69} $$

where $a \equiv 1/(f_1 k_0)$ is some constant, which proves to be the familiar spin parameter, and an overall 
normalization has been fixed by scaling the conformal factor to $\rho \rightarrow r$ at infinity, a natural choice. The 
normalization of $\rho$ fixes $f_1 = a^{-1/2}$. 

Integrating the relation (22.44b) between $\omega_y$ and $y$ with $f_0 = 0$ establishes that $\omega_y$ is quadratic in $y$. 
Requiring that the polar part of the metric $g_{yy} \, dy^2 = \rho^2 dy^2 / \Delta_y$ be non-singular at the poles implies that $y$ 
is proportional to $\cos\theta$ plus a constant that can be set to zero without loss of generality. Requiring that the 
polar metric go over to its asymptotic expression $g_{yy} \, dy^2 \rightarrow r^2 d\theta^2$ as $r \rightarrow \infty$ fixes the normalization 

$$ y = - \cos\theta , \tag{22.70} $$

where the sign is chosen so that $d\omega_y/dy$ has the same sign as $\rho_y$ (eq. 22.75), in accordance with 
equation (22.44b). Expression (22.70) can be imposed also in the presence of a cosmological constant and a NUT 
parameter. For Λ-Kerr-Newman with no NUT parameter, 

$$ \omega_y = a \sin^2\theta . \tag{22.71} $$

Equation (22.71) is not true if the NUT parameter is non-vanishing, a case deferred to §22.7. The 
expressions (22.70) and (22.71) are consistent with the relation (22.44b) between them provided that $g_1 = a g_0$ and 
$g_0 = a / f_1$. The complete set of constants in equations (22.51) is 

$$ f_0 = 0 , \quad f_1 = a^{-1/2} , \quad g_0 = a^{3/2} , \quad g_1 = a^{5/2} , \quad k_0 = a^{-1/2} , \quad k_1 = 0 . \tag{22.72} $$

The radial variable $x$ analogous to the angular variable $y$ of equation (22.70) comes from solving 
equation (22.44a), $d\omega_x/dx = 2 \sqrt{a \omega_x (1 - a \omega_x)}$, which gives 

$$ x = \frac{1}{a} \operatorname{asin} \sqrt{a \omega_x} . \tag{22.73} $$
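The solution (22.73) can be verified by inverting it to $\omega_x = \sin^2(ax)/a$ and differentiating. The following is a minimal finite-difference sketch; the values of $a$, $x$, and the step $h$ are arbitrary illustrative choices with $0 < ax < \pi/2$, so that the positive square root applies.

```python
import math

def omega_x_of_x(x, a):
    """Inverse of equation (22.73): omega_x = sin^2(a x) / a."""
    return math.sin(a * x)**2 / a

def x_of_omega_x(w, a):
    """Equation (22.73): x = asin(sqrt(a omega_x)) / a."""
    return math.asin(math.sqrt(a * w)) / a

a, x, h = 0.7, 0.9, 1e-6          # illustrative values only
w = omega_x_of_x(x, a)

# Left and right hand sides of d(omega_x)/dx = 2 sqrt(a omega_x (1 - a omega_x)).
dwdx = (omega_x_of_x(x + h, a) - omega_x_of_x(x - h, a)) / (2.0 * h)
rhs = 2.0 * math.sqrt(a * w * (1.0 - a * w))

# With r = a cot(a x), equation (22.6), a / omega_x should equal r^2 + a^2.
r = a / math.tan(a * x)
R2_mismatch = a / w - (r * r + a * a)

print(dwdx, rhs, x_of_omega_x(w, a), R2_mismatch)
```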


A pair of radial and angular variables that emerged naturally from the analysis, besides $x$ and $y$, are 
$\rho_x$ and $\rho_y$ defined by equation (22.39b). In terms of $\omega_x$, the radial variable is $\rho_x = \sqrt{a (1 - a \omega_x) / \omega_x}$. It is 
conventional to define the radial coordinate $r$ to be equal to $\rho_x$, which is consistent with the asymptotic 
behaviour $\rho \rightarrow r$ as $r \rightarrow \infty$, in which case 

$$ \omega_x = \frac{a}{R^2} , \qquad R \equiv \sqrt{r^2 + a^2} . \tag{22.74} $$

The radial and angular variables $\rho_x$ and $\rho_y$ are 

$$ \rho_x = r , \qquad \rho_y = a \cos\theta . \tag{22.75} $$

This completes the derivation of the Λ-Kerr-Newman solutions. 


22.6.1 There are no separable electrovac solutions that rotate at infinity 


The Λ-Kerr-Newman boundary conditions in §22.6 took $f_0 = 0$, corresponding to the situation where the 
tetrad-frame is corotating with the coordinates at infinity, $\omega_x \rightarrow 0$ as $r \rightarrow \infty$. What happens if $f_0$ is non-zero? 
If $f_0$ is non-zero, then the tetrad rotates through the coordinates with some constant finite angular velocity 
$\omega_\infty$ at infinity. As argued at the beginning of §22.6, infinity is where $f_0 + f_1 \omega_x = 0$. Thus a finite angular 
velocity at infinity corresponds to $f_0 = - f_1 \omega_\infty$. However, the apparent rotation at infinity can be removed 
by transforming the azimuthal coordinate $\phi$ so that it corotates at infinity. The line-element can then be 
brought to standard electrovac form with $f_0 = 0$ by a coordinate transformation of the angular coordinate 
$y$. Specifically, the coordinate transformations 


P=G+ Wot, dy’ =(1—wowy) dy , (22.76) 


bring the line-element with $\omega_\infty \neq 0$ to the standard separable electrovac form with $\omega_\infty = 0$, 


Az d 2 d 12 A’ 
ds? = è |- a (at wl db)” + — + + aaay (ae wi dt)’| , (22.77) 
ay me y 7 Matty 
with primed quantities 
Po = a Wy A’ = Ay 292 
Wy E Wx — Woo, Wy = 7— TA nica (22.78) 


Notice that the physical location of north and south poles, at $\omega_y = \Delta_y = 0$, is unchanged by the choice of 
coordinates. 
Thus, among separable electrovac solutions, there are no solutions that physically rotate at infinity. 
22.7 Taub-NUT geometry 


Spacetimes with a finite NUT parameter $N_\bullet$ were discovered by Taub (1951) and Newman, Tamburino, and 
Unti (1963). A nice review of the problematic nature of such spacetimes is given by Kagramanova et al. 
(2010). 

Black holes with a finite NUT parameter $N_\bullet$ have the property that the coefficient $\omega_y$, equation (22.83), 
does not vanish at one or both poles. As usual for a well-behaved azimuthally symmetric spacetime, when 
a geodesic passes infinitesimally close to a pole, the azimuthal angle $\phi$ along the geodesic jumps by $\pm\pi$, the 
sign depending on which side of the pole the geodesic is deemed to pass; but since the angle $\phi$ is periodic in 
$2\pi$, the ambiguity in sign does not lead to a net ambiguity in the angle $\phi$. The problem for NUT spacetimes, 
where $\omega_y$ does not vanish, is that when a geodesic passes infinitesimally close to a pole, the time coordinate 
$t$ also jumps by $\pm \omega_y \pi$. The value of $\omega_y$ at a pole, $\theta = 0$ or $\pi$, is 

$$ \omega_y = 2 N_\bullet (c_\bullet \mp 1) , \tag{22.79} $$

where the $\mp$ sign on the right hand side is $-$ at the north pole $\theta = 0$, and $+$ at the south pole $\theta = \pi$. The 
jump in the time coordinate is problematic, because it means that a particle passing through a pole leaps 
forwards or backwards in time, the choice of forwards or backwards depending on which side of the pole the 
geodesic is deemed to pass. Misner (1963) argued that the discontinuity in time could be solved by making 
the time coordinate $t$ periodic with period $2\pi |\omega_y|$. A periodic time coordinate does not of course describe 
the real Universe. 

Associated with the discontinuity in the time coordinate $t$ around a pole, black holes with a finite NUT 
parameter $N_\bullet$ have closed timelike curves circulating around one (if $c_\bullet = \pm 1$) or both (if $c_\bullet \neq \pm 1$) polar 
axes, so violate causality. NUT black holes are fun, but not physically realistic. 

What accounts for the singular behaviour along poles? The answer is that there is a string of torsion along 
each pole. As described in §2.19.2 and examined further in §16.17, the Riemann and torsion tensors are 
the fields associated with the two gauge groups of general relativity, the Lorentz group and the translation 
group. The Riemann and torsion tensors describe how a frame respectively Lorentz-transforms and translates 
when parallel-transported around an infinitesimal loop. Torsion is sourced by spin angular-momentum $\Sigma_{lmn}$, 
§16.11. Although classic general relativity assumes that torsion vanishes, torsion should not be dismissed 
summarily, because spinor fields do carry spin angular-momentum that generates torsion, Exercise 16.5. 
However, the spinors familiar in the real world, such as electrons, are point-like and massive, whereas NUT 
strings are string-like and massless. 


22.7.1 Taub-NUT line-element 


When the NUT parameter N_• is finite, and with the boundary conditions that there are (north and south)
poles at Δ_y = 0, the conditions (22.72) generalize to


fi=0; hsa, wea, gee, (22.80) 
ko = a7"? [1 — TA(a? — 2ce Ne + N2)], ky =—2a79/°N, [Ne + ace + 207AN, (C2 1)] , 
Figure 22.1 Geometry of a Kerr-NUT black hole, with NUT parameters N_• = 0.75M_• and c_• = −1, and spin parameter
a = 1.2M_•. The outer sisytube encircles the northern polar axis outside the outer ergosphere, while the inner sisytube
encircles the extension of the northern axis into the Antiverse inside the inner ergosphere. The choice c_• = −1
means that there is no sisytube around the southern polar axis. Dashed purple lines mark the boundaries g_tt = 0 of
ergospheres, while green (between the ergospheres) and cyan (outside the ergospheres) lines mark g_φφ = 0.


where

b = \sqrt{ a^2 + 2 a c_\bullet N_\bullet + N_\bullet^2 } .   (22.81)

Besides the NUT parameter N_•, there is an additional constant, the auxiliary NUT parameter c_•.
The resulting Taub-NUT line-element takes the separable form (22.1), with coefficients as follows. The
radial and angular parts ρ_x and ρ_y of the conformal factor ρ = √(ρ_x² + ρ_y²) are

\rho_x = r = b \cot(b x) , \quad \rho_y = N_\bullet + a \cos\theta = N_\bullet - a y .   (22.82)


If |N_•| < |a|, then there is a ring singularity where the Weyl curvature (22.88) diverges, at ρ_x = ρ_y = 0,
corresponding to r = 0 and cos θ = −N_•/a. There is no singularity if |N_•| > |a|. The coefficients ω_x and ω_y
are

\omega_x = \frac{a}{R^2} , \quad R \equiv \sqrt{r^2 + b^2} = \frac{b}{\sin(b x)} , \quad \omega_y = a \sin^2\theta + 2 N_\bullet ( c_\bullet - \cos\theta ) .   (22.83)

Figure 22.2 Similar to Figure 22.1, but with c_• = 0 instead of c_• = −1. Sisytubes encircle both the north (solid cyan
lines) and south (dashed cyan lines) polar axes.

The radius R is everywhere positive, even if b is imaginary, so ω_x can also be taken to be everywhere positive
(if a is negative, flip the poles, y → −y, to make a positive). Notice that the NUT parameter N_• breaks
spherical symmetry even if the black hole is non-rotating, a = 0, because ω_y, equation (22.83), cannot vanish
at both poles as long as the NUT parameter N_• is non-zero. The combination 1 − ω_xω_y satisfies

1 - \omega_x \omega_y = \frac{\rho^2}{R^2} ,   (22.84)

which is always positive. The horizon and polar functions Δ_x and Δ_y are

\Delta_x = \frac{1}{R^4} \left[ r^2 - 2 M_\bullet r + a^2 + Q_\bullet^2 + \mathcal{Q}_\bullet^2 - N_\bullet^2 - \tfrac{1}{3} \Lambda \left( r^2 + (a - N_\bullet)^2 \right) \left( r^2 + (a + N_\bullet)^2 \right) \right] ,   (22.85a)

\Delta_y = \sin^2\theta \left[ 1 - \tfrac{1}{3} a^2 \Lambda \sin^2\theta + \tfrac{4}{3} \Lambda N_\bullet ( N_\bullet + a \cos\theta ) \right] .   (22.85b)
Poles occur where Δ_y = 0, that is, at θ = 0 or π. Horizons occur where Δ_x = 0. For vanishing Λ, there are
outer and inner horizons at

r_\pm = M_\bullet \pm \sqrt{ M_\bullet^2 + N_\bullet^2 - Q_\bullet^2 - \mathcal{Q}_\bullet^2 - a^2 } .   (22.86)
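As a quick numerical illustration, the sketch below evaluates the horizon radii for the parameters quoted in the caption of Figure 22.1. It assumes that equation (22.86) reads r_± = M_• ± √(M_•² + N_•² − Q_•² − 𝒬_•² − a²); the function name `nut_horizons` and its argument names are illustrative, not taken from the text.

```python
import math

def nut_horizons(M, a, N, Q_e=0.0, Q_m=0.0):
    """Outer and inner horizon radii (r_plus, r_minus) for vanishing Lambda.

    Assumes equation (22.86) in the form
    r_pm = M +/- sqrt(M^2 + N^2 - Q_e^2 - Q_m^2 - a^2).
    Returns None if the square root is imaginary (no horizon).
    """
    disc = M**2 + N**2 - Q_e**2 - Q_m**2 - a**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return M + root, M - root

# Parameters of Figure 22.1, in units M = 1: a = 1.2, N = 0.75, uncharged.
horizons = nut_horizons(M=1.0, a=1.2, N=0.75)
print("Kerr-NUT horizons:", horizons)

# The same spin without a NUT parameter exceeds the Kerr bound a <= M.
print("Kerr with a = 1.2:", nut_horizons(M=1.0, a=1.2, N=0.0))
```

Under this reading, the NUT parameter adds to M_•² under the square root, so a black hole with a > M_• can still have horizons provided N_• is large enough.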


The horizon and polar functions Δ_x and Δ_y given by equations (22.85) do not agree with the earlier
expressions (22.9) for non-zero Λ and vanishing N_•, but this is not a misprint. The difference arises from an
arbitrariness in the choice of homogeneous solution (the choice of k_0) when a cosmological constant Λ is
present, equations (22.51). The tetrad-frame electromagnetic potential A_k is


1 A . O+N. 
aall a jg Se | (22.87) 
p R? V Az ay/ Ay 
The Weyl tensor has only a spin 0 component, and is, generalizing equation (22.12) for A-Kerr-Newman, 
1 2 2 
faem (1 LIN = Lei) (22.88) 
(Px — Ipy) Px + Ipy 


22.7.2 Sisytubes in Taub-NUT 


Sisytubes, containing closed timelike curves, occur in regions where g_φφ < 0, Exercise 23.4. A sisytube
encircles any pole where ω_y fails to vanish, since along poles, from equations (22.4) with Δ_y = 0,

g_{\phi\phi} = - \frac{\rho^2 \Delta_x \omega_y^2}{(1 - \omega_x \omega_y)^2} ,   (22.89)

which is negative outside the horizon, Δ_x > 0, unless ω_y vanishes. If the NUT parameter N_• is non-zero,
then generically sisytubes encircle both poles, but for the special cases c_• = ±1, a sisytube encircles only one
of the two poles. The conclusion holds even when the black hole spin is zero, a = 0. For Λ = 0, the sisytube
tends to a cylinder of constant radius at large distances from the black hole,

|r| \sin\theta \rightarrow |\omega_y| \rightarrow | 2 N_\bullet ( c_\bullet \pm 1 ) | \quad \mbox{as } r \rightarrow \pm\infty ,   (22.90)

in which the sign of ±1 is the sign of y, namely + at the south pole, − at the north pole.

The critical velocity v_c at which there is a closed timelike curve is calculated in Exercise 23.4,
equation (23.47). The sign of the critical velocity v_c is minus the sign of ω_y along its polar axes, which is the sign
of −N_•(c_• ∓ 1).

Figure 22.1 illustrates the geometry for an uncharged Kerr-NUT black hole with N_• = 0.75M_•, c_• = −1,
and spin a = 1.2M_•. Since c_• = −1, a sisytube encircles the north pole but not the south pole. The sign of
ω_y is negative along the north pole, so closed timelike curves circulate prograde.

Figure 22.2 is a Kerr-NUT black hole with the same parameters, except that c_• = 0 in place of c_• = −1.
Here sisytubes enclose both north and south poles. The shapes of the sisytubes differ between north and
south poles despite c_• = 0. The north versus south asymmetry comes from the sign of the NUT parameter
N_•, which affects the polar function Δ_y, equation (22.85b). Closed timelike curves circulate the north pole
prograde, the south pole retrograde.
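The statements about which poles carry a sisytube, and in which sense the closed timelike curves circulate, can be checked by evaluating ω_y on the two polar axes. The sketch below assumes the expression ω_y = a sin²θ + 2N_•(c_• − cos θ) of equation (22.83); the helper name `omega_y` is illustrative.

```python
import math

def omega_y(theta, a, N, c):
    """omega_y = a sin^2(theta) + 2 N (c - cos(theta)), assumed from eq. (22.83)."""
    return a * math.sin(theta)**2 + 2.0 * N * (c - math.cos(theta))

# Parameters of Figures 22.1 (c = -1) and 22.2 (c = 0), in units M = 1.
for c in (-1.0, 0.0):
    north = omega_y(0.0, a=1.2, N=0.75, c=c)
    south = omega_y(math.pi, a=1.2, N=0.75, c=c)
    # A sisytube encircles a pole wherever omega_y is non-zero there; for
    # Lambda = 0 its asymptotic cylindrical radius is |omega_y|, eq. (22.90).
    print(f"c = {c:+.0f}: omega_y(north) = {north:+.3f}, omega_y(south) = {south:+.3f}")
```

For c_• = −1 only the north pole has non-zero ω_y, and it is negative, consistent with prograde circulation there. For c_• = 0 the two poles have ω_y of equal magnitude and opposite sign.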
22.7.3 Can the auxiliary NUT parameter c_• be adjusted by a coordinate transformation?


In §22.6.1 it was seen that, in the separable electrovac spacetimes being considered, any apparent rotation (of 
the tetrad frame through the coordinates) at infinity can be eliminated by a coordinate transformation (22.76) 
of the angular coordinates ¢ and y. 

It might seem that a similar coordinate transformation of the t and x coordinates by 


t=t+uod, dz’ = (1-— wrwo) da , (22.91) 
would bring the line-element to the standard separable electrovac form (22.77) with primed quantities 
Az 
(1 — WW)? i 


The coordinate transformation (22.91) would then allow the auxiliary NUT parameter c_• to be adjusted
arbitrarily. For example, c_• could be set to ±1, or 0, or whatever other value one might prefer.

Ordinarily the choice of c_• would be dictated by physical reasons, which in the present case would mean
the absence of sisytubes. Indeed, a sisytube at the north pole can be eliminated by setting c_• = 1; but then
there is a sisytube at the south pole. Likewise, a sisytube at the south pole can be eliminated by setting
c_• = −1; but then there is a sisytube at the north pole. One might perhaps choose c_• = 0 as the most
symmetric choice, but this still leaves the north-south asymmetry coming from the sign of N_•, as illustrated
in Figure 22.2. Evidently the problems of the Taub-NUT spacetime are fundamentally topological, and
unavoidable.


A= (22.92) 


Ps F 
Wy SWy— Wo, Wy 


Actually, the coordinate transformation (22.91) cannot be made freely, since it already encodes topological
information. That is, axisymmetric identification φ → φ + 2π at fixed time t differs from axisymmetric
identification at transformed time t′ = t + ω_0 φ. Ordinarily the preferred time coordinate would be dictated
by physical reasons, but again all choices are unphysical.


Figure 23.1 Silhouettes (black curves) of Kerr black holes with various spin parameters a, from left to right a = 0, 
0.8, 0.96, and 1 (units M = 1), as observed in the equatorial plane from a far distance. The (red) ellipses show the 
horizons of the black holes as an indication of what the black holes would look like without any gravitational lensing. 
The silhouette is compressed on the approaching side and expanded on the receding side. See §23.14. 


In the previous Chapter 22, the form of the Kerr-Newman line-element and its cousins was derived from the 
condition that geodesics are Hamilton-Jacobi separable. In this Chapter, the Hamilton-Jacobi equations are 
separated, and the trajectories of neutral and charged particles in the Kerr-Newman geometry are explored. 


23.1 Hamilton-Jacobi equation 


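The Hamilton-Jacobi equation for a particle of mass m and electric charge q in the Λ-Kerr-Newman geometry
can be brought to a simple form (23.8) by writing the covariant tetrad-frame momentum p_k of a particle in
terms of a set of Hamilton-Jacobi parameters P_k,

p_k = \frac{1}{\rho} \left\{ \frac{P_t}{\sqrt{\Delta_x}} , \frac{P_x}{\sqrt{\Delta_x}} , \frac{P_y}{\sqrt{\Delta_y}} , \frac{P_\phi}{\sqrt{\Delta_y}} \right\} ,   (23.1)

and the covariant tetrad-frame electromagnetic potential A_k in terms of a set of Hamilton-Jacobi potentials
𝒜_k,

A_k = \frac{1}{\rho} \left\{ \frac{\mathcal{A}_t}{\sqrt{\Delta_x}} , \frac{\mathcal{A}_x}{\sqrt{\Delta_x}} , \frac{\mathcal{A}_y}{\sqrt{\Delta_y}} , \frac{\mathcal{A}_\phi}{\sqrt{\Delta_y}} \right\} ,   (23.2)

given by equation (22.10), which in turn follow from equations (22.55) and (22.61),

\mathcal{A}_k = \left\{ - \frac{Q_\bullet r}{R^2} , 0 , 0 , - \mathcal{Q}_\bullet \cos\theta \right\} ,   (23.3)

with Q_• and 𝒬_• respectively the electric and magnetic charge of the black hole. The contravariant coordinate
momenta dx^κ/dλ = e_k^κ p^k are related to the Hamilton-Jacobi parameters P_k by

\frac{dx^\kappa}{d\lambda} = \frac{1}{\rho^2} \left\{ - \frac{P_t}{\Delta_x} + \frac{\omega_y P_\phi}{\Delta_y} , \ - P_x , \ P_y , \ - \frac{\omega_x P_t}{\Delta_x} + \frac{P_\phi}{\Delta_y} \right\} .   (23.4)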


The tetrad-frame momenta p_k are related to the generalized momenta π_κ by p_k = e_k^κ π_κ − qA_k, which implies
that the Hamilton-Jacobi parameters P_k are related to the canonical momenta π_κ by

P_t = \pi_t + \pi_\phi \omega_x - q \mathcal{A}_t ,   (23.5a)
P_x = - \Delta_x \pi_x - q \mathcal{A}_x ,   (23.5b)
P_y = \Delta_y \pi_y - q \mathcal{A}_y ,   (23.5c)
P_\phi = \pi_\phi + \pi_t \omega_y - q \mathcal{A}_\phi .   (23.5d)

Time translation symmetry and axisymmetry imply that π_t and π_φ are constants of motion, equation (22.21),

\pi_t = - E , \quad \pi_\phi = L .   (23.6)

The separability conditions derived in §22.3 imply that

P_t , P_x are functions of x only ,
P_y , P_\phi are functions of y only .   (23.7)

In terms of the Hamilton-Jacobi parameters P_k, the Hamilton-Jacobi equation (22.18) is

\frac{- P_t^2 + P_x^2}{\Delta_x} + \frac{P_y^2 + P_\phi^2}{\Delta_y} = - m^2 \rho^2 .   (23.8)


Separability for massive particles, m ≠ 0, requires that the conformal factor ρ separate as equation (22.2).
The Hamilton-Jacobi equation (23.8) then separates as

- \left( \frac{- P_t^2 + P_x^2}{\Delta_x} + m^2 \rho_x^2 \right) = \frac{P_y^2 + P_\phi^2}{\Delta_y} + m^2 \rho_y^2 = K ,   (23.9)

with K a separation constant, the Carter constant. The separated Hamilton-Jacobi equations (23.9) imply
that

P_x = \pm \sqrt{ P_t^2 - ( K + m^2 \rho_x^2 ) \Delta_x } ,   (23.10a)

P_y = \pm \sqrt{ - P_\phi^2 + ( K - m^2 \rho_y^2 ) \Delta_y } .   (23.10b)
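The separation can be sanity-checked numerically: for any values of the inputs, the squared parameters defined by equations (23.10) should satisfy the unseparated Hamilton-Jacobi equation (23.8) identically. The sketch below assumes the forms P_x² = P_t² − (K + m²ρ_x²)Δ_x and P_y² = −P_φ² + (K − m²ρ_y²)Δ_y, with ρ² = ρ_x² + ρ_y²; the function names are illustrative.

```python
def separated_potentials(P_t, P_phi, K, m, rho_x, rho_y, Delta_x, Delta_y):
    """Squared parameters (P_x^2, P_y^2), assumed from equations (23.10)."""
    P_x_sq = P_t**2 - (K + m**2 * rho_x**2) * Delta_x
    P_y_sq = -P_phi**2 + (K - m**2 * rho_y**2) * Delta_y
    return P_x_sq, P_y_sq

def hamilton_jacobi_residual(P_t, P_x_sq, P_y_sq, P_phi, m,
                             rho_x, rho_y, Delta_x, Delta_y):
    """Left minus right hand side of equation (23.8); zero if it is satisfied."""
    lhs = (-P_t**2 + P_x_sq) / Delta_x + (P_y_sq + P_phi**2) / Delta_y
    rhs = -m**2 * (rho_x**2 + rho_y**2)
    return lhs - rhs

# Arbitrary illustrative values; they are not tied to a particular black hole.
args = dict(P_t=-2.0, P_phi=0.5, K=3.0, m=1.0,
            rho_x=2.0, rho_y=0.3, Delta_x=0.1, Delta_y=0.9)
P_x_sq, P_y_sq = separated_potentials(**args)
residual = hamilton_jacobi_residual(args["P_t"], P_x_sq, P_y_sq, args["P_phi"],
                                    args["m"], args["rho_x"], args["rho_y"],
                                    args["Delta_x"], args["Delta_y"])
print(P_x_sq, P_y_sq, residual)
```

The Carter constant K cancels between the radial and angular parts, which is why the residual vanishes whatever value of K is chosen.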


From the expression (23.4) for the coordinate momenta dx^κ/dλ, the trajectory of a freely-falling particle
follows from integrating dy/dx = −P_y/P_x, equivalent to the implicit equation

- \int \frac{dx}{P_x} = \int \frac{dy}{P_y} .   (23.11)

Again from expression (23.4), the time and azimuthal coordinates t and φ along the trajectory are then
obtained by quadratures,

t = \int \frac{P_t \, dx}{P_x \Delta_x} + \int \frac{\omega_y P_\phi \, dy}{P_y \Delta_y} , \quad \phi = \int \frac{\omega_x P_t \, dx}{P_x \Delta_x} + \int \frac{P_\phi \, dy}{P_y \Delta_y} .   (23.12)

Again from expression (23.4), the affine parameter λ along the trajectory satisfies dλ/ρ² = −dx/P_x = dy/P_y,
so similarly reduces to quadratures,

d\lambda = - \frac{\rho_x^2 \, dx}{P_x} + \frac{\rho_y^2 \, dy}{P_y} .   (23.13)

In the limiting case of trajectories at constant latitude y, where dy/P_y is zero divided by zero, expressions
for t, φ, and λ along the trajectory are obtained by replacing dy/P_y → −dx/P_x in equations (23.12) and
(23.13). Similarly for circular trajectories, where dx/P_x is zero divided by zero, expressions for t, φ, and λ
along the trajectory are obtained by replacing dx/P_x → −dy/P_y.


23.2 Particle with magnetic charge 


The above Hamilton-Jacobi equations were for a test particle of mass m and electric charge q, but no magnetic
charge. Whereas electric charge is a scalar, magnetic charge is a pseudoscalar. Equations of motion for a
magnetic charge are obtained by taking the Hodge dual of those for an electric charge, effectively swapping
the roles of the electric and magnetic fields. The Hodge dual of the electromagnetic field (22.11), obtained
by multiplying by the pseudoscalar I, equation (13.24), is the same expression with the electric Q_• and
magnetic 𝒬_• charges of the black hole exchanged according to

Q_\bullet \rightarrow - \mathcal{Q}_\bullet , \quad \mathcal{Q}_\bullet \rightarrow Q_\bullet .   (23.14)

Coupling the pseudoscalar magnetic charge to the dual electromagnetic field gives an extra minus sign,
I² = −1. Thus the Hamilton-Jacobi equations generalize to a particle with both electric charge q_e and
magnetic charge q_m by replacing

q Q_\bullet \rightarrow q_e Q_\bullet + q_m \mathcal{Q}_\bullet , \quad q \mathcal{Q}_\bullet \rightarrow q_e \mathcal{Q}_\bullet - q_m Q_\bullet ,   (23.15)

in the expressions (23.5a) and (23.5d) for P_t and P_φ. The particle magnetic charge q_m is set to zero hereafter;
it can be reincorporated by making the transformation (23.15) of constants.


23.3 Killing vectors and Killing tensor 


The Kerr-Newman geometry is stationary and axisymmetric. As such it has two Killing vectors e_t and e_φ,
§7.32. The symmetries imply conservation of energy E = −π_t and azimuthal angular momentum L = π_φ of
freely-falling particles, equations (22.21).

The separability of the Kerr-Newman geometry means that it also has a Killing tensor K^{μν}. The
Hamilton-Jacobi equation (23.9) can be written in terms of the tetrad-frame momenta p_k, equation (23.1), as

K^{mn} p_m p_n = K ,   (23.16)

where K^{mn} is

K^{mn} = {\rm diag} \left( \rho_y^2 , \ - \rho_y^2 , \ \rho_x^2 , \ \rho_x^2 \right) .   (23.17)

The Killing tensor K^{mn} satisfies Killing’s equation

D_{(k} K_{mn)} = 0 .   (23.18)


23.4 Turnaround 


The squared Hamilton-Jacobi parameters P_x² and P_y² can be regarded as effective radial and angular
potentials. The coordinates x and y of a freely-falling particle are constrained to move within the regions where
the potentials P_x² and P_y² are positive. The trajectory of a freely-falling particle turns around in x where
P_x = 0, and turns around in y where P_y = 0. That trajectories turn around at these points can be seen from
equation (23.11), which with the expressions (23.10) for P_x and P_y can be written

\frac{d\lambda}{\rho^2} = \frac{- dx}{\pm \sqrt{ P_t^2 - ( K + m^2 \rho_x^2 ) \Delta_x }} = \frac{dy}{\pm \sqrt{ - P_\phi^2 + ( K - m^2 \rho_y^2 ) \Delta_y }} .   (23.19)

At points where the polar function vanishes, Δ_y = 0, the Hamilton-Jacobi equation (23.9) implies that

P_y = P_\phi = 0 \quad \mbox{at } \Delta_y = 0 .   (23.20)

Consequently trajectories must turn around in y if they hit Δ_y = 0. Since the Weyl curvature is finite at
Δ_y = 0, there is no singularity at Δ_y = 0. Rather, the points where Δ_y vanishes define the (north and south)
poles of the geometry. Trajectories can pass through the poles, but they must turn around in latitude y when
they do so.
Table 23.1: Signs of P_t and P_x in various regions of the Kerr-Newman geometry

Region                                                      Sign
Universe, Wormhole, Antiverse                               P_t < 0
Parallel Universe, Parallel Wormhole, Parallel Antiverse    P_t > 0
Black Hole                                                  P_x < 0
White Hole                                                  P_x > 0
Horizon, Inner Horizon                                      P_t = P_x < 0
Parallel Horizon, Parallel Inner Horizon                    −P_t = P_x < 0
Antihorizon, Inner Antihorizon                              −P_t = P_x > 0
Parallel Antihorizon, Parallel Inner Antihorizon            P_t = P_x > 0


23.5 Constraints on the Hamilton-Jacobi parameters P_t and P_x


Horizons divide the spacetime into regions where the Hamilton-Jacobi parameters P_t and P_x satisfy certain
conditions. The Hamilton-Jacobi equation (23.8) rearranges to

P_t^2 - P_x^2 = \left( \frac{P_y^2 + P_\phi^2}{\Delta_y} + m^2 \rho^2 \right) \Delta_x .   (23.21)

This shows that the Hamilton-Jacobi parameters P_t and P_x must satisfy

|P_t| > |P_x|  if Δ_x > 0 ,
|P_t| = |P_x|  if Δ_x = 0 ,   (23.22)
|P_t| < |P_x|  if Δ_x < 0 .

The Hamilton-Jacobi parameters must be continuous, including across horizons. Thus P_t must have the same
sign everywhere throughout any connected region where Δ_x is positive, which in the Kerr-Newman geometry
means either outside the outer horizon or inside the inner horizon. Similarly P_x must have the same sign
everywhere throughout any connected region where Δ_x is negative, which in the Kerr-Newman geometry
means between the outer and inner horizons.

Outside the outer horizon, in the Universe region of the Kerr-Newman geometry, Figure 9.6, the time
parameter P_t must be negative, reflecting the fact that the time coordinate t must be timelike and increasing
with the proper time of any particle. The radial parameter P_x can be either positive (outfalling) or negative
(infalling).

Inside the outer horizon, in the Black Hole region of the geometry, the radial parameter P_x must be
negative, reflecting the fact that the radius is timelike and decreasing with the proper time of any particle.
The time parameter P_t can be either positive (outgoing) or negative (ingoing).

Particles that cross the outer horizon are necessarily infalling and ingoing at the horizon, with P_t = P_x
negative. The Hamilton-Jacobi parameters are finite and continuous across the horizon. The expression (23.1)
shows that the tetrad-frame momenta p_0 and p_1 are proportional to 1/√Δ_x, and therefore diverge at the
horizon, where Δ_x = 0. The divergence is the origin of the inflationary instability at the inner horizon
discussed in Chapter 24.

Table 23.1 lists the constraints on the Hamilton-Jacobi parameters P_t and P_x in each of the regions of the
Penrose diagram of Figure 9.6.


23.6 Principal null congruences 


The middle expression of equation (23.9) shows that the Carter constant K is necessarily positive. The
vanishing of the Carter constant,

K = 0 ,   (23.23)

defines a special set of geodesics, called the principal outgoing and ingoing null congruences. A
congruence is a space-filling, non-overlapping set of geodesics. The geodesics on the principal congruences are
null, m = 0, and satisfy

P_y = P_\phi = 0 .   (23.24)

They further satisfy P_t² = P_x². Outgoing and ingoing geodesics are distinguished by the relative signs of P_t
and P_x,

P_t = - P_x   outgoing ,
P_t = P_x   ingoing .   (23.25)

Photons that hold steady on the horizon are members of the outgoing principal null congruence.
The condition P_φ = 0 implies that the ratio of angular momentum L = π_φ to energy E = −π_t on the
principal null congruences is

\frac{L}{E} = \omega_y .   (23.26)
The affine parameter A along a principal null congruence satisfies 
2d 2d d 
dx LZ- LA L d (23.27) 


~ P/E 1— wry 7 V fog + figo(fi + fowy) ’ 


where f; and g; are the constants of the general electrovac solution, §22.4. As argued in §22.6.1, a coordinate 
transformation allows the constant fo to be set to zero without loss of generality. Thus the affine parameter, 
which is defined only up to a normalization and a shift, along the principal null congruences can be taken 
to be 


Azar. (23.28) 


The line-element (22.1) defines a tetrad (the Boyer-Lindquist tetrad) that is aligned with the principal null
congruences. By definition, an object at rest in the tetrad frame has tetrad-frame 4-velocity u^m = {1, 0, 0, 0}.
The coordinate 4-velocity u^μ of the tetrad frame through the coordinates is

u^\mu = e_0{}^\mu = \frac{1}{\rho \sqrt{\Delta_x}} \{ 1 , 0 , 0 , \omega_x \} .   (23.29)

Thus the principal tetrad frame is at rest in x and y, but rotates through the coordinates at angular velocity
dφ/dt = ω_x about the black hole.


23.7 Carter integral Q 


It is common to replace the Carter constant K by the Carter integral Q defined by

K = Q + \left. \frac{P_\phi^2}{\Delta_y} \right|_{\rho_y = 0} ,   (23.30)

which has the property that Q = 0 for orbits in the equatorial plane, ρ_y = 0. For Λ-Kerr-Newman, the
Carter integral is

Q = K - ( L - a E )^2 .   (23.31)


Exercise 23.1. Boundary of the region between the horizons visible to an infaller at the inner 
horizon. Between the outer and inner horizons, all trajectories must fall inward. What geodesics have the 
largest angular motion between the horizons? Hence determine the boundary of the region between the 
horizons visible to an infaller who reaches the inner horizon. 
Solution. The boundary between regions visible and invisible to an infaller between the horizons is set by
photons at the border between outgoing and ingoing at the outer horizon, which is set by P_t = 0 at r = r_+,
that is, photons with azimuthal angular momentum

\frac{L}{E} = \left. \frac{1}{\omega_x} \right|_{r = r_+} = \frac{r_+^2 + a^2}{a} .   (23.32)

Trajectories with the largest angular motion between horizons have infinite Carter constant,

K \rightarrow \infty .   (23.33)

The Hamilton-Jacobi solution (23.11) then simplifies to

\frac{dx}{\sqrt{- \Delta_x}} = \pm \frac{dy}{\sqrt{\Delta_y}} .   (23.34)

For a Kerr-Newman black hole, the integrals in equation (23.34) are

\int \frac{dx}{\sqrt{- \Delta_x}} = 2 \, {\rm atan} \sqrt{ \frac{r_+ - r}{r - r_-} } , \quad \int \frac{dy}{\sqrt{\Delta_y}} = \theta .   (23.35)
Figure 23.2 Regions between the outer and inner horizons visible and invisible to an infaller who reaches the inner
horizon of a Kerr black hole. The black hole here has spin parameter a = 0.8M_•, and the infaller (blue line) falls with
zero angular momentum L = 0 along a trajectory at latitude θ = 45°. Compare to Figure 7.1 for a Schwarzschild black
hole.


The boundary of the region visible to an infaller who reaches the inner horizon at latitude θ_obs is

r = r_- + ( r_+ - r_- ) \sin^2 \frac{\theta - \theta_{\rm obs}}{2} ,   (23.36)

which is illustrated in Figure 23.2.
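The boundary can be tabulated directly. The sketch below assumes that equation (23.36) reads r = r_− + (r_+ − r_−) sin²[(θ − θ_obs)/2], and uses the Kerr horizon radii r_± = M ± √(M² − a²); the function names are illustrative.

```python
import math

def kerr_horizons(M, a):
    """Kerr horizon radii r_pm = M +/- sqrt(M^2 - a^2), for |a| <= M."""
    root = math.sqrt(M**2 - a**2)
    return M + root, M - root

def visible_boundary_radius(theta, theta_obs, r_plus, r_minus):
    """Boundary radius at polar angle theta, assumed from equation (23.36)."""
    return r_minus + (r_plus - r_minus) * math.sin(0.5 * (theta - theta_obs))**2

# Parameters of Figure 23.2, in units M = 1: a = 0.8, infaller at theta = 45 deg.
r_plus, r_minus = kerr_horizons(M=1.0, a=0.8)
theta_obs = math.radians(45.0)
for offset_deg in (0, 45, 90, 135, 180):
    theta = theta_obs + math.radians(offset_deg)
    r = visible_boundary_radius(theta, theta_obs, r_plus, r_minus)
    print(f"theta - theta_obs = {offset_deg:3d} deg: r = {r:.4f}")
```

The boundary touches the inner horizon at the infaller's own latitude and reaches the outer horizon only at an angular separation of 180°.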


Exercise 23.2. Near the Kerr-Newman singularity. This exercise reveals that among ideal black holes, 
the Schwarzschild geometry is exceptional, not typical, in having a gravitationally attractive singularity. 
Explore the behaviour of trajectories of test particles in the vicinity of the Kerr-Newman singularity, where
ρ → 0 (that is, where r = 0 and ay = 0). Under what conditions does a test particle reach the singularity?

1. Argue that for a particle to reach the singularity at y = 0, positivity of P_y² requires that

Q \ge 0 ,   (23.37)

where Q is the Carter integral defined by equation (23.31).

2. Argue that for a particle to reach the singularity at r = 0, positivity of P_x² requires that

Q_\bullet^2 ( L - a E )^2 + ( Q_\bullet^2 + a^2 ) Q \le 0 .   (23.38)

3. Schwarzschild case: show that if Q_• = 0 and a = 0, then a particle reaches the singularity provided that
the mass of the black hole is positive, M_• > 0.

4. Reissner-Nordström case: show that if Q_•² > 0 and a = 0, then a particle can reach the singularity only
if it has zero angular momentum, Q = L = 0, and if the particle’s charge exceeds its mass,

|q| > |m| .   (23.39)

In particular, a neutral particle reaches the singularity only if it has zero angular momentum and is
massless.

5. Kerr case: show that if Q_• = 0 but a² > 0, then a particle can reach the singularity only if it is moving
in the equatorial plane (y = 0 and Q = 0), and provided that the mass of the black hole is positive,
M_• > 0. [Hint: Show that if the particle is not already in the equatorial plane at y = 0, then the equation
of motion for dy/dx shows that the particle never reaches y = 0.]

6. Kerr-Newman case: show that if Q_•² > 0 and a² > 0, then a particle can reach the singularity only if
L = aE and it is moving in the equatorial plane, and if the particle’s charge-to-mass is large enough,

|q| > |m| \frac{\sqrt{Q_\bullet^2 + a^2}}{|Q_\bullet|} ,   (23.40)

which generalizes the Reissner-Nordström condition (23.39).
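Solution. Equation (23.37) comes from

K = \frac{P_y^2 + P_\phi^2}{\Delta_y} + m^2 \rho_y^2 \ge \frac{P_\phi^2}{\Delta_y} ,   (23.41)

and taking the limit y → 0. Equation (23.38) comes from

R^4 \left\{ - P_t^2 + \left[ Q + ( L - a E )^2 + m^2 r^2 \right] \Delta_x \right\} = - R^4 P_x^2 \le 0 ,   (23.42)

and taking the limit r → 0.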


Exercise 23.3. When must t and ¢ progress forwards on a geodesic? Under what circumstances 
must the time coordinate t or azimuthal angle ¢ progress forwards along a geodesic? 
1. Show that, in regions where Ay > 0, 
P2 
SRS- (23.43) 


2 
Fo 
D FA 


Hence show that in the Universe, Wormhole, and Antiverse regions outside the horizons, where P, < 0, 


dt (1 = wrwy)? VK Shaky u 


= 23.44 
N (IS We) Be ey Coe) 
and 
top FC JK&,A 
a (1 = wet) VK KAsÂy ott (23.45) 


dX ~ pt JRA, VAr tun JA, Vae +n JA,” 


Conclude that, in the P_t < 0 regions outside the outer and inner horizons, the time coordinate t must
progress forwards if g_φφ > 0, which is true outside the sisytube, while the azimuthal angle φ must
progress forwards if g_tt > 0, which is true between the outer and inner ergospheres.

2. Argue that in the Parallel Universe, Parallel Wormhole, and Parallel Antiverse regions outside the
horizons, where P_t > 0, the inequalities (23.44) and (23.45) hold with the left hand sides replaced by
d(−t)/dλ and d(−φ)/dλ. Hence conclude that the time coordinate t must progress backwards outside
the sisytube, while the azimuthal coordinate φ must progress backwards between the ergospheres.


Exercise 23.4. Inside the sisytube. The sisytube, §9.10, is the region where g_φφ < 0.

1. Consider a massive particle moving along a circular path at constant radius and latitude (dx = dy = 0),
with tetrad-frame 4-velocity u^k = γ{1, 0, 0, v}, where γ is a Lorentz γ-factor and v the corresponding
3-velocity. A closed timelike curve (CTC) occurs when the time coordinate t is constant along the curve,
dt/dτ = 0. What is the critical velocity v_c for a closed timelike curve? Is the closed timelike curve
prograde or retrograde? What is the condition on the velocity v of the particle for it to go backwards
in time t?


2. Can the circular path be a geodesic? 


Solution.
1. The coordinate 4-velocity u^μ in terms of the tetrad-frame 4-velocity u^k = γ{1, 0, 0, v} is

u^\mu \equiv \frac{dx^\mu}{d\tau} = e_k{}^\mu u^k = \frac{\gamma}{\rho} \left\{ \frac{1}{\sqrt{\Delta_x}} + \frac{v \omega_y}{\sqrt{\Delta_y}} , \ 0 , \ 0 , \ \frac{\omega_x}{\sqrt{\Delta_x}} + \frac{v}{\sqrt{\Delta_y}} \right\} .   (23.46)

The particle proceeds forwards or backwards in time t according to the sign of u^t. The particle follows
a closed timelike curve if u^t = 0, which happens when its tetrad-frame velocity v takes the critical value

v_c = - \frac{\sqrt{\Delta_y}}{\omega_y \sqrt{\Delta_x}} .   (23.47)

The sisytube condition g_φφ < 0 ensures that |v_c| < 1. The critical velocity equals the speed of light,
|v_c| = 1, at the boundary g_φφ = 0 of the sisytube. The critical velocity v_c, equation (23.47), is negative
(retrograde) if ω_y is positive, and positive (prograde) if ω_y is negative. In Λ-Kerr-Newman ω_y is always
positive, so closed timelike curves in the sisytube are retrograde. The situation with a finite NUT
parameter N_• has been commented on in §22.7.2. At the critical velocity (23.47), the particle’s azimuthal
coordinate velocity u^φ is

u^\phi = - \frac{\gamma_c ( 1 - \omega_x \omega_y )}{\rho \, \omega_y \sqrt{\Delta_x}} = \frac{\gamma_c v_c ( 1 - \omega_x \omega_y )}{\rho \sqrt{\Delta_y}} ,   (23.48)

whose sign is the same as that of the critical velocity v_c. The particle goes backwards in time if the
absolute value of its velocity exceeds the critical value (23.47),

|v| > |v_c| .   (23.49)

2. No.
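The relation between the critical velocity and the sisytube condition can be illustrated numerically. The sketch below assumes v_c = −√Δ_y / (ω_y √Δ_x), as in equation (23.47), and assumes that g_φφ has the same sign as Δ_y − ω_y²Δ_x, which is what a line-element of the separable form (22.1) gives. The function names and the sample values are illustrative only.

```python
import math

def critical_velocity(Delta_x, Delta_y, omega_y):
    """Tetrad-frame velocity at which u^t = 0, assumed from equation (23.47)."""
    return -math.sqrt(Delta_y) / (omega_y * math.sqrt(Delta_x))

def g_phiphi_sign_term(Delta_x, Delta_y, omega_y):
    """Delta_y - omega_y^2 Delta_x, assumed to carry the sign of g_phiphi."""
    return Delta_y - omega_y**2 * Delta_x

# Sample point with large negative omega_y (inside a sisytube, outside the horizon).
inside = dict(Delta_x=0.25, Delta_y=0.04, omega_y=-3.0)
# Sample point with small positive omega_y (outside any sisytube).
outside = dict(Delta_x=0.25, Delta_y=0.04, omega_y=0.1)

for label, point in (("inside", inside), ("outside", outside)):
    v_c = critical_velocity(**point)
    sign_term = g_phiphi_sign_term(**point)
    print(f"{label}: v_c = {v_c:+.4f}, Delta_y - omega_y^2 Delta_x = {sign_term:+.4f}")
```

In the first case |v_c| < 1, so a massive particle can reach the critical velocity, and v_c is positive (prograde) because ω_y is negative. In the second case |v_c| > 1, so no timelike circular path has dt/dτ = 0.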


Exercise 23.5. Gödel’s Universe. Gödel’s Universe has a separable line-element of the form (22.1) with
ρ = 1, ω_x = 0, and Δ_x = 1, thus

ds^2 = - ( dt - \omega_y \, d\phi )^2 + dx^2 + \frac{dy^2}{\Delta_y} + \Delta_y \, d\phi^2 .   (23.50)

Show that the tetrad-frame energy-momentum tensor is diagonal provided that ω_y is linear in y. Show that
the energy-momentum is constant everywhere provided that Δ_y is quadratic in y. Show that the
energy-momentum takes perfect fluid form with an ultrahard equation of state, T_mn = p{1, 1, 1, 1}, if

\omega_y = 2 \sqrt{p} \, y , \quad \Delta_y = 2 y ( 1 + p y ) ,   (23.51)

in which the constant (0) and linear (2y) terms in Δ_y are chosen so that for y ≪ 1/p the angular part of the
metric looks like the Minkowski metric in cylindrical coordinates, with y ≈ ½r²,

ds^2 \approx - dt^2 + dx^2 + dr^2 + r^2 d\phi^2 \quad \mbox{for } y \ll 1/p .   (23.52)

Show that there is a sisytube (g < 0 and gẹ < 0) for y > 1/p. Is Gédel’s Universe self-consistent in the 
sense that the rest frame of the fluid is everywhere geodesic? Explore Gédel’s Universe. 

Solution. Yes, the solution is self-consistent. The rest frame of the fluid is the same as the rest frame of
the tetrad, since the energy-momentum is diagonal in the tetrad rest frame. The split between ρ_x and ρ_y in
ρ_x² + ρ_y² = ρ² = 1 can be taken to be ρ_x = 1 and ρ_y = 0. Rest geodesics satisfy π_t = −m, π_φ = mω_y, and
K = 0, yielding P_x = P_y = P_φ = 0 and P_t = −m, whence p^k = {m, 0, 0, 0}.
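The location of the sisytube boundary in Gödel’s Universe can be checked with a few lines of arithmetic. The sketch below reads equation (23.51) as ω_y = 2√p y and Δ_y = 2y(1 + py); this reading is an assumption, chosen because it reproduces the stated boundary at y = 1/p. With ρ = 1, ω_x = 0, and Δ_x = 1, the line-element (23.50) gives g_φφ = Δ_y − ω_y². The function name is illustrative.

```python
import math

def godel_g_phiphi(y, p):
    """g_phiphi = Delta_y - omega_y^2 for the line-element (23.50).

    Assumes omega_y = 2 sqrt(p) y and Delta_y = 2 y (1 + p y), so that
    g_phiphi simplifies to 2 y (1 - p y).
    """
    omega_y = 2.0 * math.sqrt(p) * y
    Delta_y = 2.0 * y * (1.0 + p * y)
    return Delta_y - omega_y**2

p = 1.0  # illustrative value of the pressure
for y in (0.5, 1.0, 2.0):
    print(f"y = {y}: g_phiphi = {godel_g_phiphi(y, p):+.3f}")
```

With these assumptions g_φφ changes sign at y = 1/p, being positive at smaller y and negative (inside the sisytube) at larger y.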


23.8 Penrose process 


As first pointed out by Penrose, trajectories in the Kerr-Newman geometry can have negative energy E 
outside the horizon. In Newtonian gravity, gravitational energy is negative. If the gravitational binding 
energy of a particle more than cancels the kinetic energy of the particle, then the particle is in a bound orbit. 
In general relativity, the binding energy of a particle can be so great that in effect it cancels not only the 
kinetic energy, but also the rest mass energy of the particle. Such particles have negative energy. 

It is possible to reduce the mass M_• of the black hole by dropping negative energy particles into the black
hole. This process of extracting mass-energy from the black hole is called the Penrose process.


Exercise 23.6. Negative energy trajectories outside the horizon. Under what conditions can a test
particle have negative energy, E < 0, outside the outer horizon of a Kerr-Newman black hole?
1. Argue that the negativity of P_t outside the outer horizon implies that aL + qQ_•r must be negative for
the energy E to be negative. Show that, more stringently, negative E requires that

a L + q Q_\bullet r < - R^2 \sqrt{ \left( \frac{P_y^2 + P_\phi^2}{\Delta_y} + m^2 \rho^2 \right) \Delta_x } .   (23.53)

2. Argue that for an uncharged particle, q = 0, negative energy trajectories exist only inside the ergosphere.
3. Do negative energy trajectories exist outside the ergosphere for a charged particle?
4. For the Penrose process to work, the negative energy particle must fall through the outer horizon, where
Δ_x = 0. Can this happen? Must it happen?
Solution. See the end of §23.17.
23.9 Constant latitude trajectories in the Kerr-Newman geometry


For simplicity, the next several sections, up to and including §23.20, are restricted to Kerr-Newman black
holes with zero magnetic charge, 𝒬_• = 0, and zero cosmological constant, Λ = 0.
A trajectory is at constant latitude if it is at constant polar angle θ, or equivalently at constant y = − cos θ,

y = constant .   (23.54)

Constant latitude orbits occur where the angular potential P_y², equation (23.10b), not only vanishes, but is
an extremum,

P_y^2 = \frac{dP_y^2}{dy} = 0 ,   (23.55)

the derivative being taken with the constants of motion E, L, and K of the orbit being held fixed. The
condition P_y² = 0 sets the value of the Carter constant K. Solving dP_y²/dy = 0 yields the condition between
energy E and angular momentum L

E = \pm \sqrt{ m^2 + \frac{L^2}{a^2 \sin^4\theta} } .   (23.56)

Solutions at any polar angle θ and any angular momentum L exist, ranging from E = ±m at L = 0, to
E = L/(a sin²θ) at L → ±∞. The solutions with L = 0 are those of the freely-falling observers that
define the Doran coordinate system, §9.18. The solutions with L → ∞ define the principal null congruences
discussed in §23.6.
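The constant latitude condition can be verified numerically for an uncharged particle in the Kerr geometry. The sketch below assumes Δ_y = 1 − y², ω_y = a(1 − y²), ρ_y² = a²y², and P_φ = L − aE(1 − y²), together with the reading E² = m² + L²/(a² sin⁴θ) of equation (23.56). It fixes K from P_y² = 0 and then checks by finite differences that dP_y²/dy also vanishes. The function names are illustrative.

```python
import math

def P_y_squared(y, E, L, K, m, a):
    """Angular potential P_y^2 of equation (23.10b) for an uncharged particle in Kerr."""
    s = 1.0 - y**2                 # Delta_y = sin^2(theta)
    P_phi = L - a * E * s          # pi_phi + pi_t * omega_y with pi_t = -E
    return -P_phi**2 + (K - m**2 * a**2 * y**2) * s

def constant_latitude_constants(theta, L, m, a):
    """Energy E (positive root, assumed from eq. 23.56) and Carter constant K from P_y = 0."""
    s = math.sin(theta)**2
    y = -math.cos(theta)
    E = math.sqrt(m**2 + L**2 / (a**2 * s**2))
    K = (L - a * E * s)**2 / s + m**2 * a**2 * y**2
    return E, K

theta, L, m, a = math.pi / 3.0, 2.0, 1.0, 0.8
E, K = constant_latitude_constants(theta, L, m, a)
y0 = -math.cos(theta)
h = 1.0e-6
value = P_y_squared(y0, E, L, K, m, a)
slope = (P_y_squared(y0 + h, E, L, K, m, a) - P_y_squared(y0 - h, E, L, K, m, a)) / (2.0 * h)
print(f"E = {E:.6f}, K = {K:.6f}, P_y^2 = {value:.2e}, dP_y^2/dy = {slope:.2e}")
```

Both the potential and its slope vanish to rounding error at the chosen latitude, as equation (23.55) requires.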


23.10 Circular orbits in the Kerr-Newman geometry 


For simplicity, this section 23.10 is restricted to Kerr-Newman black holes with zero magnetic charge and
cosmological constant,

\mathcal{Q}_\bullet = \Lambda = 0 .   (23.57)

For brevity, the black hole subscripts will be dropped from the black hole mass and electric charge M_• and
Q_•,

M = M_\bullet , \quad Q = Q_\bullet .   (23.58)


23.10.1 Condition for a circular orbit 
An orbit can be termed circular if it is at constant radius r, 
r = constant . (23.59) 


It is convenient to call such an orbit circular even if the orbit is at finite inclination (not confined to the equatorial plane) about a rotating black hole, and therefore follows the surface of a spheroid (in Boyer-Lindquist coordinates).

Orbits turn around in $r$, reaching periapsis or apoapsis, where the radial potential $P_r^2$, equation (23.10a), vanishes. Circular orbits occur where the radial potential $P_r^2$ not only vanishes, but is an extremum,

$$ P_r^2 = \frac{dP_r^2}{dr} = 0 , \tag{23.60} $$

the derivative being taken with the constants of motion $E$, $L$, and $\mathcal{Q}$ of the orbit being held fixed. Circular orbits may be either stable or unstable. The stability of a circular orbit is determined by the sign of the second derivative of the potential

$$ \frac{d^2 P_r^2}{dr^2} , \tag{23.61} $$

with $-$ for stable, $+$ for unstable circular orbits. Marginally stable orbits occur where $d^2P_r^2/dr^2 = 0$.

Circular orbits occur not only in the equatorial plane, but at general inclinations. The inclination of an orbit can be characterized by the maximum latitude $y_{\max}$, or equivalently the minimum polar angle $\theta_{\min}$, that the orbit reaches. An astronomer would call $\arcsin(y_{\max}) = \pi/2 - \theta_{\min}$ the inclination angle of the orbit. It is convenient to define an inclination parameter $\alpha$ by

$$ \alpha \equiv y_{\max}^2 = \cos^2\theta_{\min} , \tag{23.62} $$

which lies in the interval [0, 1]. Equatorial orbits, at $y = 0$, correspond to $\alpha = 0$, while polar orbits, those that go over the poles at $y = \pm 1$, correspond to $\alpha = 1$.
The maximum latitude $y_{\max}$ reached by an orbit occurs at the turnaround point $P_y = 0$. Inserting this condition into equation (23.10b) allows the Carter constant $K$, or equivalently the Carter integral $\mathcal{Q}$, equation (23.31), to be eliminated in favour of the inclination parameter $\alpha$, equation (23.62),

$$ \mathcal{Q} \equiv K - (L - aE)^2 = \alpha \left[ a^2 (m^2 - E^2) + \frac{L^2}{1 - \alpha} \right] . \tag{23.63} $$

Equation (23.63) is a quadratic equation in $\alpha$, so has two roots for $\alpha$ at fixed $E$, $L$, and $\mathcal{Q}$. The quadratic is $\mathcal{Q}(1 - \alpha) + \alpha(1 - \alpha)a^2(E^2 - m^2) - \alpha L^2$, which equals $\mathcal{Q}$ at $\alpha = 0$, and $-L^2$ at $\alpha = 1$. Therefore there is one root in $\alpha \in [0,1]$ if $\mathcal{Q} > 0$, and two roots if $\mathcal{Q} < 0$ (given that, for an orbit to exist, at least one root must lie in $\alpha \in [0,1]$),

$$ \begin{array}{ll} \mathcal{Q} > 0 & \text{1 root in } \alpha \in [0,1] , \\ \mathcal{Q} < 0 & \text{2 roots in } \alpha \in [0,1] . \end{array} \tag{23.64} $$

For one root in $\alpha \in [0,1]$, the orbit has only a maximum latitude; for two roots, the orbit has a minimum as well as a maximum latitude. All the equations in what follows hold true for $\alpha$ the inclination parameter at an extremum, whether maximum or minimum.
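The root counting in (23.64) can be checked numerically. The following is a minimal sketch, assuming the quadratic form of equation (23.63) quoted above; the function name `inclination_roots`, the argument name `Qc` for the Carter integral, and the sample values are illustrative choices, not taken from the text.

```python
import math

def inclination_roots(E, L, Qc, a, m=1.0):
    """Real roots alpha in [0, 1] of
    Qc (1 - alpha) + alpha (1 - alpha) a^2 (E^2 - m^2) - alpha L^2 = 0,
    the quadratic form of equation (23.63). Qc is the Carter integral."""
    c = a * a * (E * E - m * m)
    # Expanded in powers of alpha: -c alpha^2 + (c - Qc - L^2) alpha + Qc = 0
    A, B, C = -c, c - Qc - L * L, Qc
    if A == 0.0:
        roots = [] if B == 0.0 else [-C / B]
    else:
        disc = B * B - 4.0 * A * C
        if disc < 0.0:
            return []
        s = math.sqrt(disc)
        roots = [(-B + s) / (2.0 * A), (-B - s) / (2.0 * A)]
    return sorted(x for x in roots if 0.0 <= x <= 1.0)

# Bound orbit (|E/m| < 1) with positive Carter integral: only a maximum latitude
bound = inclination_roots(E=0.95, L=2.0, Qc=3.0, a=0.9)
# Unbound orbit with negative Carter integral: minimum and maximum latitude
unbound = inclination_roots(E=2.0, L=0.5, Qc=-0.1, a=1.0)
```

With these sample values the first call returns a single inclination parameter and the second returns two, in line with (23.64).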

The energy per unit mass of a particle at infinity must exceed its rest mass, $|E/m| > 1$ ($E$ is positive in the Universe, negative in the Parallel Universe). A particle with energy less than its rest mass, $|E/m| < 1$, cannot go to infinity, and is said to be bound. Equation (23.63) implies that the Carter integral $\mathcal{Q}$ is non-negative for bound orbits, $\mathcal{Q} \ge 0$ (with $\mathcal{Q} = 0$ for equatorial orbits, $\alpha = 0$). Therefore all bound orbits have only a maximum latitude; they all pass through the equator.

Exercise 23.7. Circular geodesics at constant latitude? Are there circular geodesics at constant latitude?

Solution. Inserting the constant latitude conditions on $K$ and $E$ from §23.9 into the quadratic equation (23.63) for the inclination parameter $\alpha$ shows that circular geodesics at constant latitude satisfy $\mathcal{Q} = 0$.
FIX are those in the equatorial plane, a = 0. 


23.11 General solution for circular orbits 


The general solution for circular orbits of a test particle of arbitrary electric charge q in the Kerr-Newman 
geometry is as follows. For vanishing electric charge, see §23.12. 

The rest mass m of the test particle can be set equal to unity, m = 1, without loss of generality. Circular 
orbits of particles with zero rest mass, m = 0, discussed in §23.13 below, occur in cases where the circular 
orbits for massive particles attain infinite energy and angular momentum. 

In the radial potential $P_r^2$, equation (23.10a), eliminate the Carter constant $K$ in favour of the inclination parameter $\alpha$ using equation (23.63). Furthermore, eliminate the energy $E = -\pi_t$ in favour of $P_t$, equation (23.5a). The radial derivatives $d^n P_r^2/dr^n$ must be taken before $E$ is replaced by $P_t$, since $E$ is a constant of motion, whereas $P_t$ varies with $r$. For Kerr-Newman, the expression (23.5a) for $P_t$ is

$$ P_t = -E + \frac{aL}{R^2} + \frac{qQr}{R^2} . \tag{23.65} $$

In accordance with Table 23.1, solutions with negative $P_t$ correspond to orbits in the Universe, Wormhole, or Antiverse parts of the Kerr-Newman geometry in the Penrose diagram of Figure 9.6, while solutions with positive $P_t$ correspond to orbits in their Parallel counterparts. If only the Universe region is considered, then $P_t$ is necessarily negative. By contrast, the energy $E$ can be either positive or negative in the same region of the Kerr-Newman geometry (the energy $E$ is negative for orbits of sufficiently large negative angular momentum $L$ inside the ergosphere of the Universe). Circular orbits cannot occur between the outer and inner horizons (why not?).

The condition $P_r^2 = 0$, equation (23.60), is a quadratic equation in the azimuthal angular momentum $L = \pi_\phi$, whose solutions are

$$ \frac{L}{\sqrt{1-\alpha}} = \frac{R^2}{r^2 + a^2\alpha} \left[ a\sqrt{1-\alpha} \left( -P_t + \frac{qQr}{R^2} \right) \pm \sqrt{ \frac{P_t^2}{\Delta_r} - (r^2 + a^2\alpha) } \right] . \tag{23.66} $$

Numerically, it is better to characterize an orbit by $L/\sqrt{1-\alpha}$ rather than by $L$ itself, since the former remains finite as $\alpha \to 1$, whereas $L$ and $1 - \alpha$ both tend to zero as $\alpha \to 1$. Substituting the two ($\pm$) expressions (23.66) for $L$ into $dP_r^2/dr$, and setting the product of the resulting two expressions for $dP_r^2/dr$ equal to zero, equation (23.60), yields a quartic equation

$$ p_0 + p_1 P + p_2 P^2 + p_3 P^3 + p_4 P^4 = 0 , \tag{23.67} $$




for the dimensionless quantity $P$ (not to be confused with $P_t$ or $P_\phi$) defined by

$$ P \equiv - \frac{P_t}{R^2 \Delta_r} . \tag{23.68} $$


The minus sign is introduced so as to make $P$ positive in the region of usual interest, which is the Universe region of the Kerr-Newman geometry, where $P_t$ is negative (see Table 23.1). The sign of $P$ is always opposite to that of $P_t$, since circular orbits exist only where $\Delta_r > 0$, outside horizons. The coefficients $p_i$ of the quartic (23.67) are

\begin{align}
p_0 &= r^2 (r^2 + a^2\alpha)^2 , \tag{23.69a} \\
p_1 &= -2 qQ r (r^2 - a^2\alpha)(r^2 + a^2\alpha) , \tag{23.69b} \\
p_2 &= -2 r^2 (r^2 + a^2\alpha)(r^2 - 3Mr + 2Q^2 + a^2\alpha + a^2\alpha M/r) + q^2 Q^2 (r^2 - a^2\alpha)^2 , \tag{23.69c} \\
p_3 &= 2 qQ r (r^2 - a^2\alpha)(r^2 - 3Mr + 2Q^2 + 2a^2 - a^2\alpha + a^2\alpha M/r) , \tag{23.69d} \\
p_4 &= r^6 - 6Mr^5 + (9M^2 + 4Q^2 + 2a^2\alpha) r^4 - 4M(3Q^2 + a^2) r^3 \nonumber \\
&\quad + (4Q^4 - 6a^2\alpha M^2 + 4a^2 Q^2 + a^4\alpha^2) r^2 + 2a^2\alpha (2Q^2 + 2a^2 - a^2\alpha) M r + a^4\alpha^2 M^2 . \tag{23.69e}
\end{align}


The quartic (23.67) is the condition for an orbit at radius $r$ to be circular. Physical solutions $P$ must be real. Barring degenerate cases, the quartic (23.67) has either zero, two, or four real solutions at any one radius $r$. Numerically, it is better to solve the quartic (23.67) for the reciprocal $1/P$ rather than $P$, since the vanishing of $1/P$ defines the location of circular orbits of massless particles, §23.13. Roots of the quartic (23.67) as a function of radius are illustrated in Figure 23.3 for a charged particle about a Kerr-Newman black hole, with illustrative values of black hole and particle parameters.
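As a consistency check on the coefficients (23.69), the sketch below evaluates the quartic (23.67) and confirms that it vanishes on two known families of roots: the neutral-particle roots $1/P^2 = F_\pm/(r^2 + a^2\alpha)$ of §23.12, equations (23.78) and (23.79), and the zero-spin case, where the quartic is the square of the quadratic $r^2 - qQ\,r\,P - (r^2 - 3Mr + 2Q^2)P^2$. The function names and the sample parameter values are illustrative assumptions; the product $qQ$ is passed as a single argument and the test particle has $m = 1$.

```python
import math

def quartic_coeffs(r, M, Q, a, alpha, qQ=0.0):
    """Coefficients (p0, ..., p4) of the quartic (23.67), from equations (23.69)."""
    t = a * a * alpha                      # a^2 alpha
    rho = r * r
    base = rho - 3.0 * M * r + 2.0 * Q * Q
    p0 = rho * (rho + t) ** 2
    p1 = -2.0 * qQ * r * (rho - t) * (rho + t)
    p2 = (-2.0 * rho * (rho + t) * (base + t + t * M / r)
          + qQ ** 2 * (rho - t) ** 2)
    p3 = 2.0 * qQ * r * (rho - t) * (base + 2.0 * a * a - t + t * M / r)
    p4 = (r**6 - 6.0 * M * r**5 + (9.0 * M * M + 4.0 * Q * Q + 2.0 * t) * r**4
          - 4.0 * M * (3.0 * Q * Q + a * a) * r**3
          + (4.0 * Q**4 - 6.0 * t * M * M + 4.0 * a * a * Q * Q + t * t) * r**2
          + 2.0 * t * (2.0 * Q * Q + 2.0 * a * a - t) * M * r + t * t * M * M)
    return (p0, p1, p2, p3, p4)

def quartic_residual(P, coeffs):
    """Left-hand side of (23.67) divided by p0; close to zero at a root."""
    return sum(p * P ** n for n, p in enumerate(coeffs)) / coeffs[0]

def neutral_root(r, M, Q, a, alpha, sign):
    """Positive root P for q = 0, from 1/P^2 = F_pm / (r^2 + a^2 alpha)."""
    t = a * a * alpha
    F = (r * r - 3.0 * M * r + 2.0 * Q * Q + t * (1.0 + M / r)
         + sign * 2.0 * a * math.sqrt((1.0 - alpha) * (M * r - Q * Q - t * M / r)))
    return math.sqrt((r * r + t) / F)

def zero_spin_root(r, M, Q, qQ):
    """Positive root of r^2 - qQ r P - (r^2 - 3Mr + 2Q^2) P^2 = 0 (a = 0)."""
    B = r * r - 3.0 * M * r + 2.0 * Q * Q
    return (-qQ * r + math.sqrt((qQ * r) ** 2 + 4.0 * B * r * r)) / (2.0 * B)
```

For example, `quartic_residual(neutral_root(6.0, 1.0, 0.5, 0.5, 0.5, +1), quartic_coeffs(6.0, 1.0, 0.5, 0.5, 0.5))` should vanish to rounding error.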

The azimuthal angular momentum $L/\sqrt{1-\alpha}$, energy $E$, and stability $d^2P_r^2/dr^2$ of a circular orbit are, in terms of a solution $P$ of the quartic (23.67),

\begin{align}
\frac{L}{\sqrt{1-\alpha}} &= \frac{1}{2a\sqrt{1-\alpha}} \left[ R^2 P^{-1} - qQ (r^2 - a^2)/r - (R^2 - 3Mr + 2Q^2 + a^2 M/r) P \right] \nonumber \\
&= \pm \frac{1}{r^2 + a^2\alpha} \sqrt{ l_{-1} P^{-1} + l_0 + l_1 P + l_2 P^2 } , \tag{23.70a} \\
E &= \tfrac{1}{2} \left[ P^{-1} + qQ/r + (1 - M/r) P \right] \nonumber \\
&= \pm \frac{1}{r^2 + a^2\alpha} \sqrt{ e_{-1} P^{-1} + e_0 + e_1 P + e_2 P^2 } , \tag{23.70b} \\
\frac{d^2 P_r^2}{dr^2} &= \frac{2}{(r^2 + a^2\alpha)^2} \left( q_{-1} P^{-1} + q_0 + q_1 P + q_2 P^2 \right) , \tag{23.70c}
\end{align}

Figure 23.3 Values of $1/P$, equation (23.68), angular momentum $L$, and energy $E$, for circular orbits at radius $r$ of a charged particle about a Kerr-Newman black hole. The parameters are illustrative: the black hole has spin parameter $a/M = 0.5$ and charge $Q/M = 0.5$, and the particle has charge-to-mass $q/m = 2.4$ (so $qQ/(mM) = 1.2$) on an orbit of inclination parameter $\alpha = 0.5$. The values $1/P$ are real roots of the quartic (23.67); generically there are either zero, two, or four real roots at any one radius. Solid (green) lines indicate stable orbits; dashed (brown) lines indicate unstable orbits. Positive $1/P$ orbits occur in Universe, Wormhole, and Antiverse regions; negative $1/P$ orbits occur in their Parallel counterparts; zero $1/P$ orbits are null. The fact that the particle is charged breaks the symmetry between positive and negative $1/P$. If the charge of the particle were flipped, $q/m = -2.4$, then the diagrams would be reflected about the horizontal axes (the signs of $1/P$, $E$, and $L$ would flip). Orbits are marked p for prograde, r for retrograde. In the Universe ($r > r_+$), a positive charge $q$ is repelled by the positive charge $Q$ of the black hole; with $qQ > mM$, as here, the electrical repulsion exceeds the gravitational attraction, and there are no circular orbits at large $r$. Conversely, a negative charge $q$ is attracted by the positive charge $Q$ of the black hole, and there are circular orbits at large $r$. In the Antiverse ($r < 0$), the situation is symmetrically equivalent to one in which the radius is positive and the mass and charge are flipped, transformation (23.77); the positive charge $q$ effectively sees a black hole with negative mass $-M$ and negative charge $-Q$, and is therefore attracted by the charged black hole. Thus in the Antiverse, there are circular orbits at large negative $r$ for $qQ > mM$, as here, but not for $qQ < mM$.
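where the coefficients $l_i$, $e_i$, and $q_i$ are

\begin{align}
l_{-1} &= qQ r R^2 (r^2 + a^2\alpha) , \tag{23.71a} \\
l_0 &= -R^2 (r^2 + a^2\alpha)(2Mr - Q^2) - q^2 Q^2 (r^4 - a^4\alpha) , \tag{23.71b} \\
l_1 &= -\frac{qQ}{r} \left[ 2r^6 - 5Mr^5 + 3(Q^2 + a^2) r^4 - a^2(1+\alpha) M r^3 + a^2 (Q^2 + \alpha Q^2 + a^2 - a^2\alpha) r^2 \right. \nonumber \\
&\qquad\quad \left. + 3a^4\alpha M r - a^4\alpha (Q^2 + a^2) \right] , \tag{23.71c} \\
l_2 &= \left[ 3Mr^3 - 2Q^2 r^2 + a^2(1+\alpha) M r - a^2(1+\alpha) Q^2 - a^4\alpha M/r \right] R^4 \Delta_r , \tag{23.71d}
\end{align}

\begin{align}
e_{-1} &= qQ r (r^2 + a^2\alpha) , \tag{23.72a} \\
e_0 &= (r^2 + a^2\alpha)(r^2 - 2Mr + Q^2 + a^2\alpha) + q^2 Q^2 a^2\alpha , \tag{23.72b} \\
e_1 &= \frac{qQ}{r} \left[ Mr^3 - (Q^2 + a^2 - 2a^2\alpha) r^2 - 3a^2\alpha M r + a^2\alpha (Q^2 + a^2) \right] , \tag{23.72c} \\
e_2 &= (Mr - Q^2 - a^2\alpha M/r) R^4 \Delta_r , \tag{23.72d}
\end{align}

and

\begin{align}
q_{-1} &= 2 qQ r (r^2 - a^2\alpha)(r^2 + a^2\alpha) , \tag{23.73a} \\
q_0 &= -4 (r^2 + a^2\alpha)(Mr^3 - Q^2 r^2 - a^2\alpha M r) - q^2 Q^2 (r^2 - a^2\alpha)^2 , \tag{23.73b} \\
q_1 &= -\frac{qQ}{r} \left[ r^6 - 4Mr^5 + 3(Q^2 + a^2 - 2a^2\alpha) r^4 + 12 a^2\alpha M r^3 - a^2\alpha (6Q^2 + 6a^2 - a^2\alpha) r^2 - a^4\alpha^2 (Q^2 + a^2) \right] , \tag{23.73c} \\
q_2 &= (3Mr^3 - 4Q^2 r^2 - 6a^2\alpha M r - a^4\alpha^2 M/r) R^4 \Delta_r . \tag{23.73d}
\end{align}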

Equations (23.70) determine the values of $L$, $E$, and $d^2P_r^2/dr^2$ uniquely for any given root $P$ of the quartic (23.67). The expressions on the second lines of equations (23.70a) for $L$ and (23.70b) for $E$ are equivalent to the expressions on the first lines, the sign of the second expressions being chosen to agree with those of the first expressions. For $L$, the first expression has the virtue of being unambiguous in sign, while the second expression has the virtue of remaining well-behaved in the limit $a \to 0$ or $1 - \alpha \to 0$. The two expressions (23.70a) for $L$ are moreover equivalent to the expression (23.66) with one of the two choices of sign in the latter.

For non-zero $a$, the reality of a solution $P$ of the quartic (23.67) is a necessary and sufficient condition for a corresponding circular orbit to exist. In particular, the argument of the square root in the expression (23.70a) for $L$ is guaranteed to be positive. For zero $a$, however, the quartic (23.67), which reduces in this case to the square of a quadratic, §23.20, admits real solutions that do not correspond to a circular orbit. For these invalid solutions, the argument of the square root in the second-line expression (23.70a) for $L$ is negative. Thus for zero $a$, a necessary and sufficient condition for a circular orbit to exist is that the solutions for both $P$ and $L$ be real.

Equation (23.70b) shows immediately that circular orbits of neutral ($q = 0$) particles necessarily have positive energy $E$ in the Universe region outside the horizon, where $P > 0$ and $r > M$. It is true, but not so obvious, that circular orbits of charged particles ($q \ne 0$) must also have positive energy $E$ in the Universe region outside the horizon. As discussed in §23.17, equation (23.100), the circular orbits with the smallest possible energy are equatorial orbits at the horizon of an extremal uncharged black hole.

Also of interest is the derivative $dP_y^2/d\alpha$ of the angular potential at turnaround, where $P_y = 0$. Orbits at constant latitude occur where $dP_y^2/d\alpha$ vanishes at turnaround. In terms of a solution $P$ of the quartic (23.67), the derivative $dP_y^2/d\alpha$ is

$$ \frac{dP_y^2}{d\alpha} = \frac{1}{r^2 + a^2\alpha} \left( k_{-1} P^{-1} + k_0 + k_1 P + k_2 P^2 \right) , \tag{23.74} $$

where the coefficients $k_i$ are

\begin{align}
k_{-1} &= -qQ r (r^2 + a^2\alpha) , \tag{23.75a} \\
k_0 &= (r^2 + a^2\alpha)(2Mr - Q^2) + q^2 Q^2 (r^2 - a^2\alpha) , \tag{23.75b} \\
k_1 &= \frac{qQ}{r} \left[ 2r^4 - 5Mr^3 + (3Q^2 + 3a^2 - 2a^2\alpha) r^2 + a^2\alpha (3Mr - Q^2 - a^2) \right] , \tag{23.75c} \\
k_2 &= -(3Mr - 2Q^2 - a^2\alpha M/r) R^4 \Delta_r . \tag{23.75d}
\end{align}

23.11.1 Discrete symmetries of the orbital structure 


The orbital structure in the Kerr-Newman geometry has two discrete symmetry transformations, parallel and radial flips. The parallel flip, which arises from time reversal symmetry $t \to -t$ of the Kerr-Newman geometry, exchanges Universes, Wormholes, and Antiverses with their Parallel counterparts,

$$ P \to -P , \quad q \to -q , \quad L \to -L , \quad E \to -E . \tag{23.76} $$

The radial flip exchanges Universes and Antiverses,

$$ r \to -r , \quad M \to -M , \quad Q \to -Q . \tag{23.77} $$


23.11.2 Prograde and retrograde orbits 


At zero spin, $a = 0$, the quartic (23.67) reduces to the square of a quadratic (this is the Reissner-Nordström case considered in §23.20). Each real root $P$ in this case is doubly degenerate. The two roots have opposite signs of the angular momentum $L/\sqrt{1-\alpha}$. As the spin $a$ is increased away from zero, the two roots for $P$ are rotationally split. The root with the more positive angular momentum $L$ (the direction of the axis of the black hole being taken so that $a$ is positive) is called prograde, while the root with the more negative angular momentum is called retrograde (this is in the Universe, Wormhole, and Antiverse parts of the geometry, where $P$ is positive; in their Parallel counterparts, the prograde orbit has more negative $L$, consistent with the symmetry transformation (23.76); in all, the prograde orbit is the one with the more positive $P a L/\sqrt{1-\alpha}$).

Every transition between prograde and retrograde occurs at a double root $P$ of the quartic; but not every double root has such a transition. For example, in the charged particle case illustrated in Figure 23.3, in the Universe part of the geometry ($P > 0$, $r > r_+$), there are two prograde orbits at the same radius at and just inside the prograde null circular orbit ($1/P \to +0$); and similarly there are two retrograde orbits at the same radius at and just inside the retrograde null circular orbit.


23.12 Circular geodesics (orbits for particles with zero electric charge) 


Geodesics are trajectories for freely-falling neutral particles, whose motion is influenced only by gravity. For a particle with zero electric charge, $q = 0$, the odd coefficients $p_i$ in the quartic condition (23.67) for a circular orbit vanish, and the quartic reduces to a quadratic in $P^2$. Solving the quadratic yields two possible solutions

$$ 1/P^2 = \frac{F_\pm}{r^2 + a^2\alpha} , \tag{23.78} $$

where $F_\pm$ are

$$ F_\pm \equiv r^2 - 3Mr + 2Q^2 + a^2\alpha (1 + M/r) \pm 2a \sqrt{ (1 - \alpha)(Mr - Q^2 - a^2\alpha M/r) } , \tag{23.79} $$

with $+$ and $-$ defining respectively prograde and retrograde orbits. By flipping the direction of the rotation axis, the spin parameter $a$ can always be chosen to be positive, $a \ge 0$. For non-zero spin $a \ne 0$, the necessary and sufficient condition for the existence of a circular orbit is that $P$ be real, which requires that $F_\pm$ be real and positive, that is,

$$ Mr - Q^2 - a^2\alpha M/r \ge 0 \quad \text{and} \quad F_\pm > 0 . \tag{23.80} $$

The conditions (23.80) remain necessary and sufficient in the limit $a = 0$ of zero spin (where $P$ is real even without the first of the two conditions (23.80)). For zero electric charge $q$, the expressions (23.70) for the angular momentum $L$, energy $E$, and stability $d^2P_r^2/dr^2$ of a circular orbit, and the expression (23.74) for the angular derivative $dP_y^2/d\alpha$ of the angular potential, simplify to

\begin{align}
\frac{L}{\sqrt{1-\alpha}} &= \frac{1}{2a\sqrt{1-\alpha}} \left[ R^2 P^{-1} - (R^2 - 3Mr + 2Q^2 + a^2 M/r) P \right] \nonumber \\
&= \pm \frac{1}{r^2 + a^2\alpha} \sqrt{ l_0 + l_2 P^2 } , \tag{23.81a} \\
E &= \tfrac{1}{2} \left[ P^{-1} + (1 - M/r) P \right] \nonumber \\
&= \pm \frac{1}{r^2 + a^2\alpha} \sqrt{ e_0 + e_2 P^2 } , \tag{23.81b} \\
\frac{d^2 P_r^2}{dr^2} &= \frac{2}{(r^2 + a^2\alpha)^2} \left( q_0 + q_2 P^2 \right) , \tag{23.81c} \\
\frac{dP_y^2}{d\alpha} &= \frac{1}{r^2 + a^2\alpha} \left( k_0 + k_2 P^2 \right) . \tag{23.81d}
\end{align}

Figure 23.4 Location of stable (shaded green) and unstable (shaded amber) circular orbits in a Kerr black hole with spin (top) slightly sub-extremal ($a = 0.999M$), and (bottom) extremal ($a = M$). The plotted latitude of each circular orbit is its inclination, the maximum latitude reached by the orbit. Null (violet), marginally stable (green), and constant-latitude (grey; inside the Antiverse, at $r < 0$) circular orbits are marked. Regions where circular orbits exist are bounded by the two conditions (23.80) (brown and violet). Prograde orbits are drawn to the left of the vertical axis, retrograde orbits to the right. Outer and inner ergospheres (dashed, purple), outer and inner horizons (red), sisytubes (cyan), and singularities (black) are shown as in Figures 9.1 and 9.3.

Figure 23.5 As Figure 23.4, but for a Kerr black hole with spin (top) slightly super-extremal ($a = 1.001M$), and (bottom) super-extremal ($a = 1.25M$). Ergospheres, sisytubes, and singularities are shown as in Figure 9.4.


The coefficients $l_i$, $e_i$, and $q_i$ from equations (23.71), (23.72), and (23.73) reduce to

\begin{align}
l_0 &= -R^2 (r^2 + a^2\alpha)(2Mr - Q^2) , \tag{23.82a} \\
l_2 &= \left[ 3Mr^3 - 2Q^2 r^2 + a^2(1+\alpha) M r - a^2(1+\alpha) Q^2 - a^4\alpha M/r \right] R^4 \Delta_r , \tag{23.82b} \\
e_0 &= (r^2 + a^2\alpha)(r^2 - 2Mr + Q^2 + a^2\alpha) , \tag{23.83a} \\
e_2 &= (Mr - Q^2 - a^2\alpha M/r) R^4 \Delta_r , \tag{23.83b}
\end{align}

and

\begin{align}
q_0 &= -4 (r^2 + a^2\alpha)(Mr^3 - Q^2 r^2 - a^2\alpha M r) , \tag{23.84a} \\
q_2 &= (3Mr^3 - 4Q^2 r^2 - 6a^2\alpha M r - a^4\alpha^2 M/r) R^4 \Delta_r , \tag{23.84b}
\end{align}

while the coefficients $k_i$ from equations (23.75) reduce to

\begin{align}
k_0 &= (r^2 + a^2\alpha)(2Mr - Q^2) , \tag{23.85a} \\
k_2 &= -(3Mr - 2Q^2 - a^2\alpha M/r) R^4 \Delta_r . \tag{23.85b}
\end{align}
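The two forms of the energy and angular momentum in (23.81) can be compared numerically. The sketch below is a minimal illustration, not a definitive implementation: it assumes $R^2 = r^2 + a^2$ and $R^4 \Delta_r = r^2 - 2Mr + a^2 + Q^2$ (the definitions of $R$ and $\Delta_r$ lie outside this section, so these are assumptions), takes $m = 1$, and selects the Universe branch $P > 0$. The function name `circular_geodesic` is hypothetical.

```python
import math

def circular_geodesic(r, M, Q, a, alpha, sign):
    """Circular geodesic (q = 0) at radius r and inclination parameter alpha.

    sign = +1 for prograde, -1 for retrograde, equation (23.79).
    Returns (P, E_first, E_second, L_abs), where E_first and E_second are the
    first- and second-line forms of (23.81b) and L_abs is |L|/sqrt(1 - alpha)
    from the second line of (23.81a). Returns None if (23.80) fails.
    """
    t = a * a * alpha                       # a^2 alpha
    rho = r * r
    R2 = rho + a * a                        # assumed R^2 = r^2 + a^2
    arg = (1.0 - alpha) * (M * r - Q * Q - t * M / r)
    if arg < 0.0:
        return None
    F = rho - 3.0 * M * r + 2.0 * Q * Q + t * (1.0 + M / r) + sign * 2.0 * a * math.sqrt(arg)
    if F <= 0.0:
        return None
    P = math.sqrt((rho + t) / F)            # equation (23.78), P > 0 branch
    delta = rho - 2.0 * M * r + a * a + Q * Q   # assumed R^4 Delta_r
    E_first = 0.5 * (1.0 / P + (1.0 - M / r) * P)
    e0 = (rho + t) * (rho - 2.0 * M * r + Q * Q + t)
    e2 = (M * r - Q * Q - t * M / r) * delta
    E_second = math.sqrt(e0 + e2 * P * P) / (rho + t)
    l0 = -R2 * (rho + t) * (2.0 * M * r - Q * Q)
    l2 = (3.0 * M * r**3 - 2.0 * Q * Q * rho + a * a * (1.0 + alpha) * M * r
          - a * a * (1.0 + alpha) * Q * Q - a * a * t * M / r) * delta
    L_abs = math.sqrt(max(l0 + l2 * P * P, 0.0)) / (rho + t)
    return P, E_first, E_second, L_abs
```

In the Schwarzschild limit at $r = 6M$ this gives $E = \sqrt{8/9}$ and $|L| = \sqrt{12}\,M$, the familiar values at the innermost stable circular orbit.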


Figures 23.4 and 23.5 illustrate the location of stable and unstable circular orbits in the Kerr geometry 
(Q = 0) with sub-extremal and extremal spins (Figure 23.4), and super-extremal spin (Figure 23.5). The 
four spins shown, a/M = 0.999, 1, 1.001, and 1.25, are chosen to bring out how the orbital structure changes 
from sub- to super-extremal. 

The locations of circular orbits are bounded by the two conditions (23.80). The boundaries corresponding to the two conditions (23.80) are marked respectively by solid amber and violet lines in Figures 23.4 and 23.5. As discussed further in §23.13, the boundary of the second of the two conditions (23.80), $F_\pm = 0$, corresponds to null circular orbits.

All circular orbits at $r > 0$ have non-negative Carter integral, $\mathcal{Q} \ge 0$ (with $\mathcal{Q} = 0$ for equatorial orbits), and therefore pass through the equator according to condition (23.64). Conversely, all circular orbits at $r < 0$ have strictly negative Carter integral $\mathcal{Q} < 0$, and therefore do not pass through the equator: they have both a maximum and minimum latitude.


23.13 Null circular orbits 


Null circular orbits define the photon sphere, marked by solid violet lines in Figures 23.4 and 23.5. Circular orbits for massless particles, $m = 0$, or null circular orbits, follow from the solutions for massive particles in the case where the energy and angular momentum on the circular orbit become infinite, which occurs when $P_t \to \pm\infty$. Except at horizons, where $\Delta_r = 0$, this occurs when a solution $P = -P_t/(R^2\Delta_r)$ of the quartic (23.67) diverges, which happens when the ratio $p_4/p_0$ of the highest to lowest order coefficients vanishes. The ratio $p_4/p_0$, equations (23.69), factors as

$$ \frac{p_4}{p_0} = \frac{F_+ F_-}{(r^2 + a^2\alpha)^2} , \tag{23.86} $$

where $F_\pm$ are defined by equation (23.79). A null circular orbit thus occurs at a radius $r$ such that

$$ F_+ = 0 \quad \text{or} \quad F_- = 0 , \tag{23.87} $$

with $+$ for prograde ($aL > 0$) orbits, $-$ for retrograde ($aL < 0$) orbits. The locations of null circular orbits are independent of the charge $q$ of the particle, since $F_\pm$ are independent of $q$.

Figure 23.6 Radii of null circular orbits (generalization of the photon sphere) for a Kerr black hole with various spin parameters $a$, including super-extremal spin parameters, $|a/M| > 1$. Positive and negative $a/M$ signify prograde ($F_+ = 0$) and retrograde ($F_- = 0$) orbits respectively. Lines are labelled with values of the inclination parameter $\alpha$, varying from equatorial orbits ($\alpha = 0$) to polar orbits ($\alpha = 1$). Solid lines indicate unstable orbits; dashed lines indicate stable orbits; long dashed lines mark the transition between unstable and stable orbits. The radii $r_-$ and $r_+$ of the inner and outer horizons are shown for reference.
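As an illustration of condition (23.87), the sketch below locates the equatorial null circular orbits outside the horizon of a sub-extremal or extremal Kerr black hole, where $F_\pm$ of equation (23.79) with $Q = 0$ and $\alpha = 0$ becomes $r^2 - 3Mr \pm 2a\sqrt{Mr}$. The bracketing intervals $[M, 3M]$ (prograde) and $[3M, 4M]$ (retrograde) are an assumption valid for $0 \le a \le M$, and the helper names are illustrative.

```python
import math

def F_equatorial_kerr(r, M, a, sign):
    """F_pm of equation (23.79) with Q = 0 and alpha = 0."""
    return r * r - 3.0 * M * r + sign * 2.0 * a * math.sqrt(M * r)

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; endpoints that are roots are returned."""
    flo, fhi = f(lo), f(hi)
    if flo == 0.0:
        return lo
    if fhi == 0.0:
        return hi
    if flo * fhi > 0.0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if fmid == 0.0:
            return mid
        if flo * fmid < 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

def null_orbit_radius(M, a, sign):
    """Equatorial null circular orbit radius; sign = +1 prograde, -1 retrograde.

    Assumes 0 <= a <= M, so the root lies in [M, 3M] or [3M, 4M] respectively.
    """
    lo, hi = (M, 3.0 * M) if sign > 0 else (3.0 * M, 4.0 * M)
    return bisect(lambda r: F_equatorial_kerr(r, M, a, sign), lo, hi)
```

At zero spin both signs return $r = 3M$, and for $a = M$ the retrograde orbit sits at $r = 4M$, matching the limits quoted in §23.18.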


The condition (23.87) for a photon sphere is a quadratic equation for the inclination parameter $\alpha$, yielding

$$ \sqrt{1 - \alpha_p} = \frac{1}{a\,|r_p - M|} \left| \, r_p \sqrt{R_p^2 - 2Mr_p + Q^2} \pm \sqrt{2Mr_p^3 - (3M^2 + Q^2) r_p^2 + 2MQ^2 r_p + a^2 M^2} \, \right| . \tag{23.88} $$

The photon sphere radius $r_p$ ranges over values such that $\alpha_p \in [0,1]$. The azimuthal angular momentum $J_p \equiv L/E$ per unit energy on the photon sphere is, from equation (23.70a) in the limit $P_t \to \pm\infty$,

$$ J_p = \frac{R_p^2 - 3Mr_p + 2Q^2 + a^2 M/r_p}{a (M/r_p - 1)} . \tag{23.89} $$

The Carter constant $K_p \equiv K/E^2$ on the photon sphere is, equation (23.63),

$$ K_p = \frac{4 r_p^2 (R_p^2 - 2Mr_p + Q^2)}{(r_p - M)^2} . \tag{23.90} $$

In the limit of an extremal Kerr-Newman black hole, the angular momentum (23.89) and Carter constant (23.90) on the photon sphere reduce to

$$ J_p = \frac{-r_p^2 + 2Mr_p + a^2}{a} , \quad K_p = 4 r_p^2 . \tag{23.91} $$


In this case of an extremal Kerr-Newman black hole, there is an additional range of Carter constants Kp for 
the part of the photon sphere that is on the horizon, rp = M, 


J? < Kp < a, (23.92) 


See §23.17 for more on orbits at the horizon of an extremal black hole. 

Figure 23.6 illustrates the radii of null circular orbits for a Kerr (uncharged) black hole, for various spin and inclination parameters $a$ and $\alpha$, including super-extremal ($|a/M| > 1$) spins. At zero spin, $a = 0$, a Schwarzschild black hole, there is just a single null circular orbit, at $r = 3M$. For a spinning black hole with given positive $a/M$ (a negative $a/M$ can be made positive by flipping the direction of the north pole), there are, barring degenerate cases, 2, 4, or 6 distinct null circular orbits at each inclination. At any inclination there are always 2 null circular orbits at negative radius, one prograde and one retrograde (in Figure 23.6, prograde and retrograde orbits are plotted with $a/M$ respectively positive and negative). In the usual case of a sub-extremal ($|a/M| < 1$) Kerr black hole, there are generally 2 null circular orbits at positive radius, one prograde and one retrograde. If the black hole is sufficiently near extremal, then there are a further 2 null circular orbits at positive radius. If the black hole is sub-extremal ($|a/M| < 1$), then the additional 2 orbits exist at small inclinations, $\alpha < -3 + 2\sqrt{3} \approx 0.464$; the 2 orbits lie between $r = 0$ and the inner horizon $r = r_-$, and are both prograde. If the black hole is super-extremal ($|a/M| > 1$), then the additional 2 orbits exist at large inclinations, $\alpha > -3 + 2\sqrt{3} \approx 0.464$; one orbit is prograde, the other retrograde.


23.14 The silhouette of a black hole 


An isolated (non-accreting) black hole should appear as a black disk silhouetted against the starry back- 
ground. The edge of the black disk is defined by null circular orbits, the photon sphere, discussed in the 
previous section 23.13. Figure 23.1 illustrates the silhouette of a Kerr black hole for various spin parameters, 
as seen by a distant observer in the equatorial plane. 


23.15 Marginally stable circular orbits 


Figure 23.7 illustrates the radii of marginally stable orbits, those satisfying $d^2P_r^2/dr^2 = 0$, for a Kerr (uncharged) black hole for various spin and inclination parameters $a$ and $\alpha$. Marginally stable circular orbits are marked by solid green lines in Figures 23.4 and 23.5.

Figure 23.7 Radii of marginally stable circular orbits for a Kerr black hole with various spin parameters $a$, including super-extremal spin parameters, $|a/M| > 1$. As in Figure 23.6, positive and negative $a/M$ signify prograde and retrograde orbits respectively. Lines are labelled with values of the inclination parameter $\alpha$, varying from equatorial orbits ($\alpha = 0$) to polar orbits ($\alpha = 1$). Long dashed lines, which are the same as in Figure 23.6, mark where marginally stable orbits become null and terminate, examples of which are illustrated in Figures 23.4 and 23.5. The marginally stable equatorial circular orbit (thick black line) is commonly called the ISCO (innermost stable circular orbit) when the black hole is sub-extremal and the orbit is prograde, $0 < a/M < 1$. The radii $r_-$ and $r_+$ of the inner and outer horizons are shown for reference.


23.16 Circular orbits at constant latitude in the Antiverse 


In the Antiverse ($r < 0$), there are orbits that are not only circular but also at constant latitude, satisfying $dP_y^2/d\alpha = 0$. These orbits are marked by solid grey lines in Figures 23.4 and 23.5. None of these orbits lies inside the retrograde sisytube, so all of them progress forwards, not backwards, in Boyer-Lindquist time $t$.

As Figures 23.4 and 23.5 show, there are circular orbits that pass through the retrograde sisytube; but their back-and-forth motion in latitude takes them in and out of the sisytube. These orbits spend a part of their orbit going backwards, and a part going forwards, in Boyer-Lindquist time $t$.

In the more general situation of charged particles in spinning charged black holes, do there exist any constant-latitude circular orbits that go backwards in Boyer-Lindquist time $t$? I have not been able to find any. Do there exist circular orbits of any kind (not necessarily constant latitude) that go backwards in time $t$? I have not been able to find any. Nevertheless, if a particle is allowed to accelerate arbitrarily, there are trajectories inside the retrograde sisytube that go backwards in Boyer-Lindquist time $t$, and others on which Boyer-Lindquist time $t$ does not change. The latter trajectories, when the azimuthal coordinate $\phi$ has incremented by $-2\pi$, constitute Closed Timelike Curves.


23.17 Circular orbits at the horizon of an extremal black hole 


Away from horizons, the vanishing of $F_\pm$ defines the location of null circular orbits, §23.13. The case where $F_+$ vanishes at a horizon ($F_-$ never vanishes at a horizon) is special. This occurs when the black hole is extremal, $M^2 = Q^2 + a^2$. A circular orbit on the horizon, always prograde, is non-null: if $\Delta_r = 0$, as is true on the horizon, then the vanishing of $1/P = -R^2\Delta_r/P_t$ no longer implies that $P_t$ diverges.

A careful analysis shows that the limiting value of $P_t/\sqrt{\Delta_r}$ is finite for a circular orbit at the horizon of an extremal black hole, so in fact $P_t = 0$ for such an orbit. Specifically, let $\tilde{P}$ be the dimensionless quantity

$$ \tilde{P} \equiv - \frac{P_t}{M \sqrt{\Delta_r}} . \tag{23.93} $$

For circular orbits on the horizon of an extremal black hole, where $r = M$ and $M^2 = Q^2 + a^2$, the quartic condition (23.67) reduces to a quadratic

$$ \tilde{p}_2 + \tilde{p}_3 \tilde{P} + \tilde{p}_4 \tilde{P}^2 = 0 , \tag{23.94} $$

where the coefficients $\tilde{p}_i$ are

\begin{align}
\tilde{p}_2 &= 4a^2 (1 - \alpha)(M^2 + a^2\alpha) + (qQ/M)^2 (M^2 - a^2\alpha)^2 , \tag{23.95a} \\
\tilde{p}_3 &= -2 (qQ/M)(M^2 - a^2\alpha)(M^2 + a^2\alpha) , \tag{23.95b} \\
\tilde{p}_4 &= (M^2 + a^2)^2 - a^2 (1 - \alpha)(6M^2 + a^2 + a^2\alpha) . \tag{23.95c}
\end{align}

The azimuthal angular momentum $L$, energy $E$, and stability $d^2P_r^2/dr^2$ of circular orbits on the horizon are

\begin{align}
\frac{L}{\sqrt{1-\alpha}} &= \frac{1}{2a\sqrt{1-\alpha}} \left[ (a^2 - M^2)(qQ/M) + (M^2 + a^2) \tilde{P} \right] = \pm \frac{1}{M^2 + a^2\alpha} \sqrt{ \tilde{l}_0 + \tilde{l}_1 \tilde{P} + \tilde{l}_2 \tilde{P}^2 } , \tag{23.96a} \\
E &= \tfrac{1}{2} \left( \tilde{P} + qQ/M \right) , \tag{23.96b} \\
\frac{d^2 P_r^2}{dr^2} &= 0 , \tag{23.96c}
\end{align}

where the coefficients $\tilde{l}_i$ are

\begin{align}
\tilde{l}_0 &= -(M^2 + a^2)^2 (M^2 + a^2\alpha) - q^2 Q^2 (M^4 - a^4\alpha) , \tag{23.97a} \\
\tilde{l}_1 &= qQ M (M^2 + a^2)(M^2 + a^2\alpha) , \tag{23.97b} \\
\tilde{l}_2 &= M^2 (M^2 + a^2)^2 . \tag{23.97c}
\end{align}

Circular orbits on the horizon are always marginally stable, equation (23.96c). Any small perturbation to a marginally stable orbit starts it plunging into the unstable side of the orbit.
Circular orbits on the horizon occur only for small enough inclinations $\alpha$. For neutral particles, $qQ = 0$, the coefficient $\tilde{p}_3$ vanishes, and $\tilde{p}_2$ is positive, so the quadratic (23.94) has a real root $\tilde{P}$ only as long as $\tilde{p}_4$ is negative. This imposes the condition that

$$ \alpha < \frac{M (4a^2 - M^2)}{a^2 \left( 3M + 2\sqrt{2M^2 + a^2} \right)} . \tag{23.98} $$

For a Kerr (uncharged) black hole, where $a = M$, the inclination must be less than

$$ \alpha < -3 + 2\sqrt{3} \approx 0.464 , \tag{23.99} $$

as illustrated in the bottom panel of Figure 23.4.
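The bound on the inclination can be reproduced numerically from the sign of the coefficient (23.95c). The sketch below is illustrative only: it finds by bisection the inclination parameter at which that coefficient changes sign for a neutral particle, and compares it with the closed form obtained by solving the same quadratic in $\alpha$. The function names are assumptions.

```python
import math

def p4_tilde(alpha, M, a):
    """Coefficient of equation (23.95c)."""
    return (M * M + a * a) ** 2 - a * a * (1.0 - alpha) * (6.0 * M * M + a * a + a * a * alpha)

def alpha_max(M, a, tol=1e-14):
    """Largest inclination parameter with p4_tilde < 0, or None if there is none.

    p4_tilde increases monotonically with alpha and is positive at alpha = 1,
    so a single bisection on [0, 1] suffices.
    """
    if p4_tilde(0.0, M, a) >= 0.0:
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p4_tilde(mid, M, a) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def alpha_max_closed_form(M, a):
    """Root of p4_tilde = 0 written as M(4a^2 - M^2) / (a^2 (3M + 2 sqrt(2M^2 + a^2)))."""
    return M * (4.0 * a * a - M * M) / (a * a * (3.0 * M + 2.0 * math.sqrt(2.0 * M * M + a * a)))
```

For $a = M$ both functions give $-3 + 2\sqrt{3} \approx 0.464$; for $a \le M/2$ the coefficient is never negative, and `alpha_max` returns `None`.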

The orbital energy $E$ remains finite for a circular orbit at the horizon of an extremal black hole. An interesting case is the circular orbit in the equatorial plane at the horizon of an uncharged extremal ($Q = 0$, $a = M$) black hole, since this orbit has the smallest possible energy per unit mass among all circular orbits in the Universe region (i.e. outside or at the outer horizon) of a Kerr-Newman black hole,

$$ E = \frac{qQ}{3M} + \sqrt{ \frac{1}{3} + \left( \frac{qQ}{3M} \right)^2 } . \tag{23.100} $$

Won't $qQ$ vanish if $Q = 0$? In reality, not necessarily. Real astronomical black holes are almost neutral in part because of the enormous charge-to-mass ratio of a proton, $e/m_p \approx 10^{18}$ in Planck units. (Concept question: Why?) But the same large charge-to-mass ratio means that $qQ$ could be appreciable in spite of the smallness of the black hole charge $Q$. The smallest possible energy $E$ of a circular orbit occurs as $qQ$ diverges to $-\infty$,

$$ E \to 0 \quad \text{as} \quad qQ \to -\infty . \tag{23.101} $$

The smallest possible energy for a circular orbit for a neutral particle, $q = 0$, is

$$ E = \frac{1}{\sqrt{3}} . \tag{23.102} $$

Of course, there are trajectories with negative energy $E$ in the outer ergosphere, but these trajectories are not circular. The absence of circular orbits with negative energies outside or at the outer horizon implies that all trajectories with negative energy must fall inside the horizon.
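The limits (23.101) and (23.102) can be recovered by solving the horizon quadratic (23.94) directly. The sketch below is a hedged illustration: it specializes the coefficients (23.95) to $\alpha = 0$ and $a = M$ (with $Q$ negligible but $qQ$ finite), divides them by $M^4$, takes the positive root, and uses (23.96b) for the energy. The variable `x` stands for $qQ/M$, and the function name is hypothetical.

```python
import math

def horizon_orbit_energy(x):
    """Energy per unit mass of the equatorial circular orbit at the horizon of an
    extremal Kerr black hole, as a function of x = qQ/M.

    Coefficients (23.95) at alpha = 0, a = M, divided by M^4:
    p2 = 4 + x^2, p3 = -2x, p4 = -3.
    """
    p2, p3, p4 = 4.0 + x * x, -2.0 * x, -3.0
    disc = p3 * p3 - 4.0 * p4 * p2
    s = math.sqrt(disc)
    roots = [(-p3 + s) / (2.0 * p4), (-p3 - s) / (2.0 * p4)]
    P_tilde = max(roots)              # the positive root of (23.94)
    return 0.5 * (P_tilde + x)        # equation (23.96b)
```

For a neutral particle, `horizon_orbit_energy(0.0)` returns $1/\sqrt{3}$, and the energy decreases towards zero as `x` becomes large and negative.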


Concept question 23.8. Are principal null geodesics circular orbits? Outgoing principal null geodesics hold steady on the outer horizon, remaining at constant $r = r_+$ as time $t$ goes by. Are outgoing principal null geodesics therefore null circular orbits on the horizon? Answer. No. The resolution of the conundrum is that whereas no Boyer-Lindquist time $t$ passes on a geodesic at the horizon, proper time does pass. An orbit is circular if it is so for a massive particle; and a circular orbit is null in the limit of a relativistic massive particle. If a massive particle is put on the outer horizon on a relativistic geodesic, then the massive particle necessarily falls off the horizon into the black hole in a finite proper time: it is impossible for the geodesic to hold steady on the horizon. The exception to circular orbits on the horizon is that, as discussed in §23.17, an extremal black hole may have circular orbits at its horizon; but these orbits have $P_t = 0$, and are not null.

Figure 23.8 Values of the Hamilton-Jacobi parameter $P_t$ for circular orbits at radius $r$ in the equatorial plane of a near-extremal Kerr black hole, with black hole spin parameter $a = 0.999M$. The diagram illustrates that as the orbital radius $r$ approaches the horizon, $P_t$ first approaches zero, but then increases sharply to infinity, corresponding to null circular orbits. In the case of an exactly extremal black hole, $P_t$ goes to zero at the horizon, there is no increase of $P_t$ to infinity, and no null circular orbit. Solid (green) lines indicate stable orbits; dashed (brown) lines indicate unstable orbits.


23.18 Equatorial circular orbits in the Kerr geometry 


The case of greatest practical interest to astrophysicists is that of circular orbits in the equatorial plane of an uncharged black hole, the Kerr geometry.

For circular orbits in the equatorial plane, $\alpha = 0$, of an uncharged black hole, $Q = 0$, the solution (23.78) for $P$ simplifies to

$$ 1/P^2 = \frac{F_\pm}{r^2} , \tag{23.103} $$

where $F_\pm$, equation (23.79), reduce to

$$ F_\pm = r^2 - 3Mr \pm 2a\sqrt{Mr} , \tag{23.104} $$

with $+$ for prograde ($aL > 0$) orbits, $-$ for retrograde ($aL < 0$) orbits.

As discussed in §23.13, null circular orbits occur where $F_\pm = 0$, except in the special case that the circular orbit is at the horizon, which occurs when the black hole is extremal. In the limit where the Kerr black hole is near but not exactly extremal, $a \to |M|$, null circular orbits occur at $r \to M$ (prograde) and $r \to 4M$ (retrograde). For an exactly extremal Kerr black hole, $a = |M|$, the (prograde) circular orbit at the horizon is no longer null. The situation of a near-extremal Kerr black hole is illustrated by Figure 23.8.

Figure 23.9 Energy $E$ and azimuthal angular momentum $|L/M|$ of circular orbits on the ISCO of a Kerr black hole as a function of the spin parameter $a/M$. The angular momentum $L$ is positive for $a > 0$ (prograde), negative for $a < 0$ (retrograde).


23.18.1 Innermost stable circular orbit (ISCO) 


Astronomers generally argue that the inner edge of an accretion disk is likely to occur at the innermost stable equatorial circular orbit, commonly called the ISCO in the literature. An orbit at this point has marginal stability, $d^2P_r^2/dr^2 = 0$. Simplifying the stability $d^2P_r^2/dr^2$ from equation (23.81c) to the case of equatorial orbits, $\alpha = 0$, and zero black hole charge, $Q = 0$, yields the condition of marginal stability

$$ r^2 - 6Mr - 3a^2 \pm 8a\sqrt{Mr} = 0 . \tag{23.105} $$

The $+$ (prograde) orbit has the smaller radius, and so defines the innermost stable circular orbit. For an extremal Kerr black hole, $a = |M|$, marginally stable circular equatorial orbits are at $r = M$ (prograde) and $r = 9M$ (retrograde).

The energy E and angular momentum L of a particle on a marginally stable circular equatorial orbit are 

$$ E = \sqrt{1 - \frac{2M}{3r}} , \tag{23.106a} $$

$$ L = \pm \frac{2M}{3\sqrt{3}} \sqrt{\frac{12r}{M} - 7 + 4\sqrt{\frac{3r}{M} - 2}} , \tag{23.106b} $$


which are illustrated in Figure 23.9. 
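As an illustration, the following Python sketch solves the marginal stability condition (23.105) by bisection and evaluates the energy and angular momentum (23.106) on the resulting orbit. It is a minimal sketch under stated assumptions: the spin lies in 0 ≤ a ≤ M, the root is bracketed by the limiting radii quoted above (M to 6M prograde, 6M to 9M retrograde), and the function names are hypothetical.

```python
import math

def isco_radius(a, M=1.0, prograde=True):
    """Root of r^2 - 6 M r - 3 a^2 ± 8 a sqrt(M r) = 0, equation (23.105).

    Assumes 0 <= a <= M. Plain bisection on an assumed bracket:
    [M, 6M] for prograde orbits, [6M, 9M] for retrograde orbits.
    """
    sign = 1.0 if prograde else -1.0

    def f(r):
        return r * r - 6.0 * M * r - 3.0 * a * a + sign * 8.0 * a * math.sqrt(M * r)

    lo, hi = (M, 6.0 * M) if prograde else (6.0 * M, 9.0 * M)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid   # root lies above mid
        else:
            hi = mid   # root lies at or below mid
    return 0.5 * (lo + hi)

def isco_energy(r, M=1.0):
    """Energy per unit rest mass on the ISCO, equation (23.106a)."""
    return math.sqrt(1.0 - 2.0 * M / (3.0 * r))

def isco_angular_momentum(r, M=1.0, prograde=True):
    """Angular momentum per unit rest mass on the ISCO, equation (23.106b)."""
    sign = 1.0 if prograde else -1.0
    inner = math.sqrt(3.0 * r / M - 2.0)
    return sign * (2.0 * M / (3.0 * math.sqrt(3.0))) * math.sqrt(12.0 * r / M - 7.0 + 4.0 * inner)

for a in (0.0, 0.5, 0.9, 0.998):
    r = isco_radius(a)
    E = isco_energy(r)
    L = isco_angular_momentum(r)
    print(f"a/M = {a:<6} r/M = {r:.4f}  E = {E:.4f}  L/M = {L:.4f}  L/(E M) = {L / E:.4f}")
```

Near a = M the condition (23.105) has a triple root at r = M, so bisection converges there only to about five decimal places; that is adequate for tabulation but not for high-precision work.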
Figure 23.10 Efficiency of accretion on to a Kerr black hole, equation (23.107). The efficiency varies from η = 0.06 at 
a = 0 to η = 0.42 at a = M. 


23.19 Thin disk accretion 


There is a vast observational and theoretical literature on astrophysical accretion flows on to black holes, 
which is beyond the intended scope of this book (see Abramowicz and Fragile (2013) for a review). 

The simplest model of accretion on to a spinning astronomical black hole consists of a thin pressureless 
disk with particles moving on nearly circular orbits in the equatorial plane (Bardeen, 1970). Viscous forces 
cause the particles to spiral slowly inward. Observed accretion rates are orders of magnitude larger than can 
be accounted for by particle viscosity. It is considered likely that the required viscosity arises from turbulence 
driven by the magneto-rotational instability (Balbus and Hawley, 1998; Balbus, 2003). In the simple model, 
upon reaching the ISCO (innermost stable circular orbit), particles fall dynamically on to the black hole 
without further dissipation. 

To spiral inward from large radius, where its energy equals its rest mass, $E_\infty = 1$, down to the ISCO, 
where $E_{\rm ISCO} = \sqrt{1 - 2M/(3r)}$, equation (23.106a), a particle must lose fractional energy 

$$ \eta \equiv \frac{E_\infty - E_{\rm ISCO}}{E_\infty} = 1 - \sqrt{1 - \frac{2M}{3r}} . \tag{23.107} $$

In the simple thin-disk model, particles in the disk lose energy by emitting radiation, which astronomers 
can detect. The fractional energy η represents the efficiency with which rest mass energy is converted to 
radiation. The efficiency η, illustrated in Figure 23.10, varies from $\eta = 1 - \sqrt{8/9} \approx 0.06$ for a non-spinning 
black hole (a = 0) to $\eta = 1 - \sqrt{1/3} \approx 0.42$ for a maximally spinning black hole (a = M). By comparison, 
nuclear fusion of hydrogen to helium-4 releases 0.007 of the rest mass, while fusion of hydrogen all the way 
to iron-56, the most tightly bound of all nuclei, releases 0.009 of the rest mass. Thus gravitational accretion 
on to a black hole releases energy more efficiently than fusion, by a factor of 10 or more. This explains 
why gravitational accretion on to black holes can power some of the most luminous objects observed in the 
Universe, such as quasars and gamma-ray bursts. 
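The efficiency (23.107) is easy to tabulate as a function of spin. The sketch below is illustrative only. It obtains the prograde ISCO radius from the standard closed-form expression of Bardeen, Press and Teukolsky, which is assumed here rather than taken from this section (it is a root of the marginal stability condition), and then applies equation (23.107). The function names and the chosen spin values are hypothetical.

```python
import math

def isco_radius_closed_form(a, M=1.0):
    """Prograde ISCO radius from the standard Bardeen-Press-Teukolsky formula.

    Assumption: this closed form is quoted from the literature, not from
    the text above. Valid for 0 <= a <= M.
    """
    chi = a / M
    z1 = 1.0 + (1.0 - chi * chi) ** (1.0 / 3.0) * (
        (1.0 + chi) ** (1.0 / 3.0) + (1.0 - chi) ** (1.0 / 3.0)
    )
    z2 = math.sqrt(3.0 * chi * chi + z1 * z1)
    # max(..., 0) guards against a tiny negative value from rounding at chi = 0.
    root = math.sqrt(max((3.0 - z1) * (3.0 + z1 + 2.0 * z2), 0.0))
    return M * (3.0 + z2 - root)

def accretion_efficiency(a, M=1.0):
    """Fractional energy lost between infinity and the ISCO, equation (23.107)."""
    r = isco_radius_closed_form(a, M)
    return 1.0 - math.sqrt(1.0 - 2.0 * M / (3.0 * r))

FUSION_H_TO_HE = 0.007  # fraction of rest mass released, value quoted in the text

for a in (0.0, 0.5, 0.9, 0.998, 1.0):
    eta = accretion_efficiency(a)
    print(f"a/M = {a:<6} eta = {eta:.4f}  eta / fusion = {eta / FUSION_H_TO_HE:.1f}")
```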


23.19.1 Thorne limit 


Accretion from the ISCO increases the angular momentum a of the black hole by the angular-momentum to 
energy ratio L/E of particles on the ISCO. As seen in Figure 23.9, on the ISCO the angular momentum L 
is always greater than M, and the energy E is always less than 1, so the angular momentum L/E per unit 
energy on the ISCO always exceeds M. Therefore accretion from the ISCO tends to spin up a sub-extremal 
black hole towards extremality (Bardeen, 1970). As Thorne (1974) points out, this is problematic because 
an extremal black hole has zero Hawking temperature. Cooling a thermodynamic object to zero temperature 
should be difficult if not impossible. 

For particles to reach the ISCO from far away, they must lose energy. Thorne (1974) remarked that if 
the lost energy is emitted as radiation from a thin equatorial disk, then some of that radiation will be 
absorbed by the black hole, and that radiation will tend to spin down the black hole. Thorne calculated 
that the maximum spin that a black hole accreting from a thin, radiating disk could achieve is a = 0.998M, 
the precise number depending slightly on the directionality of the radiation emitted from the disk (Thorne 
considered isotropic radiation, and electron-scattering dipole radiation). 

Most of the processes that one can think of serve to reduce the angular momentum even further below 
extremality. For example, the gas that accretes on to a supermassive black hole may originate from various 
directions and therefore carry various amounts of azimuthal angular momentum. Although not a rigorous 
limit, the limit of a = 0.998M is often taken by astronomers as a plausible upper bound to the spin of an 
astronomical black hole, the Thorne limit. 


Exercise 23.9. Icarus. In Brian Greene’s story “Icarus,” the boy Icarus goes on a space journey, arrives 
at a black hole, and goes into orbit around it. When he leaves the black hole, he finds that a large time has 
passed in the outside world. Is the story realistic? 

Solution. Equation (23.4) with m = 1 and q = 0 implies that the rate $dt/d\tau$ at which time t elapses at 
infinity relative to the proper time τ experienced by Icarus is 


dt 1 P, wy Py 1 2 Ta 
= = R P L-ak 0). 23.108 
( Ay © Ay r? + a? cos? a aE] ( ) 
The first term on the right hand side can become large, with large P, for a circular orbit near the horizon 
of a near-extremal black hole. For such an orbit, the first term in equation (23.108) dominates. For large P, 
equation (23.81b) shows that 

$$ P \approx \frac{2E}{1 - M/r} . \tag{23.109} $$

A natural strategy is for Icarus to sail in on to the unstable circular orbit with E = 1, since he can manoeuver 
into this orbit, and then out of it, without using much rocket energy. For E = 1, 

$$ P \approx \frac{2}{1 - M/r} . \tag{23.110} $$

Equation (23.110) shows that the closer Icarus can get to r = M, the more rapidly time passes in the outside 
world. Not just any black hole will do. Icarus must find himself a rotating black hole that is very close to extremal. 
For a circular orbit in the equatorial plane ($\alpha = 0$, $\theta = \pi/2$) of a Kerr black hole (Q = 0), the time dilation 
factor (23.108) simplifies to 

$$ \frac{dt}{d\tau} = \frac{r \pm a\sqrt{M/r}}{\sqrt{F_\pm}} . \tag{23.111} $$

For E = 1, equation (23.111) becomes 

$$ \frac{dt}{d\tau} = 1 + \frac{2}{\sqrt{1 - a/M}\,\bigl(1 + \sqrt{1 - a/M}\bigr)} \approx \frac{2}{\sqrt{1 - a/M}} . \tag{23.112} $$

At the Thorne limit a = 0.998M, the time dilation factor is 

$$ \frac{dt}{d\tau} \approx 44 . \tag{23.113} $$

Exercise 23.10. Interstellar. In the Hollywood movie “Interstellar,” for which Kip Thorne was an Executive 
Producer, the intrepid band of astronauts lands their spacecraft on planet Miller in orbit around the 
black hole Gargantua. For each hour the team spends on planet Miller, seven years pass on the outside. 
That’s a time dilation factor of 60,000. Is it plausible? 

Solution. The situation differs from that in the “Icarus” story in that whereas Icarus can manoeuver his 
rocket into an unstable circular orbit, a planet must be in a stable orbit. The largest time dilation occurs on 
the prograde innermost stable circular orbit in the equatorial plane. For a Kerr black hole (Q = 0), the time 
dilation factor (23.108) on the prograde equatorial ISCO is, to lowest order in 1 − a/M, 


$$ \frac{dt}{d\tau} \approx \frac{2^{4/3}}{\sqrt{3}\,(1 - a/M)^{1/3}} . \tag{23.114} $$

To achieve the required time dilation factor requires, to lowest order, 

$$ 1 - a/M \approx \frac{16}{3\sqrt{3}\,(dt/d\tau)^3} , \tag{23.115} $$

which for $dt/d\tau \approx 60{,}000$ is 

$$ 1 - a/M \approx 10^{-14} , \tag{23.116} $$

or a ≈ 0.99999999999999M. This is much closer to extremality than the Thorne limit. At the Thorne limit 
a = 0.998M, the time dilation factor is 

$$ \frac{dt}{d\tau} \approx 11 . \tag{23.117} $$
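The numbers in Exercises 23.9 and 23.10 can be reproduced with a few lines of Python. The sketch below is illustrative and rests on assumptions: it reads equation (23.111) as $dt/d\tau = (r + a\sqrt{M/r})/\sqrt{F_+}$ for prograde orbits, equation (23.112) as $1 + 2/[s(1+s)]$ with $s = \sqrt{1 - a/M}$, and finds the prograde ISCO by bisection on the marginal stability condition (23.105). The function names are hypothetical, and units default to M = 1.

```python
import math

def isco_radius(a, M=1.0):
    """Prograde root of r^2 - 6 M r - 3 a^2 + 8 a sqrt(M r) = 0, by bisection.

    Assumes 0 <= a <= M, with the root bracketed in [M, 6M].
    """
    def f(r):
        return r * r - 6.0 * M * r - 3.0 * a * a + 8.0 * a * math.sqrt(M * r)

    lo, hi = M, 6.0 * M
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dilation_circular(r, a, M=1.0):
    """dt/dtau on a prograde equatorial circular orbit, as read from (23.111)."""
    F_plus = r * r - 3.0 * M * r + 2.0 * a * math.sqrt(M * r)
    return (r + a * math.sqrt(M / r)) / math.sqrt(F_plus)

def dilation_unbound_orbit(a, M=1.0):
    """dt/dtau on the unstable E = 1 circular orbit, as read from (23.112)."""
    s = math.sqrt(1.0 - a / M)
    return 1.0 + 2.0 / (s * (1.0 + s))

def spin_deficit_for_dilation(gamma):
    """Lowest-order 1 - a/M needed for an ISCO dilation factor gamma, (23.115)."""
    return 16.0 / (3.0 * math.sqrt(3.0) * gamma ** 3)

a_thorne = 0.998
print("Icarus (E = 1 orbit) at the Thorne limit:", dilation_unbound_orbit(a_thorne))
print("Planet on the ISCO at the Thorne limit: ", dilation_circular(isco_radius(a_thorne), a_thorne))
print("1 - a/M needed for a factor of 60,000:  ", spin_deficit_for_dilation(6.0e4))
```

The ISCO value is computed from the full expression rather than from the lowest-order formula (23.114), which overestimates the dilation somewhat at a = 0.998M.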
23.20 Circular orbits in the Reissner-Nordström geometry 


Circular orbits of particles in the Reissner-Nordström geometry follow from those in the Kerr-Newman 
geometry in the limit of a non-rotating black hole, a = 0. For a non-rotating black hole, an orbit can be 
taken without loss of generality to circulate right-handedly in the equatorial plane, $\theta = \pi/2$, so that $\alpha = 0$ and 
the azimuthal angular momentum L equals the positive total angular momentum Ltot. For non-equatorial 
orbits, the relation between azimuthal and total angular momentum is L = +/1 — a Ltot. 


For a non-rotating black hole, a = 0, the quartic condition (23.67) for a circular orbit of a particle of rest 
mass m = 1 and electric charge q reduces to the square of a quadratic, 

$$ r^2 - qQrP - (r^2 - 3Mr + 2Q^2) P^2 = 0 . \tag{23.118} $$

Solving the quadratic (23.118) yields two solutions 

$$ \frac{1}{P} = \frac{qQ}{2r} \pm \sqrt{1 - \frac{3M}{r} + \frac{2Q^2}{r^2} + \frac{q^2 Q^2}{4r^2}} . \tag{23.119} $$


The sign of P, equation (23.68), is positive in the Universe, Wormhole, and Antiverse regions of the Reissner-
Nordström geometry in the Penrose diagram of Figure 8.6, negative in their Parallel counterparts. The 
angular momentum L, energy E, and stability $d^2 P_x^2 / dr^2$ of a circular orbit are, in terms of a solution (23.119) 
P of the quadratic, 

$$ L = \sqrt{P^2 R^4 \Delta_x - r^2} , \tag{23.120a} $$

$$ E = P R^2 \Delta_x + \frac{qQ}{r} , \tag{23.120b} $$

$$ r^4 \frac{d^2 P_x^2}{dr^2} = 2 (r^2 - 6Mr + 5Q^2 + q^2 Q^2) - 2 \left( 1 - \frac{6M}{r} + \frac{6Q^2}{r^2} \right) P^2 R^4 \Delta_x . \tag{23.120c} $$


For massless particles, circular orbits occur where the solution (23.119) for 1/P vanishes, which occurs 
when 

$$ r^2 - 3Mr + 2Q^2 = 0 , \tag{23.121} $$

independent of the charge q of the particle. The condition (23.121) is consistent with the Kerr-Newman 
condition for a null circular orbit, the vanishing of $F_\pm$ given by equation (23.79). However, for Kerr-Newman, 
the argument of the square root on the right hand side of equation (23.79) for $F_\pm$ must be positive, even 
in the limit of infinitesimal a. In the limit of small a, this requires that $Mr - Q^2 > 0$. If the charge Q of 
the Reissner-Nordström black hole lies in the standard range $0 \le Q^2 \le M^2$, then one of the solutions of the 
quadratic (23.121) lies outside the outer horizon, while the other lies between the outer and inner horizons. 
As one might hope, the additional condition $Mr - Q^2 > 0$ eliminates the undesirable solution between the 
horizons, leaving only the solution outside the horizon, which is 

$$ r = \frac{3M}{2} \left( 1 + \sqrt{1 - \frac{8Q^2}{9M^2}} \right) \quad \mbox{for } 0 \le Q^2 \le M^2 . \tag{23.122} $$

In (unphysical) cases $Q^2 < 0$ or $M^2 < Q^2 < (9/8)M^2$, both solutions of equation (23.121) are valid. 
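The Reissner-Nordström relations above can be exercised numerically. The Python sketch below is an illustration under stated assumptions: it takes the positive-root solution of the quadratic (23.118) for 1/P, and it evaluates E and L from P using the horizon function $1 - 2M/r + Q^2/r^2$, which is how the combination $R^2 \Delta_x$ is read here for a = 0. The function names are hypothetical, and the particle has unit rest mass.

```python
import math

def rn_circular_orbit(r, M=1.0, Q=0.0, q=0.0):
    """Return (P, E, L) for a circular orbit at radius r around a charged hole.

    Assumptions: a = 0, m = 1, and the + root of the quadratic (23.118),
    written as a quadratic in 1/P. 'delta' is the horizon function
    1 - 2M/r + Q^2/r^2. Requires r outside the null circular orbit.
    """
    delta = 1.0 - 2.0 * M / r + Q * Q / (r * r)
    disc = 1.0 - 3.0 * M / r + 2.0 * Q * Q / (r * r) + (q * Q) ** 2 / (4.0 * r * r)
    inv_P = q * Q / (2.0 * r) + math.sqrt(disc)
    P = 1.0 / inv_P
    E = P * delta + q * Q / r                      # energy per unit rest mass
    L = math.sqrt(P * P * r * r * delta - r * r)   # angular momentum per unit rest mass
    return P, E, L

def rn_null_circular_radius(M=1.0, Q=0.0):
    """Outer root of r^2 - 3 M r + 2 Q^2 = 0, equation (23.122)."""
    return 1.5 * M * (1.0 + math.sqrt(1.0 - 8.0 * Q * Q / (9.0 * M * M)))

# Schwarzschild check at the ISCO r = 6M: expect E = sqrt(8/9), L = 2 sqrt(3) M.
P, E, L = rn_circular_orbit(6.0)
print("Schwarzschild ISCO: P =", P, " E =", E, " L =", L)

# A charged example, and the null circular orbit of an extremal hole.
print("r = 10M, Q = 0.5M, q = 0.3:", rn_circular_orbit(10.0, Q=0.5, q=0.3))
print("null circular orbit for Q = M:", rn_null_circular_radius(Q=1.0))
```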


23.21 Hypersurface-orthogonal congruences 


The Hamilton-Jacobi separated solution makes it possible to construct congruences (§18.1) of timelike or 
null geodesics in the Λ-Kerr-Newman geometry, or more generally in any stationary, axisymmetric, separable 
geometry. Of particular interest are hypersurface-orthogonal congruences, which were discussed in the context 
of singularity theorems in §§18.6 and 18.7. 

It should be remarked from the outset that the principal null congruences of the Λ-Kerr-Newman geometry 
are not hypersurface-orthogonal, Exercise 23.11, except in the special case of spherical symmetry. 


23.21.1 Hypersurface-orthogonality condition 


As discussed in §18.6, a timelike hypersurface-orthogonal congruence is constructed by picking an arbitrary 
spacelike 3-dimensional hypersurface on which the action is taken to be constant, and projecting geodesics 
along the direction orthogonal to the hypersurface at each point. The timelike congruence is orthogonal to 
hypersurfaces of constant action. Similarly, as discussed in §18.7, a null hypersurface-orthogonal congruence is 
constructed by foliating an initial 3-dimensional hypersurface into 2-dimensional spatial surfaces of constant 
action, and projecting pairs of outgoing and ingoing null geodesics orthogonally from the 2-surfaces. 

The starting point for constructing timelike or null congruences of geodesics in the Λ-Kerr-Newman 
geometry is the separated expression (22.22) for the action S of a single particle, with generalized momenta 
$\pi_x$ and $\pi_y$ coming from equations (23.5b) and (23.5c), 

$$ S = \int \left( - E \, dt + L \, d\phi - \frac{P_x}{\Delta_x} \, dx + \frac{P_y}{\Delta_y} \, dy \right) . \tag{23.123} $$

Equation (23.123) holds for charged as well as uncharged particles, since for Kerr-Newman the components $A_x$ 
and $A_y$ of the electromagnetic potential vanish, equation (23.3). However, for the remainder of this Chapter, 
the particle will be taken to be uncharged. The Hamilton-Jacobi parameters $P_x$ and $P_y$, equations (23.10), 
depend on the particle mass m and on the constants of motion $C_\alpha = \{E, L, K\}$. The mass m may be either 
positive or zero. The integrand on the right hand side of equation (23.123) is manifestly integrable, being a 
sum of 4 terms each depending on only one of the 4 coordinates t, φ, x, y. The action (23.123), which 
is that of a single particle with fixed constants of motion $C_\alpha$, can be extended to a congruence of geodesics as 
long as the integral is understood to be taken along geodesics. The constants of motion $C_\alpha$ are by definition 
constant along each geodesic, but may vary (smoothly) from one geodesic to another. The particle mass m 
can be scaled to a global constant without loss of generality, positive for a timelike congruence, zero for a null 
congruence. Since the integral (23.123) is along geodesics, and the constants of motion are constant along 
geodesics, the action integrates to the separated expression 

$$ S - S_i = - E (t - t_i) + L (\phi - \phi_i) - \int_{x_i}^{x} \frac{P_x}{\Delta_x} \, dx + \int_{y_i}^{y} \frac{P_y}{\Delta_y} \, dy , \tag{23.124} $$

in which the constants of motion $C_\alpha$ are held constant in the integrals over x and y even when those 
constants vary across geodesics. The constants $x_i^\mu = \{t_i, x_i, y_i, \phi_i\}$ are the values of the coordinates $x^\mu$ on 
some arbitrarily chosen initial 3-dimensional hypersurface from which the geodesics are projected. For a 
timelike congruence the value $S_i$ of the action on the initial hypersurface is constant and can be set to zero, 
$S_i = 0$, but for a null congruence the initial action $S_i$ must vary over the hypersurface. 

Derivatives of the action S (23.124) with respect to the constants of motion $C_\alpha$ yield comoving spatial 
coordinates $X^\alpha = \{X^E, X^K, X^L\}$ defined by 


P,d Pad P,d P 
xe -xPe == f( dt 4 i 5 ee) (tt) + +f oY (93.195a) 
y 


ðE PAs PA z PrAa P ^y 
Os dx dy dx dy 
K_xkK= = = + |= pak de 
fee ae f (5 4) z 2P, a n 2P, ’ ead) 
Os wePdx Psd WP; dx Psd 
L_yla= = = zit pay \ _ ; lt o dy 
Xt XP = = [ ( PA, a) b- ¢ |, PAs [ PA,’ (23.125c) 


where $X_i^\alpha$ are the (arbitrary) values of the comoving coordinates on the arbitrarily chosen initial 3-dimensional 
hypersurface. As in the action (23.124), the integrals in the definitions (23.125) are to be understood as being 
taken along geodesics. And as in the action (23.124), because the constants of motion are constant 
along geodesics, the coordinates $X^\alpha$ integrate to the separated expressions on the rightmost sides of 
equations (23.125) with the constants of motion held constant even when those constants vary across geodesics. 
As is evident from equations (23.11) and (23.12), the comoving coordinates $X^\alpha$ are constant along geodesics, 

$$ dX^\alpha = 0 , \tag{23.126} $$

justifying their designation as comoving coordinates. The total derivative of the action (23.124) is 

$$ dS = - E \, dt + L \, d\phi - \frac{P_x}{\Delta_x} \, dx + \frac{P_y}{\Delta_y} \, dy + dS_i + (X^\alpha - X_i^\alpha) \, dC_\alpha , \tag{23.127} $$

in which the penultimate term $dS_i$ vanishes for a timelike congruence (where $S_i$ is constant), but is 
non-vanishing for a null congruence, and the last term $(X^\alpha - X_i^\alpha) \, dC_\alpha$ takes into account the possible variation 
of the constants of motion $C_\alpha$ across geodesics. 

Timelike geodesics are orthogonal to hypersurfaces of constant action if, equation (18.36), 

$$ \pi_\kappa = \frac{\partial S}{\partial x^\kappa} . \tag{23.128} $$


Equation (23.128) is equivalent to the condition that the total derivative (23.127) of the action is 

$$ dS = - E \, dt + L \, d\phi - \frac{P_x}{\Delta_x} \, dx + \frac{P_y}{\Delta_y} \, dy . \tag{23.129} $$

Comparing equations (23.127) and (23.129) shows that a timelike congruence is hypersurface-orthogonal if 
and only if 

$$ (X^\alpha - X_i^\alpha) \, dC_\alpha = 0 . \tag{23.130} $$

The comoving coordinates defined by equations (23.125) are constant along geodesics, $X^\alpha = X_i^\alpha$. The 
hypersurface-orthogonal condition (23.130) is then satisfied regardless of whether the constants of motion 
$C_\alpha$ vary across geodesics. 
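The comparison of equations (23.127) and (23.129) amounts to a single subtraction. The following LaTeX fragment is a sketch of that step for the timelike case, with the index placement on $X_i^\alpha$ and $C_\alpha$ taken as read from the surrounding text:

```latex
\begin{align}
  dS\big|_{(23.127)} - dS\big|_{(23.129)}
    &= dS_i + (X^\alpha - X_i^\alpha)\, dC_\alpha = 0 , \\
  dS_i = 0
    &\;\Longrightarrow\; (X^\alpha - X_i^\alpha)\, dC_\alpha = 0 .
\end{align}
```

The first line holds because the terms in $dt$, $d\phi$, $dx$, and $dy$ are identical in the two expressions and cancel. The second line uses the fact that $S_i$ is constant over the initial hypersurface of a timelike congruence.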

The condition (23.129) for hypersurface-orthogonality can continue to be imposed in the massless limit, 
where the congruence becomes null. However, for a null congruence the condition (23.129) need not be 
equivalent to the condition (23.128). As discussed in §18.7, in the massless limit the momentum is not 
only orthogonal but also tangent to the limiting null hypersurface, and equation (23.128) need be imposed 
only over each 3-dimensional null hypersurface projected from 2-dimensional surfaces of constant action $S_i$ 
on the initial 3-dimensional hypersurface, not over the entire 4-dimensional spacetime. By definition, the 
initial action S; is constant for each null hypersurface, so dS; = 0 over each null hypersurface. Comparing 
equations (23.127) and (23.129) shows that a null congruence is hypersurface-orthogonal if and only if once 
again the condition (23.130) holds, the same condition as for a timelike congruence. 

For a timelike congruence, the action S and 3 comoving coordinates $X^\alpha$ can be used, if desired, as 
the 4 coordinates along the congruence. But for a null congruence the action S does not progress along 
worldlines, and the action degenerates to a linear combination of the comoving coordinates $X^\alpha$. Thus for a 
null congruence S and $X^\alpha$ are not 4 independent coordinates. But the difference between the action and the 
linear combination of comoving coordinates, divided by $m^2$, remains finite in the limit $m \to 0$ of zero mass, 
and defines the coordinate X°, 


1 


m 


pas 


2 2d 
= [S — S: — E(X? — XF) - 2K(X* — XF) — L(X* - XP) = ef “ (23.131) 
T Yi y 


zi 
As in the action (23.124) and comoving coordinates (23.125), the integrals on the rightmost side of equa- 
tion (23.131) are to be understood as being taken along geodesics. The variation dX° of the coordinate X° 
equals the variation $d\lambda$ of the affine parameter along geodesics, equation (23.13), 


ds 


m2 


=d. (23.132) 


S — 
dX pe XK,XL ST 
= XE\XK XL 


If desired, the coordinate X° can be used (in place of S) for timelike as well as null congruences. 


23.21.2 Stationary and axisymmetric congruences 


In principle the constants of motion $C_\alpha$ can be chosen arbitrarily across geodesics. But it is natural to consider 
congruences that are stationary and axisymmetric, which requires that the constants $C_\alpha$ be independent of 
time t and azimuthal angle φ (but $C_\alpha$ may depend on the radial and latitude coordinates x and y). A 
stationary and axisymmetric congruence can be constructed by starting on an arbitrary 1-dimensional line 
in the x-y plane, and projecting geodesics orthogonally from that 1-dimensional line. The initial action $S_i$ on 
the 1-dimensional line is constant for a timelike congruence, but varies for a null congruence. The congruence 
is extended to a full congruence in 4 dimensions by translating and rotating it symmetrically in time t and 
azimuth φ. 

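For a stationary and axisymmetric congruence, the comoving coordinates $X^E$ and $X^L$ are Killing 
coordinates at fixed x and y, as follows from 

$$ \left. \frac{\partial}{\partial X^E} \right|_{x, y, X^L} = - \left. \frac{\partial}{\partial t} \right|_{x, y, \phi} , \quad \left. \frac{\partial}{\partial X^L} \right|_{x, y, X^E} = \left. \frac{\partial}{\partial \phi} \right|_{x, y, t} . \tag{23.133} $$

If $X^E$ and $X^L$ are to be preserved as Killing coordinates, then in place of x and y it is possible to choose 
any other pair of independent coordinates that depend only on x and y. A possible choice is the comoving 
coordinate $X^K$, equation (23.125b), together with the coordinate defined by equation (23.131). 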


23.21.3  Hypersurface-orthogonal line-element 


One way to construct a hypersurface-orthogonal line-element is to use coordinates consisting of the action S 
and its partial derivatives with respect to the constants of motion, the comoving coordinates $X^\alpha = \partial S / \partial C_\alpha$, 
equations (23.125). The inverse vierbein in terms of these action coordinates $\{S, \partial S/\partial E, \partial S/\partial K, \partial S/\partial L\}$ is 


P, 1 0 We, 
yrz VAx VAr 
P; P, Vv Ay WP, 


are | a i ae oe (23.134) 
Vay Pyy^y 2P; Pyy 
Pe ae 
Vay vV% Jay 
The vierbein is 
P; mP, mp?/A 2(K — m?p?)P, Tole m Pwy Ar 
A ga Te JA: A a 
P, TP 2(K — m? pz) Pr 14Pr 
1 VAL VAL VAr VAs 
e p= mp] Py Py 2(K + m*p2)P, TPy 
TBs T A F 
Ps mPs MPwiyAy AUKEMA) Pe TPg mp? /A 
23.21.4 Hypersurface-orthogonal timelike outgoing and ingoing congruences 


Given the symmetries of the Λ-Kerr-Newman geometry, it is possible to choose timelike or null congruences 
in symmetrically related outgoing (+) and ingoing (−) partners, the actions S for which provide coordinates 
for a line-element (23.141) that describes hypersurface-orthogonal outgoing and ingoing congruences. If the 
action (23.124) describes an outgoing congruence, which is true if the Hamilton-Jacobi parameter $P_x$ is 
positive (in the Universe part of the geometry), then a corresponding ingoing congruence can be defined by 
flipping the signs of both $P_x$ and $P_y$. 

Define a time coordinate T and a spatial coordinate Z by 

$$ T = E (t - t_i) - L (\phi - \phi_i) , \tag{23.136a} $$

$$ Z = - \int_{x_i}^{x} \frac{P_x}{\Delta_x} \, dx + \int_{y_i}^{y} \frac{P_y}{\Delta_y} \, dy , \tag{23.136b} $$

which are constructed so that the actions $S_\pm$ for the outgoing and ingoing congruences are 

$$ S_\pm = - T \pm Z . \tag{23.137} $$

The quantities $x_i^\mu$ are the same for both outgoing and ingoing actions. The spatial coordinate Z increases 
outwards if $P_x$ is positive (recall that the radial coordinate x increases inwards). 

The flip in the signs of $P_x$ and $P_y$ implies that the comoving coordinate $X^K$ defined by equation (23.125b) 
differs by a sign flip along outgoing and ingoing geodesics. Consequently the coordinate $X^K$ is simultaneously 
constant along both outgoing and ingoing congruences, allowing the condition $X^K = 0$ to be imposed 
simultaneously on both outgoing and ingoing congruences. By contrast, as long as $x_i^\mu$ are the same for both 
outgoing and ingoing actions, as required by the definitions (23.136) of the coordinates T and Z, neither 
$X^E$ nor $X^L$ can be set simultaneously to zero along both outgoing and ingoing congruences. Therefore the 
hypersurface-orthogonality condition (23.130) can be satisfied simultaneously by both outgoing and ingoing 
timelike congruences only if E and L are constant across geodesics. 

As long as both outgoing and ingoing congruences are hypersurface-orthogonal, which requires that E and 
L be constant, the outgoing and ingoing actions $S_\pm$, or equivalently T and Z, can be used as coordinates of 
a line-element. For hypersurface-orthogonal congruences, the total derivatives of both outgoing and ingoing 
actions take the form (23.129), and the total derivatives of T and Z are 

$$ dT = E \, dt - L \, d\phi , \tag{23.138a} $$

$$ dZ = - \frac{P_x}{\Delta_x} \, dx + \frac{P_y}{\Delta_y} \, dy . \tag{23.138b} $$


The other two coordinates in the line-element of the hypersurface-orthogonal congruences can be taken to 
be φ and either $X^K$ if K is constant, or K if K varies. Specifically, 


(23.139) 
where 


ox A, dx A, dy 
- —_— = 5 23.14 
aK | aps T [ 4P3 aay) 


The right hand side of equation (23.139) reduces to $dX^K$ if K is constant, or to $-(\partial X^K / \partial K) \, dK$ if K varies. 
The line-element of the hypersurface-orthogonal timelike congruences in terms of coordinates T, Z, φ, and 
either $X^K$ or K, is 


j P a . es ax” if K constant 
ds? = —_° ___Sa. a, (—C®dT? + dZ?) + 4P?P K 
$ PA PA u( ) u) 28" ae: Wee 
OK 
C? 2 2 2 
E pa oap PA- PAdo + (w PiAy — PoA)dT] z (23.141) 
xWy 


where the coefficient C is 


1/2 1/2 
on (Bat PA a Ay an |, Aa Ay we pigs 
~ (PA, - PA, PPR PUA i" P2A, + P2A; 


which is always positive. For $m \neq 0$, the coefficient C is less than 1 outside the horizon ($\Delta_x > 0$), equal to 1 
at the horizon ($\Delta_x = 0$), and greater than 1 inside the horizon ($\Delta_x < 0$). 

The line-element (23.141) is in ADM form (17.8). The differential $dX^K$ of the comoving coordinate, the 
differential $dK$ of the constant of motion, and the one-form in brackets on the second line of the line-element 
(23.141) all vanish along both outgoing and ingoing geodesics. Thus the only part of the line-element (23.141) 
that varies along geodesics is the part proportional to $-C^2 dT^2 + dZ^2$. The proper times τ along the timelike 
geodesics of the outgoing and ingoing congruences satisfy $m\tau = T \mp Z$. 


23.21.5 Double-null hypersurface-orthogonal congruences 


The line-element (23.141) for hypersurface-orthogonal congruences remains well-defined in the limit of zero 
particle mass, m = 0. In the massless limit, the coefficient C, equation (23.142), is unity, 

$$ C = 1 . \tag{23.143} $$

Moreover, for massless particles the energy E can be scaled to ±1 without loss of generality, 

$$ |E| = 1 . \tag{23.144} $$

Define outgoing (+) and ingoing (−) null coordinates $V_\pm$ by 

$$ V_\pm = T \pm Z = - S_\mp , \tag{23.145} $$

which equal minus the action along the opposing null geodesic direction. The $V_\pm$ null coordinates transform 
into each other under a flip of the signs of the Hamilton-Jacobi parameters $P_x$ and $P_y$. If $P_x$ is positive, 
then $V_+$ is an outgoing null coordinate that increases along the outgoing null congruence, while $V_-$ is an 
ingoing null coordinate that increases along the ingoing null congruence. Since the action vanishes along a 
null geodesic, the outgoing null coordinate is constant along the ingoing congruence, while the ingoing null 
coordinate is constant along the outgoing congruence, as illustrated in Figure 23.11. 

Figure 23.11 Null coordinates $V_\pm$, equation (23.145), on a spacetime diagram in T and Z. The diagonal grid of 
outgoing null lines, which increase in the outgoing null $V_+$ direction and are lines of constant ingoing null coordinate 
$V_-$, are lines of constant phase for an outgoing null wave in the geometric optics (high frequency) limit. 

For massless particles, the line-element (23.141) in terms of the coordinates $V_+$, $V_-$, φ, and either $X^K$ or 
K, takes the double-null form 


2 P Pan dx* if K constant 
ds? = A, Ay dV_dV3. + 4P2P K 
$ T PPA, + P2, { a + sI Oe E varies 
ôK 
1 2 
‘=e [(P2Ay — PRAs)dd + 3(PrAy — PoAx)(aV+ + aV-)| \ (23.146) 
xWy 


As in the massive case, $dX^K$, $dK$, and the 1-form in brackets on the second line of the line-element (23.146) 
all vanish along both outgoing and ingoing null geodesics. The affine parameter $\lambda_\pm$ along outgoing (+) or 
ingoing (−) geodesics satisfies 

$$ d\lambda_\pm = \frac{\rho^2 \Delta_x \Delta_y}{2 (P_x^2 \Delta_y + P_y^2 \Delta_x)} \, dV_\pm . \tag{23.147} $$

Taking the massless limit of the line-element (23.141) does not preserve the condition (23.128) that the 
momenta along geodesics are orthogonal to hypersurfaces of constant action throughout the 4-dimensional 
spacetime. Rather, the massless limit of the condition (23.128) imposes the weaker condition that the 
momenta are orthogonal to hypersurfaces of constant action only within those 3-dimensional hypersurfaces. This 
is precisely the definition of hypersurface-orthogonality for null congruences discussed in §18.7. 
23.22 The Doran congruence 


Congruences in which geodesics follow lines of constant latitude y are of special interest. Constant latitude 
geodesics must satisfy the two conditions $P_y^2 = dP_y^2/dy = 0$, equations (23.55). These conditions translate 
into two relations between the three constants E, L, and K of motion, which may be expressed for example 


ð, (K — m2p2)A, /3 a |, (K — m2p2)Ay/wy| /3 
K- ma)Ay/ðy a [VE -m Jos] ðu os 


-= i a/o) /ðy 


wy /Oy 
the partial derivatives being taken with K held fixed. For Kerr-Newman without a cosmological constant, 
the conditions (23.148) imply the relation (23.56) between E and L. Generically, the two conditions allow 
at most one combination of E, L, or K to be held constant over spacetime. 


as 


However, as discussed in §23.21.4, congruences that are hypersurface-orthogonal simultaneously in both 
outgoing and ingoing directions can be constructed only if E and L are both constant. For Kerr-Newman 
without a cosmological constant, the relation (23.56) between E and L for constant latitude geodesics admits 
just one solution with both E and L constant, the Doran conditions 

$$ E = m , \quad L = 0 , \quad K = m^2 a^2 . \tag{23.149} $$

For congruences of constant latitude geodesics, where $P_y$ vanishes identically, the comoving coordinates 
(23.125) can be evaluated by replacing $dy/P_y \to -dx/P_x$ in the expressions for $X^E$ and $X^L$, 


P, Wy Pa \ dx 
E_ Va t y o 
XF =- (t-t) ṣ4 L (ž x \E ; (23.150a) 
dy 
a =| = (23.150b) 
Yi 2Py 
P, P} \ dz 
xt = saat eee l 23.1 
b— ĝi f ( A, Ay) B (23.150c) 


With a suitable choice of boundary conditions, the comoving L coordinate X+ with P, taken negative 
(ingoing congruence) coincides with the angular coordinate dg of the usual Doran metric, equation (9.33), 
X} = og. The expression (23.150b) for the comoving coordinate X appears to diverge, but it appears in 
the hypersurface-orthogonal line-element (23.141) as $2 P_y \, dX^K \to dy$, so the end result is well behaved. 

For the Doran congruence, the time and spatial coordinates T and Z defined by equation (23.136) are 


T=mt, (23.151a) 


P; 
= -f — dr. (23.151b) 


The outgoing and ingoing actions $S_\pm$ are 


ei eae (« +f E) (23.152) 
where $\beta = P_x/m$ is given by equation (9.35) (with a + sign). As expected, the actions $S_\pm$ equal −m times 
the proper times along the outgoing and ingoing congruences. 
The line-element (23.141) of the Doran congruences in hypersurface-orthogonal form is 


2 2 
ds? = p? Ez C?dT? + dZ?) + N + es ( $ aa zy , (23.153) 
where the coefficient C is 
P ( BA, M : z pA, Ay l _ (1+ ra panei 
Ay —w2A, Ay — wid, pe 


23.23 Principal null congruences 


The principal null congruences are defined by the Carter constant taking its smallest possible value, zero, 
K = 0, which requires the mass m, and the angular Hamilton-Jacobi parameters $P_\phi$ and $P_y$, all to vanish 
identically. For massless particles the energy E can be scaled to ±1 without loss of generality. The condition 
$P_\phi = 0$ requires that $L = E \omega_y$, so L cannot be constant. The hypersurface-orthogonality condition (23.130) 
then holds provided that the comoving coordinate $X^L$ is arranged to vanish everywhere. The coordinate $X^L$ 
on the principal null congruences is, equation (23.125c), 


$$ X^L = \phi - \phi_i \pm \int \frac{\omega_x \, dx}{\Delta_x} , \tag{23.155} $$


where the ± sign is + for the outgoing congruence, − for the ingoing congruence. While $X^L$ can be arranged to 
vanish on one or other of the outgoing or ingoing null congruences, it cannot be made to vanish simultaneously 
on both. Thus although the principal null congruences are geodesic, they cannot be described by a 
line-element that is hypersurface-orthogonal simultaneously on both outgoing and ingoing congruences. According 
to the theorem proved in §18.1, this implies that there is no vorticity-free tetrad that aligns with the principal 
null congruences. Exercise 23.11 explores the vorticity $\varpi$ and other components of the extrinsic curvature 
along the principal null congruences of the Λ-Kerr-Newman geometry. 


Exercise 23.11. Expansion, vorticity, and shear along the principal null congruences of the 
Λ-Kerr-Newman geometry. The separable line-element (22.1) defines a tetrad aligned with the principal 
null frame, that is, the tetrad-frame Weyl tensor has only a spin 0 part. The outgoing (v) and ingoing 
(u) null directions lie along the basis elements $\gamma_v$ and $\gamma_u$ of the corresponding Newman-Penrose tetrad, 
equations (39.1). 
1. Show that the expansion $\vartheta$, vorticity $\varpi$, and shear $\sigma$ along the outgoing (upper sign) and ingoing (lower 
sign) principal null congruences are 


R?,/\Az|(4 pr ti 
[Aalt ps tty) gg, (23.156) 
V2 93 


V+iw=s 
where the overall sign s is +, except that s is − along the outgoing congruence inside the horizon 
($\Delta_x < 0$). 
2. Define 


2 [A 
At = (py + ips), n , v=ln ( pv Âz ) , po =atan (2) ; (23.157) 
y/ ay 


Show that the following Lorentz transformation of the tetrad 


Yy 1 0 0 0 e” 0 0 0 Vy 
Vu ons A Ae 0 e” 0 0 Vu 

, 23.158 
v4 | | at 0 1 0 0 0 e o 4 ( ) 
y— A 0 0 1 0 0 0 e y 


brings the tetrad to a form parallel-transported along the outgoing principal null direction $\gamma_v$, with 
vanishing acceleration and precession 

$$ \Gamma_{kmv} = 0 \quad \mbox{for all } k, m , \tag{23.159} $$


and similarly that the Lorentz transformation 


Yv 1 A? à} À- ev 0 0 0 Yv 
Any 0 1 0 0 0 e” 0 0 Yu 

23.1 
y | | @ -a 4 0 0 0 & o y4 e168) 
Ye 0 -Ay 0 1 0 0 0 et y- 


brings the tetrad to a form parallel-transported along the ingoing principal null direction γ_u, with
vanishing acceleration and precession


\Gamma_{kmu} = 0 \quad \mbox{for all } k, m .   (23.161)


The rightmost of the two Lorentz transformations in equations (23.158) and (23.160) boosts and rotates
about the radial direction, leaving the directions of all the null tetrad axes {γ_v, γ_u, γ_+, γ_−} unchanged,
while the leftmost of the two Lorentz transformations boost-rotates in such a fashion as to leave just the
outgoing γ_v (respectively ingoing γ_u) axis unchanged, transforming the remaining axes. The
Lorentz-transformed frames are no longer principal null. In the outgoing (respectively ingoing) transformed
frame, the non-vanishing components of the Weyl tensor are its spin 0, −1, and −2 (respectively 0, +1,
and +2) components.


23.24 Pretorius-Israel double-null congruence 


Generically, congruences cover only part of the spacetime, and geodesics in the congruence cross. The best
congruences are those that cover the maximum amount of spacetime, and nowhere cross. The Doran
congruence covers all of spacetime down to the inner horizon and beyond, and crosses nowhere, so provides a
satisfactory example for massive particles. For massless particles, Pretorius and Israel (1998) pointed out
a double-null hypersurface-orthogonal congruence whose geodesics fill all of Kerr spacetime down to the
Antiverse (r = 0) without crossing.


Figure 23.12 Null geodesics (blue lines) and surfaces of constant outgoing and ingoing action (or phase) (purple
lines) in the Pretorius and Israel (1998) double-null hypersurface-orthogonal congruence, for a Kerr black hole with
a = 0.96M. The coordinates are Boyer-Lindquist, and the units are geometric (c = G = M = 1). The congruence
covers the entire spacetime at r > 0 without crossing. The first geodesic crossing occurs on the polar axis at r = 0.
High latitude geodesics cross when they pass through the pole, turning around in latitude; the continuation of these
geodesics through the pole is shown here to illustrate their future progression. Geodesics at low latitude turn around
in radius at r < 0. The locus of turnaround points is marked by a thick (green) line, where the geodesics shown here
are terminated (to avoid cluttering the diagram). Mid-latitude geodesics, after passing through the pole, turn around
in latitude for a second time. The locus of turnaround points is marked by a continuation of the thick (green) line,
where the geodesics shown here are terminated (again to avoid cluttering the diagram). Thick (reddish) lines mark
the outer and inner horizons, and filled circles mark the ring singularity.

As found in §23.22, there is no double-null hypersurface-orthogonal congruence with P_y = 0. As discussed in
§23.21.4, the hypersurface-orthogonality condition (23.130) can be accomplished simultaneously for outgoing
and ingoing congruences only if the angular momentum L is constant across geodesics, while the Carter
constant K may vary across geodesics. For massless particles, the energy E can be scaled without loss of
generality to ±1, with E = +1 in the Universe part of the geometry.


In the Λ-Kerr-Newman geometry, motion in latitude extends to the south and north poles only if
L = 0 .   (23.162)


In addition, in order to avoid the geodesics turning around in latitude and therefore crossing, the Carter
constant must satisfy


K \geq \frac{\omega_y^2}{\Delta_y}   (23.163)


at all latitudes y on a geodesic. To fill all of the polar region of spacetime, a geodesic that starts at a pole
must remain on the pole, so it must be that K = 0 at the poles (where ω_y = 0). Therefore K must vary
across geodesics in order to satisfy the condition (23.163). Requiring that radial geodesics fall through the
outer horizon, and consequently also the inner horizon, places an upper limit on K. The deepest penetration
inside the black hole is attained when K is as small as possible. This leads to the Pretorius and Israel (1998)
proposal to set K to the smallest value consistent with the condition (23.163). This is achieved by choosing
K such that P_y vanishes at infinite radius, which imposes


K = \left. \frac{\omega_y^2}{\Delta_y} \right|_{\infty} .   (23.164)


For Kerr-Newman without a cosmological constant, this is
K = a^2 \sin^2\!\theta_\infty ,   (23.165)


where θ_∞ is the polar angle of the geodesic at infinite radius. Null geodesics with L = 0 and K given by
equation (23.164) vary in latitude from a minimum latitude that exhausts the condition (23.163) at infinite
radius. The Pretorius-Israel double null line-element is equation (23.146) with L = 0 and non-constant K
given by equation (23.164). For E = 1 and L = 0, the time and azimuth Hamilton-Jacobi parameters are
P_t = −1 and P_φ = −ω_y.
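
The latitude condition can be illustrated numerically. The following is a minimal sketch, not the author's code: it assumes the Kerr-Newman forms Δ_y = sin²θ and ω_y = a sin²θ (the forms that turn the general Pretorius-Israel choice of K into K = a² sin²θ_∞), and evaluates P_y² = K Δ_y − ω_y² for a massless particle with E = 1, L = 0. The function names are hypothetical.

```python
import math

def carter_constant(a, theta_inf):
    """Pretorius-Israel choice K = a^2 sin^2(theta_inf) for Kerr-Newman with Lambda = 0."""
    return (a * math.sin(theta_inf)) ** 2

def p_y_squared(a, theta_inf, theta):
    """P_y^2 = K*Delta_y - omega_y^2 for a massless particle with E = 1, L = 0.

    Assumption (not taken from the text above): Delta_y = sin^2(theta) and
    omega_y = a sin^2(theta), the usual Kerr-Newman forms.
    """
    delta_y = math.sin(theta) ** 2
    omega_y = a * math.sin(theta) ** 2
    return carter_constant(a, theta_inf) * delta_y - omega_y ** 2

if __name__ == "__main__":
    a, theta_inf = 0.96, math.radians(60.0)
    for deg in (0.0, 30.0, 60.0, 80.0):
        print(deg, p_y_squared(a, theta_inf, math.radians(deg)))
```

Under these assumptions P_y² = a² sin²θ (sin²θ_∞ − sin²θ), so the motion is allowed only poleward of the latitude θ_∞ at which the geodesic starts at infinite radius, and P_y vanishes exactly there.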

Figure 23.12 illustrates the Pretorius-Israel congruence in a Kerr black hole of spin parameter a = 0.96M.
The outgoing and ingoing congruences lie in, and are orthogonal to, 3-dimensional null hypersurfaces of
constant action S_± = −T ± Z, where the coordinates T and Z are given by equations (23.136). The initial
3-dimensional hypersurface from which the null hypersurfaces project is a spheroid of constant radius r at
infinity for the ingoing congruence, and a spheroid of constant radius r at the outer horizon for the outgoing
congruence. As discussed at the end of §23.21.1, the parameters x_i, y_i, and φ_i are their values on the initial
3-dimensional hypersurface, and the time parameter t_i is zero. For E = 1 and L = 0, the time coordinate
T is just the Boyer-Lindquist time coordinate, T = t, and the action on the initial hypersurface is S_i = −t.
The hypersurfaces of constant action, or phase, shown in Figure 23.12 are lines of constant Z at fixed t.


23.24.1 Application of the null singularity theorem to a Λ-Kerr-Newman black hole


Penrose’s (1965) original singularity theorem considered hypersurface-orthogonal null congruences. The
Pretorius and Israel (1998) double-null congruence provides an example of such a congruence in the
Λ-Kerr-Newman geometry.

The surfaces of constant action in Figure 23.12 mark the positions of 2-surfaces from which outgoing and
ingoing geodesics project orthogonally. These 2-surfaces are trapped inside the outer horizon, with negative
expansion ϑ along both outgoing and ingoing congruences. If the dog-leg proposition (§18.9.1) held, then
the future would terminate at the caustic surface of first crossing marked in Figure 23.12 by the upper thick
(green) line inside the Antiverse (r < 0). However, as discussed in Concept question 18.3, the Kerr-Newman
geometry does not satisfy the dog-leg proposition (the same holds if Λ ≠ 0), so the future extends past the
caustic crossing.

The failure of the dog-leg proposition in the Λ-Kerr-Newman geometry is associated with the fact that
geodesics can emerge without causal precedent from the ring singularity, leading to a breakdown of
predictability inside the inner horizon. One should not be surprised that the inner horizon of a real astronomical
black hole is subject to an instability, the Poisson and Israel (1990) inflationary instability, that inevitably
and profoundly changes the geometry from just above the inner horizon inward.


24 


The interiors of rotating black holes 


THIS CHAPTER IS SCARCELY BEGUN 


When a black hole first forms by stellar collapse, or when two black holes merge, the resulting object wob- 
bles about, radiating gravitational waves, settling asymptotically to the Kerr geometry, which cannot radiate 
gravitational waves. After several black hole crossing times, the black hole is already well-approximated by 
the Kerr geometry. 

This picture holds outside the outer horizon, and down to the inner horizon, but it fails dramatically at 
(just above) the inner horizon of the black hole. The inner horizon is subject to the inflationary instability 
discovered by Poisson and Israel (1990). The instability was extended to rotating black holes by Barrabés,
Israel, and Poisson (1990).

There are also spacetimes in which geodesics of massless particles, but not massive particles, are Hamilton- 
Jacobi separable. Such line-elements are called conformally separable. 


24.1 Nonlinear evolution 


Choose tetrad frame such that the null directions are the geodesic continuations of the outgoing and ingoing 
principal null geodesics, that the blueshift and the rotation of the outgoing and ingoing principal null geodesics 
appears the same in the tetrad frame, 


Ktw K_wy Kyuu Kum Luvs Dreg 0. (24.1) 
24.2 Focussing along principal null directions


24.3 Conformally separable geometries


24.3.1 Conformally separable line-element 


As remarked in §rotbh-chap, the Kerr-Newman line-element has the remarkable mathematical property that 
the equations of motion of test particles in it, massive or massless, neutral or charged, are Hamilton-Jacobi 
separable. A weaker condition on the spacetime is that the equations of motion of massless particles are 
Hamilton-Jacobi separable. 

Among the remarkable mathematical properties of the Kerr-Newman geometry is the fact that, as first shown
by Carter (1968), the equations of motion of test particles, massive or massless, neutral or charged, are
Hamilton-Jacobi separable.

The Kerr geometry is stationary, axisymmetric, and separable. 

Choose coordinates x^μ = {t, x, y, φ} in which t is the time with respect to which the spacetime is stationary,
φ is the azimuthal angle with respect to which the spacetime is axisymmetric, and x and y are radial and
angular coordinates. In §22.3 it is shown that the line-element may be taken to be


ds^2 = \rho^2 \left[ - \frac{\Delta_x}{(1 - \omega_x \omega_y)^2} \, (dt - \omega_y \, d\phi)^2 + \frac{dx^2}{\Delta_x} + \frac{dy^2}{\Delta_y} + \frac{\Delta_y}{(1 - \omega_x \omega_y)^2} \, (d\phi - \omega_x \, dt)^2 \right] ,   (24.2)


24.4 Conditions from conformal Hamilton-Jacobi separability 
24.5 Tetrad-frame connections 


Extrinsic curvatures laz along the radial directions z = t and z, and Ty, along the angular directions a = y 
and @. Expansions 


P202 = T'303 = Yo = 0, (24.3a) 
P212 = T313 = 01 = O1 np , (24.3b) 
—To20 = T121 = V2 = O2Inp , (24.3c) 
-Toso = Tis1 = ¥3 = 0, (24.3d) 
Twists 
VAr dwy 
Tr -T Tr 24.4 
320 203 302 = Wo Onl ween) de (24.4a) 
P321 = -T213 =V312 = wg = 0, (24.4b) 
Por = —Pi20 = lozi = w2 = 0, (24.4c) 
JA dwy 
Iiro = -Iret = ltor = wo 2 (24.4d) 
Shear vanishes


F202 = V's03 = l212 =V'313 = 0 , (24.5a) 

Doro = Vier = Vos0 = Visi = 0. (24.5b) 

Tazz =0, (24.6a) 

Paa =0, (24.6b) 

Tio00 = Oolnv , (24.6c) 

Pirr = ôi lnv , (24.6d) 

Tioo = OA Inv , (24.6e) 

Pirr = olny , (24.6f) 

y =in (==) ; p= m (=) (24.7) 

p p 


24.6 Inevitability of mass inflation 


Mass inflation requires the simultaneous presence of both outgoing and ingoing streams near the inner 
horizon. Will that happen in real black holes? Any real black hole will of course accrete matter from its 
surroundings, so certainly there will be a stream of one kind or another (outgoing or ingoing) inside the 
black hole. But is it guaranteed that there will also be a stream of the other kind? The answer is probably. 

One of the remarkable features of the mass inflation instability is that, as long as outgoing and ingoing 
streams are both present, the smaller the perturbation the more violent the instability. That is, if say the 
outgoing stream is reduced to a tiny trickle compared to the ingoing stream (or vice versa), then the length 
scale (and time scale) over which mass inflation occurs gets shorter. During mass inflation, as the
counter-streaming streams drop through an interval Δr of circumferential radius, the interior mass M(r) increases
exponentially with length scale l


M(r) \propto e^{\Delta r / l} .   (24.8)
It turns out that the inflationary length scale l is proportional to the accretion rate
l \propto \dot{M} ,   (24.9)

so that smaller accretion rates produce more violent inflation. Physically, the smaller the accretion rate, the
closer the streams must approach the inner horizon before the pressure of their counter-streaming begins to
dominate the gravitational force. The distance between the inner horizon and where mass inflation begins
effectively sets the length scale l of inflation.
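
As a rough numerical illustration of this scaling (a sketch only; the proportionality coefficient between l and the accretion rate is a hypothetical placeholder, since the text gives only the proportionality), the growth factor of the interior mass over a fixed interval Δr can be compared for two accretion rates:

```python
import math

def inflation_length(accretion_rate, coefficient=1.0):
    """Inflationary length scale l, taken proportional to the accretion rate.

    The coefficient is a hypothetical placeholder, not a value from the text.
    """
    return coefficient * accretion_rate

def mass_growth_factor(delta_r, length_scale):
    """Factor exp(delta_r / l) by which the interior mass grows over an interval delta_r."""
    return math.exp(delta_r / length_scale)

if __name__ == "__main__":
    delta_r = 1e-3
    for rate in (1e-3, 5e-4, 2.5e-4):
        growth = mass_growth_factor(delta_r, inflation_length(rate))
        print(f"accretion rate {rate:.2e}: growth factor {growth:.3e}")
```

Halving the accretion rate halves l and therefore squares the growth factor over the same Δr, which is the sense in which a smaller perturbation gives a more violent instability.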

Given this feature of mass inflation, that the tinier the perturbation the more rapid the growth, it seems
almost inevitable that mass inflation must occur inside real black holes. Even the tiniest piece of stuff going
the wrong way is apparently enough to trigger the mass inflation instability.

One way to avoid mass inflation inside a real black hole is to have a large level of dissipation inside the 
black hole, sufficient to reduce the charge (or spin) to zero near the singularity. In that case the central 
singularity reverts to being spacelike, like the Schwarzschild singularity. While the electrical conductivity of 
a realistic plasma is more than adequate to neutralize a charged black hole, angular momentum transport 
is intrinsically a much weaker process, and it is not clear whether the dissipation of angular momentum 
might be large enough to eliminate the spin near the singularity of a rotating black hole. There has been no 
research on the latter subject. 


24.7 The black hole collider 


A good way to think conceptually about mass inflation is that it acts like a particle accelerator. The counter- 
streaming pressure accelerates outgoing and ingoing streams through each other at an exponential rate, so 
that a Lagrangian gas element spends equal amounts of proper time accelerating through equal decades of 
counter-streaming velocity. The centre of mass energy easily exceeds the Planck energy. 

Mass inflation is expected to occur just above the inner horizon of a black hole. In a realistic rotating 
astronomical black hole, the inner horizon is likely to be at a considerable fraction of the radius of the 
outer horizon. Thus the black hole accelerator operates not near a central singularity, but rather at a 
macroscopically huge scale. This machine is truly monstrous. 

Undoubtedly much fascinating physics occurs in the black hole collider. The situation is far more extreme 
than anywhere else in our Universe today. Who knows what Nature does there? To my knowledge, there has 
been no research on the subject. 


Concept question 24.1. Which Einstein equations are redundant? RE-ASK THIS IN CONTEXT 
OF SPHERICAL MODEL. If 4 of the 10 Einstein equations are redundant (after consistent initial conditions 
are imposed) because of energy-momentum conservation, can any 4 be dropped, or just the 4 with one 
component the time component? 


Exercise 24.2. Can accretion fuel outgoing and ingoing streams at the inner horizon? The 
inflationary instability is driven by outgoing and ingoing streams at the inner horizon. 

1. What are the conditions for collisionless particles accreting from outside the outer horizon to be outgoing 
or ingoing at the inner horizon of a Kerr black hole? 

2. Of particular relevance in astrophysics are collisionless particles that start at effectively infinite radius r,
whether massless (Cosmic Microwave Background photons) or massive (non-baryonic cold dark matter
particles). Calculate the maximum latitude to which particles falling from infinite radius can reach and
be either outgoing or ingoing at the inner horizon.

3. What happens at the poles on the inner horizon?


Figure 24.1 Massless (solid) or massive (dashed) collisionless particles that fall from infinite radius can reach the
inner horizon and be either outgoing or ingoing only up to a certain maximum latitude on the inner horizon, shown
here. At higher latitudes on the inner horizon, particles that free fall from infinity are necessarily ingoing at the inner
horizon. The maximum accessible latitude depends on the spin of the black hole. The maximum latitude varies from
90° (all latitudes are accessible) for a Schwarzschild (non-spinning) black hole, to asin √(−3 + 2√3) = 42.9° (m = 0)
or asin √(1/3) = 35.3° (m = E), arrowed, for an extremal Kerr black hole.

Solution. 

1. Particles between the outer and inner horizons are outgoing or ingoing as the Hamilton-Jacobi time
parameter P_t, equation (23.5a), is positive or negative, §23.5. Particles that fall through the outer
horizon are necessarily ingoing at the outer horizon, requiring P_t < 0 at the outer horizon. However,
particles with sufficiently positive angular momentum L can turn around and become outgoing at the
inner horizon. The division between outgoing and ingoing at the inner horizon r = r_− occurs when P_t is
zero at the inner horizon. For a Kerr black hole, where P_t = −E + L ω_x, particles accreted from outside
the outer horizon are outgoing or ingoing at the inner horizon as (note that ω_x^− > ω_x^+ > 0)


L \omega_x^- > E > L \omega_x^+ \quad \mbox{outgoing} ,
E > \max ( L \omega_x^+ , L \omega_x^- ) \quad \mbox{ingoing} .   (24.10)

2. To reach a given latitude θ at the inner horizon, P_y must be positive, which imposes a lower limit on
the Carter constant K,


K \geq \frac{P_\phi^2}{\Delta_y} + m^2 \rho_y^2 .   (24.11)


Particles cannot turn around in radius between the outer and inner horizons. Outside the outer horizon,
a particle can reach a radius r as long as P_x is positive, which imposes an upper limit on the Carter
constant K,
K \leq \frac{P_t^2}{\Delta_x} - m^2 \rho_x^2 .   (24.12)
A necessary condition for a trajectory to extend to infinite radius is that the particle energy exceed its
rest mass, E > m. Given that condition, the right hand side of equation (24.12) tends to ∞ at r → r_+
and at r → ∞, and is a minimum at some radius in between. The condition that a trajectory can start
at infinite radius and reach a given latitude inside the outer horizon is
\min \left( \frac{P_t^2}{\Delta_x} - m^2 \rho_x^2 \right) \geq \frac{P_\phi^2}{\Delta_y} + m^2 \rho_y^2 ,   (24.13)
where the minimum on the left hand side is over radius r from r_+ to ∞. The condition (24.13) along
with the condition that P_t = 0 at the inner horizon translates into a condition on the maximum latitude
at any given spin parameter a, illustrated in Figure 24.1.
3. Poles occur where Δ_y = 0. A particle can reach a pole only if P_y = P_φ = 0 there, equation (23.20). This
requires that L = 0 and


K \geq m^2 \rho_y^2 .   (24.14)


Since L = 0, the time Hamilton-Jacobi parameter P_t defined by equation (23.5a) is a constant, P_t =
π_t = −E. Since the sign of P_t between the horizons determines whether the particle is outgoing (P_t > 0)
or ingoing (P_t < 0), and since a particle falling through the outer horizon is necessarily ingoing, it
follows that a particle that falls from outside the outer horizon to a pole on the inner horizon must
remain ingoing. The limiting case is for a massless particle, m = 0, falling along the principal ingoing 


null direction along the polar axis. This polar null geodesic has K P; E L 0. However, 
L/E = wy > 0 on the polar ingoing null geodesic, equation (23.26), which is on the ingoing side of the 
outgoing/ingoing divide (24.10). Thus there are no geodesics that fall from outside the outer horizon 


and are outgoing when they reach a pole on the inner horizon. On the other hand it is possible for 
ingoing photons to scatter off gas or dust inside the outer horizon and thereby become outgoing when 
they reach the inner horizon, at any latitude. 
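
The outgoing/ingoing division of part 1 of the solution, and the limiting latitudes quoted in the caption of Figure 24.1, can be sketched numerically. The code below is an illustration under stated assumptions, not the author's calculation: it takes the Kerr form ω_x = a/(r² + a²) for the radial angular velocity (an assumption, consistent with ω_x being larger at the inner horizon than at the outer), uses P_t = −E + L ω_x at each horizon, and simply evaluates the two closed-form angles from the caption. All function names are hypothetical.

```python
import math

def horizon_radii(a, M=1.0):
    """Outer and inner horizon radii r_+ and r_- of a Kerr black hole (G = c = 1)."""
    root = math.sqrt(M * M - a * a)
    return M + root, M - root

def omega_x(r, a):
    """Assumed Kerr form omega_x = a / (r^2 + a^2); requires a != 0 when r = 0."""
    return a / (r * r + a * a)

def classify_at_inner_horizon(E, L, a, M=1.0):
    """Apply the sign of P_t = -E + L*omega_x at the outer and inner horizons."""
    r_plus, r_minus = horizon_radii(a, M)
    w_plus, w_minus = omega_x(r_plus, a), omega_x(r_minus, a)
    if E <= L * w_plus:
        return "not ingoing at outer horizon"   # P_t >= 0 there: cannot fall in
    if L * w_minus > E:
        return "outgoing"                       # P_t > 0 at the inner horizon
    if E > L * w_minus:
        return "ingoing"                        # P_t < 0 at the inner horizon
    return "marginal"                           # P_t = 0 at the inner horizon

def extremal_max_latitude_deg(massless):
    """Limiting latitudes quoted in the caption of Figure 24.1 for extremal Kerr."""
    sin2 = -3.0 + 2.0 * math.sqrt(3.0) if massless else 1.0 / 3.0
    return math.degrees(math.asin(math.sqrt(sin2)))

if __name__ == "__main__":
    for L in (0.0, 2.0, 3.0):
        print(L, classify_at_inner_horizon(E=1.0, L=L, a=0.96))
    print(extremal_max_latitude_deg(True), extremal_max_latitude_deg(False))
```

For a = 0.96M the assumed ω_x gives ω_x^+ = 0.375 and ω_x^− ≈ 0.667 (in units of 1/M), so a particle with E = 1 is outgoing at the inner horizon only for 1.5 < L < 2.67 or so; the sketch does not check the additional Carter-constant conditions of part 2.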


Exercise 24.3. Inflationary Kasner solution. The inflationary and collapse stages of inflation can be
approximated by a Kasner line-element (17.133) with two scale factors equal, a_2 = a_3,


ds^2 = - dt^2 + a_1^2 \, dx_1^2 + a_2^2 \, ( dx_2^2 + dx_3^2 ) .   (24.15)


All scale factors are functions a_a(t) only of time t. The tetrad-frame inflationary energy-momentum is
diagonal with T_00 = T_11 and T_22 = T_33 = 0. The goal is to find scale factors a_1 and a_2 that yield such.
1. Show that the tetrad-frame Einstein tensor that follows from the Kasner line-element (24.15) is diagonal
with G_22 = G_33.
2. Define the time T(t) (not to be confused with energy-momenta T_mn) by


dt = - a^3 \, d\ln|T| .   (24.16)


In the inflationary context, T is negative, varying from −∞ in the distant past to −0 at the singularity.
The minus sign in equation (24.16) ensures that t increases as T increases. Show that the condition
G_00 = G_11 requires that a_2 be proportional to some power of |T|,


a_2 \propto |T|^b ,   (24.17)


with b some arbitrary constant.
3. Show that the condition G_22 = 0 implies that


a_1 \propto |T|^{-b/2} \exp\left( c \, |T|^{2b} \right) ,   (24.18)


with c some arbitrary constant.
4. Without loss of generality scale the time T so that b = 1/2 and c = 1. With a convenient scaling of the
coordinates x_a, the scale factors a_a are


a_1 = |T|^{-1/4} e^{|T|} , \quad a_2 = |T|^{1/2} , \quad a^3 \equiv a_1 a_2^2 = |T|^{3/4} e^{|T|} .   (24.19)
There is a BKL bounce where a_1 goes through its minimum value at T = −1/4. Show that
G_{00} = G_{11} = \frac{1}{a_1^2 a_2^2} = \frac{e^{-2|T|}}{|T|^{1/2}} .   (24.20)


Show that the only non-vanishing component of the tetrad-frame Weyl tensor is its spin 0 part, 


1 e2IT| 
C= -5 = SITPA (24.21) 
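5. Define the Kasner coefficients q_a by
q_a \equiv \frac{d \ln a_a}{d \ln a^3} ,   (24.22)
which is defined so that Σ_a q_a = 1. Show that
q_1 = \frac{|T| - \frac{1}{4}}{|T| + \frac{3}{4}} , \quad q_2 = q_3 = \frac{\frac{1}{2}}{|T| + \frac{3}{4}} ,   (24.23)
with asymptotic behaviour
\{ q_1 , q_2 \} \rightarrow \{ 1 , 0 \} \ \mbox{as} \ T \rightarrow -\infty , \quad \{ q_1 , q_2 \} \rightarrow \{ -\tfrac{1}{3} , \tfrac{2}{3} \} \ \mbox{as} \ T \rightarrow -0 .   (24.24)
Conclude that


\sum_a q_a^2 = 1 - \frac{2 |T|}{( |T| + \frac{3}{4} )^2} .   (24.25)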

6. Geodesics follow from the 3 integrals of motion p_a associated with spatial homogeneity, and 1 integral of
motion p^m p_m = −m² associated with conservation of rest mass m. Show that the tetrad-frame Einstein
tensor can be realised by the sum of energy-momenta of two collisionless streams of massless particles,
one outgoing (+) and the other ingoing (−)


G_{mn} = 8 \pi \left( T^+_{mn} + T^-_{mn} \right) , \quad T^\pm_{mn} = N p^\pm_m p^\pm_n ,   (24.26)
with tetrad-frame momenta
p_\pm^m = \frac{1}{a_1} \{ 1 , \pm 1 , 0 , 0 \} ,   (24.27)


and tetrad-frame number densities n_±^m = N p_±^m where N is the scalar density
N = \frac{1}{16 \pi a_2^2} .   (24.28)


The momenta satisfy the geodesic equation p_±^n D_n p_±^m = 0, and the number densities satisfy number
conservation D_m n_±^m = 0.


7. The tetrad-frame 4-momentum along a geodesic of a particle of mass m is


m a p, Pa 


With respect to coordinates z” = {T, £a}, the coordinate 4-momentum along a geodesic is 


dx" dT dza IT| pe Pa 
=< = pnl = Hp™ — Q | 2 , y 
= { athe Em!p [5 By: m?, (24.30) 


Draw null geodesics to see what the scene looks like to an observer at rest in the tetrad frame. 


8. Show that the ratio of emitted to observed tetrad-frame frequencies ω = p^0 for an observer at rest at
time T watching a distant emitter at rest at time T = T_∞ → −∞ in a direction angled θ away from the
1-axis (x_1-axis) is


Wem w(Too) 3 
= LIT 0. 24.31 
e w(T) > /T/T> sin (24.31) 


The proper time experienced by the rest observer is 7 = t. Conclude that the acceleration of the distant 
emitter perceived by the rest observer is 


dr 2a 
Conclude that the acceleration diverges at the singularity as 


d1n(wem/Wobs 1 -3/467 
pae N(Wem/Wobs) _ = -4T 3/4eTl , (24.32) 


k x -|T| 73/4 x -|r| 7} asr 0. (24.33) 
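

The closed-form results of Exercise 24.3 can be spot-checked numerically. The sketch below is an illustration, not part of the original exercise: it assumes the scale factors a_1 = |T|^{−1/4} e^{|T|}, a_2 = a_3 = |T|^{1/2} with b = 1/2, c = 1, the time defined by dt = −a³ d ln|T| with a³ = a_1 a_2², and the Bianchi-I expression G_00 = 2 H_1 H_2 + H_2² with H_a = d ln a_a/dt. Derivatives with respect to ln|T| are taken by central differences; the function names are hypothetical.

```python
import math

def log_scale_factors(tau):
    """ln a_1 and ln a_2 as functions of tau = ln|T| (assumed solution, b = 1/2, c = 1)."""
    x = math.exp(tau)                      # x = |T|
    return -0.25 * tau + x, 0.5 * tau

def numeric_kasner(T, h=1e-5):
    """Return (q1, q2, G00) at time T < 0 from finite differences in ln|T|."""
    tau = math.log(abs(T))
    l1p, l2p = log_scale_factors(tau + h)
    l1m, l2m = log_scale_factors(tau - h)
    d1 = (l1p - l1m) / (2.0 * h)           # d ln a_1 / d ln|T|
    d2 = (l2p - l2m) / (2.0 * h)           # d ln a_2 / d ln|T|
    dvol = d1 + 2.0 * d2                   # d ln a^3 / d ln|T|
    l1, l2 = log_scale_factors(tau)
    a6 = math.exp(2.0 * (l1 + 2.0 * l2))   # (a^3)^2
    # H_a = -(d ln a_a / d ln|T|) / a^3, so products of two H's carry 1/a^6.
    g00 = (2.0 * d1 * d2 + d2 * d2) / a6
    return d1 / dvol, d2 / dvol, g00

def closed_forms(T):
    """Closed-form (q1, q2, G00) for the same solution."""
    x = abs(T)
    q1 = (x - 0.25) / (x + 0.75)
    q2 = 0.5 / (x + 0.75)
    return q1, q2, math.exp(-2.0 * x) / math.sqrt(x)

if __name__ == "__main__":
    for T in (-5.0, -1.0, -0.25, -0.01):
        print(T, numeric_kasner(T), closed_forms(T))
```

At T = −1/4 the coefficient q_1 passes through zero, which is the bounce where a_1 is a minimum; for T → −∞ the coefficients approach {1, 0, 0} and for T → −0 they approach {−1/3, 2/3, 2/3}.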
Black hole thermodynamics


For an ideal Λ-Kerr-Newman black hole, variations of the black hole’s mass M, electric charge Q, angular
momentum J = aM, and of the cosmological constant Λ = 8πG ρ_Λ are related to variations of the area
A = 4πR² = 4π(r² + a²) of the horizon by


dM = \frac{\kappa}{8\pi} \, dA + \Phi \, dQ + \omega \, dJ - V \, d\rho_\Lambda ,   (25.1)


where κ is the acceleration, Φ the electric potential, ω the angular velocity, and V the enclosed volume at
the horizon,


r-M 2Ar 

K = R2 — 3" 3 (25.2a 

b= 7 , (25.2b 
a 

w= (25.2c 
4 2 

V= zT" R . (25.2d 


Equations (25.1) and (25.2) hold at any horizon, wherever the horizon function Δ_x vanishes, including at a
cosmological horizon, which exists if the vacuum energy ρ_Λ is positive. The acceleration κ satisfies


k = —— . (25.3) 


The acceleration κ vanishes when the horizon is extremal, that is, where two horizons merge into one, which
happens when the horizon function Δ_x is not only zero but also an extremum.
Equation (25.1) can be recast as
d(M + \rho_\Lambda V) = \frac{\kappa}{8\pi} \, dA + \Phi \, dQ + \omega \, dJ - p_\Lambda \, dV ,   (25.4)
in which p_Λ = −ρ_Λ is the vacuum pressure, and the energy within the horizon is taken to be the energy
M + ρ_Λ V including the contribution from vacuum energy.
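

The first law can be checked numerically in the simplest case. The sketch below is an illustration restricted to vanishing cosmological constant, and it assumes the standard Kerr-Newman horizon quantities κ = (r − M)/R², Φ = Qr/R², ω = a/R², with R² = r² + a² and r the outer horizon radius; these are the textbook Λ = 0 expressions, used here as assumptions rather than quoted from equations (25.2). It compares dM against (κ/8π) dA + Φ dQ + ω dJ for a small variation.

```python
import math

def horizon(M, Q, J):
    """Outer-horizon quantities of a Kerr-Newman black hole with Lambda = 0 (G = c = 1)."""
    a = J / M
    r = M + math.sqrt(M * M - a * a - Q * Q)
    R2 = r * r + a * a
    return {
        "area": 4.0 * math.pi * R2,
        "kappa": (r - M) / R2,     # acceleration (surface gravity)
        "Phi": Q * r / R2,         # electric potential
        "omega": a / R2,           # angular velocity
    }

def first_law_residual(M, Q, J, dM, dQ, dJ):
    """dM - [kappa/(8 pi) dA + Phi dQ + omega dJ], with dA from a central difference."""
    h = horizon(M, Q, J)
    area_plus = horizon(M + 0.5 * dM, Q + 0.5 * dQ, J + 0.5 * dJ)["area"]
    area_minus = horizon(M - 0.5 * dM, Q - 0.5 * dQ, J - 0.5 * dJ)["area"]
    dA = area_plus - area_minus
    return dM - (h["kappa"] / (8.0 * math.pi) * dA + h["Phi"] * dQ + h["omega"] * dJ)

if __name__ == "__main__":
    print(first_law_residual(1.0, 0.3, 0.5, 1e-4, -2e-4, 3e-4))
```

The residual is of third order in the variations, so it should be many orders of magnitude smaller than the individual terms.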
Exercise 25.1. Entropy in Hawking radiation. Compare the entropy emitted in Hawking radiation to
the Bekenstein-Hawking entropy lost by a black hole when it emits a certain energy dE in Hawking radiation.
Assume for simplicity that the emitted radiation carries no charge or angular momentum.

Solution. The entropy S of an ideal gas in thermodynamic equilibrium with zero chemical potential is
related to its energy E by, equation (30.18),


\frac{T \, dS}{dE} = 1 + \frac{p}{\rho} .   (25.5)


For relativistic radiation, p/ρ = 1/3, equation (25.5) becomes


\frac{T \, dS}{dE} = \frac{4}{3} .   (25.6)

If it is assumed that the Hawking radiation carries no charge or angular momentum, then the
Bekenstein-Hawking entropy lost by the black hole is


\frac{T \, dS}{dE} = 1 .   (25.7)


Thus the entropy emitted in Hawking radiation exceeds the Bekenstein-Hawking entropy lost by the black
hole by a factor 4/3. A more careful treatment gives a slightly different result (Zurek, 1982; Page, 1983).


Exercise 25.2. Area of the horizon. What is the area of the horizon of a stationary black hole?
Solution. The 2-dimensional angular line-element of the separable line-element (22.1) is


ds^2 = \rho^2 \left[ \frac{dy^2}{\Delta_y} + \frac{\Delta_y - \omega_y^2 \Delta_x}{(1 - \omega_x \omega_y)^2} \, d\phi^2 \right] .   (25.8)


The angular line-element is diagonal, with proper distances in the two orthogonal y and φ directions
\frac{\rho \, dy}{\sqrt{\Delta_y}} , \quad \frac{\rho \sqrt{\Delta_y - \omega_y^2 \Delta_x} \, d\phi}{1 - \omega_x \omega_y} .   (25.9)


The area of the angular y–φ surface at fixed radius x and time t is obtained by integrating the product of
the proper distances over the surface,


A = \int \frac{\rho^2 \sqrt{1 - \omega_y^2 \Delta_x / \Delta_y}}{1 - \omega_x \omega_y} \, dy \, d\phi .   (25.10)


Horizons occur where the horizon function vanishes, Δ_x = 0, in which case the area simplifies to
2 
A= l P dyd¢ 
1—wywy 
1 
= 20 / dy 
(fo + fiwe)(fi + fowy) 


z Qn / dwy 
fot fiwe J 24/(fi + fowy)? (gı — gowy) 


2T gı — Gowy 
fit fowy 


(fog + figo)(fo + fiwe) 




(25.11) 


The second line of equations invokes equation (22.39a), while the third line uses equation (22.44b) to
transform the integral over y to an integral over ω_y. The constants are given by equation (22.72) for
Λ-Kerr-Newman, or equation (22.80) for Taub-NUT. The integration over y is from −1 to 1, north to south pole. For
Λ-Kerr-Newman, ω_y = 0 at both poles, but for Taub-NUT, ω_y = 2N.(c. + 1) at the poles, equation (22.83).


In either case, for both Λ-Kerr-Newman and Taub-NUT, the area of the horizon is
A = 4 \pi R^2 ,   (25.12)


where R is given by equation (22.7) for Λ-Kerr-Newman, and equations (22.83) for Taub-NUT.
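

For the Kerr case the result A = 4πR² can be confirmed by direct numerical integration. The sketch below is an illustration only: it uses the standard Boyer-Lindquist angular metric components of the Kerr geometry on the horizon (where the horizon function vanishes), namely g_θθ = ρ² and g_φφ = (r² + a²)² sin²θ/ρ² with ρ² = r² + a² cos²θ, rather than the separable coordinates x, y of the text. The function names are hypothetical.

```python
import math

def kerr_outer_horizon_radius(a, M=1.0):
    """Outer horizon radius r_+ = M + sqrt(M^2 - a^2) of a Kerr black hole."""
    return M + math.sqrt(M * M - a * a)

def kerr_horizon_area_numeric(a, M=1.0, n=2000):
    """Integrate sqrt(g_thth * g_phph) over the horizon 2-surface with the midpoint rule."""
    r = kerr_outer_horizon_radius(a, M)
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        rho2 = r * r + (a * math.cos(theta)) ** 2
        g_thth = rho2
        # On the horizon the horizon function vanishes, leaving this simple form.
        g_phph = (r * r + a * a) ** 2 * math.sin(theta) ** 2 / rho2
        total += math.sqrt(g_thth * g_phph) * dtheta
    return 2.0 * math.pi * total   # the azimuthal integral contributes 2*pi

def kerr_horizon_area_closed_form(a, M=1.0):
    """A = 4 pi R^2 with R^2 = r_+^2 + a^2."""
    r = kerr_outer_horizon_radius(a, M)
    return 4.0 * math.pi * (r * r + a * a)

if __name__ == "__main__":
    for a in (0.0, 0.5, 0.96):
        print(a, kerr_horizon_area_numeric(a), kerr_horizon_area_closed_form(a))
```

The factors of ρ² cancel in the product of proper distances, which is why the area depends on the geometry only through R² = r² + a².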


Concept Questions


1. Why do general relativistic perturbation theory using the tetrad formalism as opposed to the coordinate
approach?

2. Why is the tetrad metric γ_mn assumed fixed in the presence of perturbations?

3. Are the tetrad axes γ_m fixed under a perturbation?

4. Is it true that the tetrad components φ_mn of a perturbation are (anti-)symmetric in m ↔ n if and only
if its coordinate components φ_μν are (anti-)symmetric in μ ↔ ν?

5. Does an unperturbed quantity, such as the unperturbed metric g_μν, change under an infinitesimal
coordinate gauge transformation?

6. How can the vierbein perturbation φ_mn be considered a tetrad tensor field if it changes under an
infinitesimal coordinate gauge transformation?

7. What properties of the unperturbed spacetime allow decomposition of perturbations into independently
evolving Fourier modes?

8. What properties of the unperturbed spacetime allow decomposition of perturbations into independently
evolving scalar, vector, and tensor modes?

9. In what sense do scalar, vector, and tensor modes have spin 0, 1, and 2 respectively?

10. Tensor modes represent gravitational waves that, in vacuo, propagate at the speed of light. Do scalar
and vector modes also propagate at the speed of light in vacuo? If so, do scalar and vector modes also
constitute gravitational waves?

11. If scalar, vector, and tensor modes evolve independently, does that mean that scalar modes can exist
and evolve in the complete absence of tensor modes? If so, does it mean that scalar modes can propagate
causally, in vacuo at the speed of light, without any tensor modes being present?

12. Equation (27.77) defines the mass M of a body as what a distant observer would measure from its
gravitational potential. Similarly equation (27.85) defines the angular momentum L of a body as what a
distant observer would measure from the dragging of inertial frames. In what sense are these definitions
legitimate?

13. Can an observer far from a body detect the difference between the scalar potentials Ψ and Φ produced
by the body?
14. If a gravitational wave is a wave of spacetime itself, distorting the very rulers and clocks that measure
spacetime, how is it possible to measure gravitational waves at all?

15. If gravitational waves carry energy-momentum, then can gravitational waves be present in a region of 
spacetime with vanishing energy-momentum tensor, Tmn = 0? 

16. Have gravitational waves been detected? 


What’s important?


1. Getting your brain around coordinate and tetrad gauge transformations.

2. A central aim of general relativistic perturbation theory is to identify the coordinate and tetrad
gauge-invariant perturbations, since only these have physical meaning.

3. A second central aim is to classify perturbations into independently evolving modes, to the extent that
this is possible.

4. In background spacetimes with spatial translation and rotation symmetry, which includes Minkowski
space and the Friedmann-Lemaître-Robertson-Walker metric of cosmology, modes decompose into
independently evolving scalar (spin 0), vector (spin 1), and tensor (spin 2) modes. In background spacetimes
without spatial translation and rotation symmetry, such as black holes, scalar, vector, and tensor modes
scatter off the curvature of space, and therefore mix with each other.

5. In background spacetimes with spatial translation and rotation symmetry, there are 6 algebraic
combinations of metric coefficients that are coordinate and tetrad gauge-invariant, and therefore represent
physical perturbations. There are 2 scalar modes, 2 vector modes, and 2 tensor modes. A spin m mode
varies as e^{−imχ} where χ is the rotational angle about the spatial wavevector k of the mode.

6. In background spacetimes without spatial translation and rotation symmetry, the coordinate and tetrad
gauge-invariant perturbations are not algebraic combinations of the metric coefficients, but rather
combinations that involve first and second derivatives of the metric coefficients. Gravitational waves are
described by the Weyl tensor, which can be decomposed into 5 complex components, with spin 0, ±1,
and ±2. The spin ±2 components describe propagating gravitational waves, while the spin 0 and spin ±1
components describe the non-propagating gravitational field near a source.

7. The preeminent application of general relativistic perturbation theory is to cosmology. Coupled with
physics that is either well understood (such as photon-electron scattering) or straightforward to model
even without a deep understanding (such as the dynamical behaviour of non-baryonic dark matter and
dark energy), the theory has yielded predictions that are in spectacular agreement with observations
of fluctuations in the CMB and in the large scale distribution of galaxies and other tracers of the
distribution of matter in the Universe.
26


Perturbations and gauge transformations


This Chapter sets up the basic equations that define perturbations to an arbitrary spacetime in the tetrad 
formalism of general relativity, and it examines the effect of tetrad and coordinate gauge transformations 
on those perturbations. The perturbations are supposed to be small, in the sense that quantities quadratic 
in the perturbations can be neglected. The formalism set up in this Chapter provides a foundation used in 
subsequent Chapters. 


26.1 Notation for perturbations 


A 0 (zero) overscript signifies an unperturbed quantity, while a 1 (one) overscript signifies a perturbation.
No overscript means the full quantity, including both unperturbed and perturbed parts. An overscript is 
attached only where necessary. Thus if the unperturbed part of a quantity is zero, then no overscript is 
needed, and none is attached. 

The vierbein of the unperturbed background is $\overset{0}{e}{}^m{}_\mu$. In this and the next several sections up to and
including §26.7, the unperturbed vierbein $\overset{0}{e}{}^m{}_\mu$ is an arbitrary differentiable function of arbitrary
coordinates $x^\mu$.


26.2 Vierbein perturbation 


Let the vierbein perturbation $\varphi_{mn}$ be defined so that the perturbed inverse vierbein is

$$ e_m{}^\mu = (\delta_m^n + \varphi_m{}^n) \, \overset{0}{e}_n{}^\mu , \tag{26.1} $$

with corresponding perturbed vierbein

$$ e^m{}_\mu = (\delta^m_n - \varphi_n{}^m) \, \overset{0}{e}{}^n{}_\mu . \tag{26.2} $$

Since the perturbation $\varphi_m{}^n$ is already of linear order, to linear order its indices can be raised and lowered
with the unperturbed metric, and transformed between tetrad and coordinate frames with the unperturbed
vierbein. In practice it proves convenient to work with the covariant tetrad-frame components $\varphi_{mn}$ of the
vierbein perturbation

$$ \varphi_{mn} = \gamma_{nl} \, \varphi_m{}^l . \tag{26.3} $$

In terms of the covariant perturbation $\varphi_{mn}$, the perturbed inverse vierbein (26.1) is

$$ e_m{}^\mu = (\gamma_{mn} + \varphi_{mn}) \, \overset{0}{e}{}^{n\mu} . \tag{26.4} $$

The perturbation $\varphi_{mn}$ can be regarded as a tetrad tensor field defined on the unperturbed background.


26.3 Gauge transformations 


The vierbein perturbation $\varphi_{mn}$ has 16 degrees of freedom, but only 6 of these degrees of freedom correspond
to real physical perturbations, since 6 degrees of freedom are associated with arbitrary infinitesimal changes 
in the choice of tetrad, which is to say arbitrary infinitesimal Lorentz transformations, and a further 4 degrees 
of freedom are associated with arbitrary infinitesimal changes in the coordinates. 

In the context of perturbation theory, these infinitesimal tetrad and coordinate transformations are called 
gauge transformations. Real physical perturbations are perturbations that are gauge-invariant under 
both tetrad and coordinate gauge transformations. 


26.4 Tetrad metric assumed constant 


In the tetrad formalism, tetrad axes $\boldsymbol{\gamma}_m$ are introduced as locally inertial (or other physically motivated)
axes attached to an observer. The axes enable quantities to be projected into the frame of the observer. 
In a spacetime buffeted by perturbations, it is natural for an observer to cling to the rock provided by the 
locally inertial (or other) axes, as opposed to allowing the axes to bend with the wind. For example, when 
a gravitational wave goes by, the tidal compression and rarefaction causes the proper distance between two 
freely falling test masses to oscillate, Fig. 27.1. It is natural to choose the tetrad so that it continues to 
measure proper times and distances in the perturbed spacetime. 

In the treatment of general relativistic perturbation theory in this book, the tetrad metric is taken to be 
constant everywhere, and unchanged by a perturbation 


$$ \gamma_{mn} = \overset{0}{\gamma}_{mn} = \mbox{constant} . \tag{26.5} $$


For example, if the tetrad is orthonormal, then the tetrad metric is constant, the Minkowski metric $\eta_{mn}$.
However, the tetrad could also be some other tetrad for which the tetrad metric is constant, such as a spin 
tetrad (§38.1), or a Newman-Penrose tetrad (§39.1.1). 
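

26.5 Perturbed coordinate metric


The perturbed coordinate metric is

$$ \begin{aligned} g_{\mu\nu} &= \gamma_{mn} \, e^m{}_\mu \, e^n{}_\nu \\ &= \gamma_{mn} \, (\delta^m_k - \varphi_k{}^m) \, \overset{0}{e}{}^k{}_\mu \, (\delta^n_l - \varphi_l{}^n) \, \overset{0}{e}{}^l{}_\nu \\ &= \overset{0}{g}_{\mu\nu} - (\varphi_{\mu\nu} + \varphi_{\nu\mu}) . \end{aligned} \tag{26.6} $$

Thus the perturbation of the coordinate metric depends only on the symmetric part of the vierbein
perturbation $\varphi_{mn}$, not the antisymmetric part

$$ \overset{1}{g}_{\mu\nu} = - (\varphi_{\mu\nu} + \varphi_{\nu\mu}) . \tag{26.7} $$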


26.6 Tetrad gauge transformations 


Under an infinitesimal tetrad (Lorentz) transformation, the covariant vierbein perturbations $\varphi_{mn}$ transform
as

$$ \varphi_{mn} \rightarrow \varphi'_{mn} = \varphi_{mn} + \epsilon_{mn} , \tag{26.8} $$

where $\epsilon_{mn}$ is the generator of a Lorentz transformation, which is to say an arbitrary antisymmetric tensor
(Exercise 11.2). Thus the antisymmetric part $\varphi_{mn} - \varphi_{nm}$ of the covariant perturbation $\varphi_{mn}$ is arbitrarily
adjustable through an infinitesimal tetrad transformation, while the symmetric part $\varphi_{mn} + \varphi_{nm}$ is tetrad
gauge-invariant.

It is easy to see when a quantity is tetrad gauge-invariant: it is tetrad gauge-invariant if and only if it
depends only on the symmetric part of the vierbein perturbation, not on the antisymmetric part. Evidently
the perturbation (26.7) to the coordinate metric $g_{\mu\nu}$ is tetrad gauge-invariant. This is as it should be, since
the coordinate metric $g_{\mu\nu}$ is a coordinate-frame quantity, independent of the choice of tetrad frame.
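
The statements in equations (26.6) to (26.8) can be checked numerically. The following is a minimal sketch, not part of the formalism itself: it assumes a Minkowski background in Cartesian coordinates with an orthonormal tetrad, so that the unperturbed vierbein is the unit matrix, and the helper names (`metric_from_phi`, `lower`, `raise_second`, `max_abs_diff`) are illustrative. It builds the exact metric from a small random vierbein perturbation, compares it with the linear formula (26.7), and then confirms that adding an antisymmetric $\epsilon_{mn}$ changes the metric only at second order.

```python
import random

# Tetrad metric: Minkowski, signature -+++ (orthonormal tetrad assumed).
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
IDX = range(4)

def metric_from_phi(phi_mixed):
    """Exact g_{mu nu} = gamma_{mn} e^m_mu e^n_nu, with e^m_mu = delta^m_mu - phi_mu^m.

    phi_mixed[n][m] holds phi_n^m. The background vierbein is the unit matrix,
    an assumption made only for this illustration.
    """
    e = [[(1.0 if m == mu else 0.0) - phi_mixed[mu][m] for mu in IDX] for m in IDX]
    return [[sum(ETA[m][n] * e[m][mu] * e[n][nu] for m in IDX for n in IDX)
             for nu in IDX] for mu in IDX]

def lower(phi_mixed):
    """Covariant components phi_{mn} = gamma_{nl} phi_m^l, equation (26.3)."""
    return [[sum(ETA[n][l] * phi_mixed[m][l] for l in IDX) for n in IDX] for m in IDX]

def raise_second(t_cov):
    """Mixed components t_m^n = t_{ml} gamma^{ln}; ETA is its own inverse."""
    return [[sum(t_cov[m][l] * ETA[l][n] for l in IDX) for n in IDX] for m in IDX]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in IDX for j in IDX)

random.seed(0)
amp = 1e-4  # size of the perturbation; neglected terms are of order amp**2
phi = [[amp * random.uniform(-1, 1) for _ in IDX] for _ in IDX]

# Linear formula (26.7): g = eta - (phi_{mu nu} + phi_{nu mu}).
phi_cov = lower(phi)
g_exact = metric_from_phi(phi)
g_linear = [[ETA[i][j] - (phi_cov[i][j] + phi_cov[j][i]) for j in IDX] for i in IDX]
err_linear = max_abs_diff(g_exact, g_linear)

# Tetrad gauge transformation (26.8): add an antisymmetric epsilon_{mn}.
a = [[amp * random.uniform(-1, 1) for _ in IDX] for _ in IDX]
eps_cov = [[a[m][n] - a[n][m] for n in IDX] for m in IDX]
eps_mixed = raise_second(eps_cov)
phi_gauged = [[phi[m][n] + eps_mixed[m][n] for n in IDX] for m in IDX]
err_gauge = max_abs_diff(g_exact, metric_from_phi(phi_gauged))

print(err_linear, err_gauge)  # both are second order in amp
```

Both residuals are of order `amp**2`, far smaller than the perturbation itself, which is what tetrad gauge-invariance to linear order means here.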

If only tetrad gauge-invariant perturbations are physical, why not just discard tetrad perturbations (the
antisymmetric part of $\varphi_{mn}$) altogether, and work only with the tetrad gauge-invariant part (the symmetric
part of $\varphi_{mn}$)? The answer is that tetrad-frame quantities such as the tetrad-frame Einstein tensor do change
under tetrad gauge transformations (infinitesimal Lorentz transformations of the tetrad). It is true that
the only physical perturbations of the Einstein tensor are those combinations of it that are tetrad
gauge-invariant. But in order to identify these tetrad gauge-invariant combinations, it is necessary to carry through
the dependence on the non-tetrad-gauge-invariant part, the antisymmetric part of $\varphi_{mn}$.

Much of the professional literature on general relativistic perturbation theory works with the traditional 
coordinate formalism, as opposed to the tetrad formalism. The term “gauge-invariant” then means coordinate 
gauge-invariant, as opposed to both coordinate and tetrad gauge-invariant. This is fine as far as it goes: the 
coordinate approach is perfectly able to identify physical perturbations versus gauge perturbations. However, 
there still remains the problem of projecting the perturbations into the frame of an observer, so ultimately 
the issue of perturbations of the observer’s frame, tetrad perturbations, must be faced. 
Concept question 26.1. Non-infinitesimal tetrad transformations in perturbation theory? In
perturbation theory, can tetrad gauge transformations be non-infinitesimal?


26.7 Coordinate gauge transformations 
A coordinate gauge transformation is a transformation of the coordinates $x^\mu$ by an infinitesimal shift $\epsilon^\mu$

$$ x^\mu \rightarrow x'^\mu = x^\mu + \epsilon^\mu . \tag{26.9} $$

You should not think of this as shifting the underlying spacetime around; rather, it is just a change of the
coordinate system, which leaves the underlying spacetime unchanged. Because the shift $\epsilon^\mu$ is, like the vierbein
perturbations $\varphi_{mn}$, already of linear order, its indices can be raised and lowered with the unperturbed metric,
and transformed between coordinate and tetrad frames with the unperturbed vierbein. Thus the shift $\boldsymbol{\epsilon}$ can
be regarded as a vector field defined on the unperturbed background. The tetrad components $\epsilon^m$ of the shift
$\epsilon^\mu$ are

$$ \epsilon^m = \overset{0}{e}{}^m{}_\mu \, \epsilon^\mu . \tag{26.10} $$

Physically, the tetrad-frame shift $\epsilon^m$ is the shift measured in locally inertial coordinates $\xi^m$,

$$ \xi^m \rightarrow \xi'^m = \xi^m + \epsilon^m . \tag{26.11} $$


26.7.1 The change in any tensor under a coordinate transformation is minus its Lie 
derivative 


As discussed in §7.34, the change in any coordinate tensor $A^{\kappa\lambda\ldots}_{\mu\nu\ldots}(x)$ under a coordinate gauge
transformation (26.9) is minus its Lie derivative $\mathcal{L}_\epsilon$ with respect to the infinitesimal shift $\boldsymbol{\epsilon}$,

$$ A^{\kappa\lambda\ldots}_{\mu\nu\ldots}(x) \rightarrow A'^{\kappa\lambda\ldots}_{\mu\nu\ldots}(x) = A^{\kappa\lambda\ldots}_{\mu\nu\ldots}(x) - \mathcal{L}_\epsilon A^{\kappa\lambda\ldots}_{\mu\nu\ldots} . \tag{26.12} $$

The Lie derivative $\mathcal{L}_\epsilon A^{\kappa\lambda\ldots}_{\mu\nu\ldots}$ is given by formula (7.151). Under a coordinate gauge transformation (26.9), the
coordinate of a fixed physical position transforms from $x$ to $x'$. But in perturbation theory, quantities are
considered to be functions of coordinate position $x$, which does not remain at a fixed physical position under
a coordinate transformation. As discussed in §7.34, the Lie derivative is defined such that the transformed
tensor $A'^{\kappa\lambda\ldots}_{\mu\nu\ldots}(x)$ is evaluated at fixed coordinate position $x$, not at fixed physical position.


26.7.2 Coordinate gauge transformation of a tetrad tensor 


A tetrad-frame 4-vector $A^m$ is a coordinate-invariant quantity, and therefore acts like a coordinate scalar
under a coordinate gauge transformation (26.9). Thus a tetrad-frame 4-vector $A^m$ must be treated as a
coordinate scalar when its Lie derivative is taken. Under a coordinate gauge transformation (26.9), a
tetrad-frame 4-vector $A^m$ transforms as

$$ A^m(x) \rightarrow A'^m(x) = A^m(x) - \mathcal{L}_\epsilon A^m , \tag{26.13} $$

where the Lie derivative is, equation (7.133),

$$ \mathcal{L}_\epsilon A^m = \epsilon^\kappa \frac{\partial A^m}{\partial x^\kappa} \quad \mbox{not a tetrad tensor} . \tag{26.14} $$

The change $-\epsilon^\kappa \partial_\kappa A^m$ is a coordinate tensor (specifically, a coordinate scalar), but not a tetrad tensor.
More generally, a tetrad-frame tensor $A^{kl\ldots}_{mn\ldots}$ transforms under a coordinate gauge transformation (26.9)
as

$$ A^{kl\ldots}_{mn\ldots}(x) \rightarrow A'^{kl\ldots}_{mn\ldots}(x) = A^{kl\ldots}_{mn\ldots}(x) - \mathcal{L}_\epsilon A^{kl\ldots}_{mn\ldots} , \tag{26.15} $$

where the Lie derivative is

$$ \mathcal{L}_\epsilon A^{kl\ldots}_{mn\ldots} = \epsilon^\kappa \partial_\kappa A^{kl\ldots}_{mn\ldots} \quad \mbox{not a tetrad tensor} . \tag{26.16} $$

Again, the change $-\epsilon^\kappa \partial_\kappa A^{kl\ldots}_{mn\ldots}$ is a coordinate tensor (a coordinate scalar), but not a tetrad tensor.


Concept question 26.2. Should not the Lie derivative of a tetrad tensor be a tetrad tensor? 
The Lie derivative of a tetrad tensor, as defined in this book, is a coordinate tensor but not a tetrad tensor. 
Would it not be better to define the Lie derivative so it is a tetrad tensor as well as a coordinate tensor? 
Answer. In this book, the Lie derivative of any quantity is defined to be minus the variation of the quantity 
under a coordinate transformation. This definition is unambiguous; and it implies that the Lie derivative of 
a tetrad tensor is not a tetrad tensor. 


26.7.3 Coordinate gauge transformation of the vierbein 
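The inverse vierbein $e_m{}^\mu$ is a coordinate vector and a tetrad vector. It transforms under a coordinate gauge
transformation (26.9) as

$$ e_m{}^\mu(x) \rightarrow e'_m{}^\mu(x) = e_m{}^\mu(x) - \mathcal{L}_\epsilon e_m{}^\mu , \tag{26.17} $$

where the Lie derivative of the inverse vierbein is, equation (7.137),

$$ \begin{aligned} \mathcal{L}_\epsilon e_m{}^\mu &= - \, e_m{}^\kappa \frac{\partial \epsilon^\mu}{\partial x^\kappa} + \epsilon^\kappa \frac{\partial e_m{}^\mu}{\partial x^\kappa} \\ &= - \, \partial_m ( \epsilon^n e_n{}^\mu ) + \epsilon^k \partial_k e_m{}^\mu \\ &= e^{n\mu} \left( - \, \partial_m \epsilon_n - \epsilon^k d_{nkm} + \epsilon^k d_{nmk} \right) \\ &= - \, e^{n\mu} \left[ \partial_m \epsilon_n + ( \Gamma_{nkm} - \Gamma_{nmk} ) \epsilon^k \right] . \end{aligned} \tag{26.18} $$

On the third line the vierbein derivatives have been replaced by $d_{nkm}$ defined by equation (11.33), while on
the fourth line $\Gamma_{nkm}$ is the torsion-free tetrad-frame connection, defined in terms of the vierbein derivatives
by equation (11.54).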


26.7.4 Coordinate transformation of the vierbein perturbation


According to equation (26.1), the perturbation $\overset{1}{e}_m{}^\mu$ of the inverse vierbein may be expressed in terms of a
covariant vierbein perturbation field $\varphi_{mn}$,

$$ \overset{1}{e}_m{}^\mu = \varphi_{mn} \, \overset{0}{e}{}^{n\mu} . \tag{26.19} $$

By equation (26.17), the perturbation induced by a coordinate gauge transformation (26.9) equals minus the Lie
derivative given by equation (26.18), $\overset{1}{e}_m{}^\mu = - \mathcal{L}_\epsilon \overset{0}{e}_m{}^\mu$. Consequently the vierbein perturbation $\varphi_{mn}$
transforms under a coordinate gauge transformation (26.9) as

$$ \varphi_{mn} \rightarrow \varphi'_{mn} = \varphi_{mn} + \partial_m \epsilon_n + ( \overset{0}{\Gamma}_{nkm} - \overset{0}{\Gamma}_{nmk} ) \epsilon^k , \tag{26.20} $$

with $\overset{0}{\Gamma}_{nkm}$ the torsion-free tetrad-frame connection of the unperturbed background. This is the fundamental
formula that gives the effect of coordinate transformations on the vierbein perturbations in any background
spacetime.


Concept question 26.3. Variation of unperturbed quantities under coordinate gauge
transformations? How does an unperturbed quantity, such as the unperturbed coordinate metric $\overset{0}{g}_{\mu\nu}$, vary under
an infinitesimal coordinate gauge transformation? Answer. It doesn’t. The variation is considered to be
part of the perturbation.


26.8 Scalar, vector, tensor decomposition of perturbations 


In the particular case that the unperturbed spacetime is spatially homogeneous and isotropic, which includes 
not only Minkowski space but also the important case of the cosmological Friedmann-Lemaître-Robertson- 
Walker metric, perturbations decompose into independently evolving scalar (spin 0), vector (spin 1), and 
tensor (spin 2) modes. 

Similarly to Fourier decomposition, decomposition into scalar, vector, and tensor modes is non-local, in 
principle requiring knowledge of perturbation amplitudes simultaneously throughout all of space. In practical 
problems however, an adequate decomposition is possible as long as the scales probed are sufficiently larger 
than the wavelengths of the modes probed. Ultimately, the fact that an adequate decomposition is possible 
is a consequence of the fact that gravitational fluctuations in the real Universe appear to converge at the 
cosmological horizon, so that what happens locally is largely independent of what is happening far away. 


26.8.1 Decomposition of a vector in flat 3D space 


Theorem: In flat 3-dimensional space, a 3-vector field $\boldsymbol{w}(\boldsymbol{x})$ can be decomposed uniquely (subject to the
boundary condition that $\boldsymbol{w}$ vanishes sufficiently rapidly at infinity) into a sum of scalar and vector parts

$$ \boldsymbol{w} = \underbrace{\nabla w_\parallel}_{\rm scalar} + \underbrace{\boldsymbol{w}_\perp}_{\rm vector} . \tag{26.21} $$

In this context, the term vector signifies a 3-vector $\boldsymbol{w}_\perp$ that is transverse, that is to say, it has vanishing
divergence,

$$ \nabla \cdot \boldsymbol{w}_\perp = 0 . \tag{26.22} $$

Here $\nabla \equiv \partial / \partial \boldsymbol{x} \equiv \nabla_a \equiv \partial / \partial x^a$ is the gradient in flat 3D space. The scalar and vector parts are also known
as spin 0 and spin 1, or gradient and curl, or longitudinal and transverse. The scalar part $\nabla w_\parallel$ contains 1
degree of freedom, while the vector part $\boldsymbol{w}_\perp$ contains 2 degrees of freedom. Together they account for the 3
degrees of freedom of the vector $\boldsymbol{w}$.

Proof: Take the divergence of equation (26.21)

$$ \nabla \cdot \boldsymbol{w} = \nabla^2 w_\parallel . \tag{26.23} $$

The operator $\nabla^2$ on the right hand side of equation (26.23) is the 3D Laplacian. The solution of
equation (26.23) is

$$ w_\parallel(\boldsymbol{x}) = - \int \frac{\nabla' \cdot \boldsymbol{w}(\boldsymbol{x}') \, d^3 x'}{4\pi | \boldsymbol{x}' - \boldsymbol{x} |} . \tag{26.24} $$

The solution (26.24) is valid subject to boundary conditions that the vector $\boldsymbol{w}$ vanish sufficiently rapidly
at infinity. In cosmology, the required boundary conditions, which are set at the Big Bang, are apparently
satisfied because fluctuations at the Big Bang were small. Equation (26.21) then immediately implies that
the vector part is $\boldsymbol{w}_\perp = \boldsymbol{w} - \nabla w_\parallel$.

It is sometimes convenient to abbreviate $\nabla w_\parallel = \boldsymbol{w}_\parallel$ (distinguished by bold face $\boldsymbol{w}_\parallel$ instead of normal face
$w_\parallel$), so that the decomposition (26.21) is

$$ \boldsymbol{w} = \underbrace{\boldsymbol{w}_\parallel}_{\rm scalar} + \underbrace{\boldsymbol{w}_\perp}_{\rm vector} . \tag{26.25} $$


26.8.2 Fourier version of the decomposition of a vector in flat 3D space 


When the background has some symmetry, it is natural to expand perturbations in eigenmodes of the
symmetry. If the background space is flat, then it is translation symmetric. Eigenmodes of the translation
operator $\nabla$ are Fourier modes.

A function $a(\boldsymbol{x})$ in flat 3D space and its Fourier transform $a(\boldsymbol{k})$ are related by (the signs and disposition
of factors of $2\pi$ in the following definition follow the convention most commonly adopted by cosmologists;
beware that, with the $-$+++ signature adopted in this book, the convention is opposite to the quantum
mechanics convention $\boldsymbol{p} = \hbar \boldsymbol{k} = - i \hbar \nabla$ for spatial momentum)

$$ a(\boldsymbol{k}) = \int a(\boldsymbol{x}) \, e^{i \boldsymbol{k} \cdot \boldsymbol{x}} \, d^3 x , \quad a(\boldsymbol{x}) = \int a(\boldsymbol{k}) \, e^{- i \boldsymbol{k} \cdot \boldsymbol{x}} \, \frac{d^3 k}{(2\pi)^3} . \tag{26.26} $$

You may not be familiar with the practice of using the same symbol $a$ in both real and Fourier space; but
$a$ is the same vector in Hilbert space, with components $a_{\boldsymbol{x}} = a(\boldsymbol{x})$ in real space, and $a_{\boldsymbol{k}} = a(\boldsymbol{k})$ in Fourier
space.

Taking the gradient $\nabla$ in real space is equivalent to multiplying by $-i\boldsymbol{k}$ in Fourier space

$$ \nabla \rightarrow - i \boldsymbol{k} . \tag{26.27} $$

Thus the decomposition (26.21) of the 3D vector $\boldsymbol{w}$ translates into Fourier space as

$$ \boldsymbol{w} = \underbrace{- i \boldsymbol{k} \, w_\parallel}_{\rm scalar} + \underbrace{\boldsymbol{w}_\perp}_{\rm vector} , \tag{26.28} $$

where the vector part $\boldsymbol{w}_\perp$ satisfies

$$ \boldsymbol{k} \cdot \boldsymbol{w}_\perp = 0 . \tag{26.29} $$

In other words, in Fourier space the scalar part $\nabla w_\parallel$ of the vector $\boldsymbol{w}$ is the part parallel (longitudinal) to
the wavevector $\boldsymbol{k}$, while the vector part $\boldsymbol{w}_\perp$ is the part perpendicular (transverse) to the wavevector $\boldsymbol{k}$.
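
As an illustration of equations (26.28) and (26.29), the split of a single Fourier amplitude into longitudinal and transverse parts takes a few lines. This is a sketch under the sign convention (26.27), in which the gradient becomes multiplication by $-i\boldsymbol{k}$; the function name `split_vector` and the sample numbers are illustrative choices, not taken from the text.

```python
def split_vector(k, w):
    """Split a Fourier amplitude w(k) as w = -i k w_par + w_perp, with k . w_perp = 0.

    k is a real 3-vector, w a complex 3-vector. Dotting (26.28) with k gives
    k . w = -i k^2 w_par, which fixes the scalar potential w_par.
    """
    k2 = sum(ki * ki for ki in k)
    k_dot_w = sum(ki * wi for ki, wi in zip(k, w))
    w_par = 1j * k_dot_w / k2
    # Transverse part: what is left after removing the gradient of w_par.
    w_perp = [wi - (-1j) * ki * w_par for ki, wi in zip(k, w)]
    return w_par, w_perp

k = (1.0, 2.0, 2.0)
w = (3.0 + 1.0j, -1.0j, 0.5)
w_par, w_perp = split_vector(k, w)

# Residuals of the transversality condition (26.29) and of the reconstruction (26.28).
transversality = abs(sum(ki * wi for ki, wi in zip(k, w_perp)))
reconstruction = max(abs(wi - (-1j * ki * w_par + wpi))
                     for ki, wi, wpi in zip(k, w, w_perp))
print(transversality, reconstruction)
```

In Fourier space the non-local integral (26.24) reduces to a division by $k^2$, which is why the decomposition is usually carried out mode by mode.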


26.8.3 Decomposition of a tensor in flat 3D space 


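Similarly, the 9 components of a $3 \times 3$ spatial matrix $h_{ab}$ can be decomposed into 3 scalars, 2 vectors, and 1
tensor:

$$ h_{ab} = \underbrace{\delta_{ab} \, \phi}_{\rm scalar} + \underbrace{\nabla_a \nabla_b h}_{\rm scalar} + \underbrace{\varepsilon_{abc} \nabla_c \tilde{h}}_{\rm scalar} + \underbrace{\nabla_a h_b}_{\rm vector} + \underbrace{\nabla_b \tilde{h}_a}_{\rm vector} + \underbrace{h^T_{ab}}_{\rm tensor} . \tag{26.30} $$

In this context, the term tensor signifies a $3 \times 3$ matrix $h^T_{ab}$ that is traceless, symmetric, and transverse:

$$ h^{T\,a}{}_a = 0 , \quad h^T_{ab} = h^T_{ba} , \quad \nabla_a h^T_{ab} = 0 . \tag{26.31} $$

The transverse-traceless-symmetric matrix $h^T_{ab}$ has two degrees of freedom. The vector components $h_a$ and
$\tilde{h}_a$ are by definition transverse,

$$ \nabla_a h_a = \nabla_a \tilde{h}_a = 0 . \tag{26.32} $$

The tildes on $\tilde{h}$ and $\tilde{h}_a$ simply distinguish those symbols (from $h$ and $h_a$); the tildes have no other significance.
The trace of the $3 \times 3$ matrix $h_{ab}$ is

$$ h^a{}_a = 3 \phi + \nabla^2 h . \tag{26.33} $$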
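
The decomposition (26.30) can be made concrete for a single Fourier mode. The sketch below assumes, as an illustrative choice, that the wavevector lies along the z-axis, so that only $\nabla_z \rightarrow -ik$ is non-zero (convention (26.27)); the function names and dictionary keys are hypothetical labels for the potentials in (26.30). It reads the 3 scalars, 2 vectors, and 1 tensor off a general complex $3 \times 3$ matrix, and then rebuilds the matrix from them.

```python
def decompose_tensor(H, k):
    """Read the potentials of (26.30) off a 3x3 matrix H (indices x, y, z).

    Assumes one Fourier mode with wavevector of magnitude k along z.
    """
    D = -1j * k  # gradient along z in Fourier space
    phi = 0.5 * (H[0][0] + H[1][1])
    return {
        "phi": phi,                                   # scalar multiplying delta_ab
        "h": (H[2][2] - phi) / (D * D),               # scalar, from H_zz = phi + D^2 h
        "h_tilde": 0.5 * (H[0][1] - H[1][0]) / D,     # scalar, antisymmetric xy part
        "h_vec": (H[2][0] / D, H[2][1] / D),          # vector h_a, from H_za = D h_a
        "h_vec_tilde": (H[0][2] / D, H[1][2] / D),    # vector h~_a, from H_az = D h~_a
        "h_plus": 0.5 * (H[0][0] - H[1][1]),          # tensor, h^T_xx = -h^T_yy
        "h_cross": 0.5 * (H[0][1] + H[1][0]),         # tensor, h^T_xy = h^T_yx
    }

def recompose(p, k):
    """Rebuild the matrix from its potentials, term by term as in (26.30)."""
    D = -1j * k
    hx, hy = p["h_vec"]
    tx, ty = p["h_vec_tilde"]
    return [
        [p["phi"] + p["h_plus"], p["h_cross"] + D * p["h_tilde"], D * tx],
        [p["h_cross"] - D * p["h_tilde"], p["phi"] - p["h_plus"], D * ty],
        [D * hx, D * hy, p["phi"] + D * D * p["h"]],
    ]

k = 2.0
H = [[1.0 + 2.0j, 0.3, -0.7j],
     [0.9 - 0.1j, -0.4, 1.5],
     [2.0, 0.25j, 0.6 + 0.6j]]
parts = decompose_tensor(H, k)
H_back = recompose(parts, k)

roundtrip = max(abs(H[a][b] - H_back[a][b]) for a in range(3) for b in range(3))
trace = H[0][0] + H[1][1] + H[2][2]
trace_formula = 3 * parts["phi"] + (-1j * k) ** 2 * parts["h"]  # equation (26.33)
print(roundtrip, abs(trace - trace_formula))
```

The count works out as 3 scalar amplitudes, 2 + 2 vector amplitudes, and 2 tensor amplitudes, matching the 9 entries of the matrix.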


27 


Perturbations in a flat space background 


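General relativistic perturbation theory is simplest in the case that the unperturbed background space is
Minkowski space. In Cartesian coordinates $x^\mu \equiv \{ x^0, x^1, x^2, x^3 \} \equiv \{ t, x, y, z \}$, the unperturbed coordinate
metric is the Minkowski metric

$$ \overset{0}{g}_{\mu\nu} = \eta_{\mu\nu} . \tag{27.1} $$

In this Chapter the tetrad $\boldsymbol{\gamma}_m$ is taken to be orthonormal, and aligned with the unperturbed coordinate axes
$\overset{0}{\boldsymbol{e}}_\mu$, so that the unperturbed inverse vierbein is the unit matrix

$$ \overset{0}{e}_m{}^\mu = \delta_m^\mu . \tag{27.2} $$

Let overdot denote partial differentiation with respect to time $t$,

$$ \mbox{overdot} \equiv \frac{\partial}{\partial t} , \tag{27.3} $$

and let $\nabla$ denote the spatial gradient

$$ \nabla \equiv \frac{\partial}{\partial \boldsymbol{x}} \equiv \nabla_a \equiv \frac{\partial}{\partial x^a} . \tag{27.4} $$

Sometimes it will also be convenient to use $\nabla_m$ to denote the 4-dimensional spacetime derivative

$$ \nabla_m \equiv \left\{ \frac{\partial}{\partial t} , \nabla \right\} . \tag{27.5} $$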


27.1 Classification of vierbein perturbations 


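The aims of this section are two-fold. First, decompose perturbations into scalar, vector, and tensor parts.
Second, identify the coordinate and tetrad gauge-invariant perturbations. It will be found, equations (27.13),
that there are 6 coordinate and tetrad gauge-invariant perturbations, comprising 2 scalars $\Psi$ and $\Phi$, 1 vector
$W_a$ containing 2 degrees of freedom, and 1 tensor $h_{ab}$ containing 2 degrees of freedom.

The vierbein perturbations $\varphi_{mn}$ defined by equation (26.1) decompose, §26.8, into 6 scalars, 4 vectors,
and 1 tensor, a total of $6 + 4 \times 2 + 1 \times 2 = 16$ degrees of freedom,

$$ \varphi_{00} = \underbrace{\psi}_{\rm scalar} , \tag{27.6a} $$

$$ \varphi_{0a} = \underbrace{\nabla_a w}_{\rm scalar} + \underbrace{w_a}_{\rm vector} , \tag{27.6b} $$

$$ \varphi_{a0} = \underbrace{\nabla_a \tilde{w}}_{\rm scalar} + \underbrace{\tilde{w}_a}_{\rm vector} , \tag{27.6c} $$

$$ \varphi_{ab} = \underbrace{\delta_{ab} \, \Phi}_{\rm scalar} + \underbrace{\nabla_a \nabla_b h}_{\rm scalar} + \underbrace{\varepsilon_{abc} \nabla_c \tilde{h}}_{\rm scalar} + \underbrace{\nabla_a h_b}_{\rm vector} + \underbrace{\nabla_b \tilde{h}_a}_{\rm vector} + \underbrace{h_{ab}}_{\rm tensor} . \tag{27.6d} $$

The tildes on $\tilde{w}$ and $\tilde{h}$ simply distinguish those symbols (from $w$ and $h$); the tildes have no other significance.
The vector components are by definition transverse (have vanishing divergence), while the tensor component
$h_{ab}$ is by definition traceless, symmetric, and transverse. For a single Fourier mode whose wavevector $\boldsymbol{k}$ is
taken without loss of generality to lie in the $z$-direction, so that $\nabla_x = \nabla_y = 0$, equations (27.6) are

$$ \varphi_{mn} = \begin{pmatrix} \psi & w_x & w_y & \nabla_z w \\ \tilde{w}_x & \Phi + h_{xx} & h_{xy} + \nabla_z \tilde{h} & \nabla_z \tilde{h}_x \\ \tilde{w}_y & h_{xy} - \nabla_z \tilde{h} & \Phi + h_{yy} & \nabla_z \tilde{h}_y \\ \nabla_z \tilde{w} & \nabla_z h_x & \nabla_z h_y & \Phi + \nabla_z^2 h \end{pmatrix} . \tag{27.7} $$

To identify coordinate gauge-invariant quantities, it is necessary to consider infinitesimal coordinate gauge
transformations (26.9). The 4 tetrad-frame components $\epsilon_m$ of the coordinate shift of the coordinate gauge
transformation decompose into 2 scalars and 1 vector

$$ \epsilon_m = \{ \underbrace{\epsilon_0}_{\rm scalar} , \ \underbrace{\nabla_a \epsilon}_{\rm scalar} + \underbrace{\epsilon_a}_{\rm vector} \} . \tag{27.8} $$

In the flat background space being considered, the unperturbed tetrad connections vanish, and the coordinate
gauge transformation (26.20) of the vierbein perturbation simplifies to

$$ \varphi_{mn} \rightarrow \varphi'_{mn} = \varphi_{mn} + \nabla_m \epsilon_n . \tag{27.9} $$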

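In terms of the scalar, vector, and tensor potentials introduced in equations (27.6), the gauge
transformations (27.9) are

$$ \varphi_{00} \rightarrow \underbrace{\psi + \dot{\epsilon}_0}_{\rm scalar} , \tag{27.10a} $$

$$ \varphi_{0a} \rightarrow \underbrace{\nabla_a ( w + \dot{\epsilon} )}_{\rm scalar} + \underbrace{( w_a + \dot{\epsilon}_a )}_{\rm vector} , \tag{27.10b} $$

$$ \varphi_{a0} \rightarrow \underbrace{\nabla_a ( \tilde{w} + \epsilon_0 )}_{\rm scalar} + \underbrace{\tilde{w}_a}_{\rm vector} , \tag{27.10c} $$

$$ \varphi_{ab} \rightarrow \underbrace{\delta_{ab} \, \Phi}_{\rm scalar} + \underbrace{\nabla_a \nabla_b ( h + \epsilon )}_{\rm scalar} + \underbrace{\varepsilon_{abc} \nabla_c \tilde{h}}_{\rm scalar} + \underbrace{\nabla_a ( h_b + \epsilon_b )}_{\rm vector} + \underbrace{\nabla_b \tilde{h}_a}_{\rm vector} + \underbrace{h_{ab}}_{\rm tensor} . \tag{27.10d} $$

Equations (27.10) imply that under an infinitesimal coordinate gauge transformation the potentials
transform as

$$ \psi \rightarrow \psi + \dot{\epsilon}_0 , \tag{27.11a} $$

$$ w \rightarrow w + \dot{\epsilon} , \quad w_a \rightarrow w_a + \dot{\epsilon}_a , \tag{27.11b} $$

$$ \tilde{w} \rightarrow \tilde{w} + \epsilon_0 , \quad \tilde{w}_a \rightarrow \tilde{w}_a , \tag{27.11c} $$

$$ \Phi \rightarrow \Phi , \quad h \rightarrow h + \epsilon , \quad \tilde{h} \rightarrow \tilde{h} , \quad h_a \rightarrow h_a + \epsilon_a , \quad \tilde{h}_a \rightarrow \tilde{h}_a , \quad h_{ab} \rightarrow h_{ab} . \tag{27.11d} $$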


Eliminating the coordinate shift $\epsilon_m$ from the transformations (27.11) yields 12 coordinate gauge-invariant
combinations of the potentials

$$ \underbrace{\psi - \dot{\tilde{w}}}_{\rm scalar} , \quad \underbrace{w - \dot{h}}_{\rm scalar} , \quad \underbrace{w_a - \dot{h}_a}_{\rm vector} , \quad \underbrace{\tilde{w}_a}_{\rm vector} , \quad \underbrace{\Phi}_{\rm scalar} , \quad \underbrace{\tilde{h}}_{\rm scalar} , \quad \underbrace{\tilde{h}_a}_{\rm vector} , \quad \underbrace{h_{ab}}_{\rm tensor} . \tag{27.12} $$

Physical perturbations are not only coordinate but also tetrad gauge-invariant. A quantity is tetrad
gauge-invariant if and only if it depends only on the symmetric part of the vierbein perturbations, not on the
antisymmetric part, §26.6. There are 6 combinations of the coordinate gauge-invariant perturbations (27.12)
that are symmetric, and therefore not only coordinate but also tetrad gauge-invariant. These 6 coordinate
and tetrad gauge-invariant perturbations comprise 2 scalars, 1 vector, and 1 tensor

$$ \underbrace{\Psi}_{\rm scalar} \equiv \psi - \dot{w} - \dot{\tilde{w}} + \ddot{h} , \tag{27.13a} $$

$$ \underbrace{\Phi}_{\rm scalar} , \tag{27.13b} $$

$$ \underbrace{W_a}_{\rm vector} \equiv w_a + \tilde{w}_a - \dot{h}_a - \dot{\tilde{h}}_a , \tag{27.13c} $$

$$ \underbrace{h_{ab}}_{\rm tensor} . \tag{27.13d} $$

Since only the 6 tetrad and coordinate gauge-invariant potentials $\Psi$, $\Phi$, $W_a$, and $h_{ab}$ have physical
significance, it is legitimate to choose a particular gauge, a set of conditions on the non-gauge-invariant potentials,
arranged to simplify the equations, or to bring out some physical aspect. Three gauges considered later are
harmonic gauge (§27.7), Newtonian gauge (§27.8), and synchronous gauge (§27.9). However, for the next
several sections, no gauge will be chosen: the exposition will continue to be completely general.
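
The coordinate gauge-invariance of the combinations (27.13) follows from the transformation rules (27.11) by direct substitution, and a quick numerical check makes the cancellation explicit. The sketch below assumes, purely for illustration, that every potential has a common time dependence $e^{st}$, so that an overdot becomes multiplication by the complex number `s`; the dictionary keys and function names are hypothetical.

```python
def gauge_transform(p, eps0, eps, eps_vec, s):
    """Apply the coordinate gauge transformation (27.11) to a dict of amplitudes.

    eps0, eps and eps_vec are the scalar and vector parts of the shift (27.8);
    s stands for the time derivative (assumed e^{s t} time dependence).
    """
    q = dict(p)
    q["psi"] = p["psi"] + s * eps0
    q["w"] = p["w"] + s * eps
    q["w_vec"] = [wa + s * ea for wa, ea in zip(p["w_vec"], eps_vec)]
    q["w_tilde"] = p["w_tilde"] + eps0
    q["h"] = p["h"] + eps
    q["h_vec"] = [ha + ea for ha, ea in zip(p["h_vec"], eps_vec)]
    # Phi, h_tilde, w_tilde_vec, h_tilde_vec and h_ab are unchanged.
    return q

def invariants(p, s):
    """Gauge-invariant scalar Psi and vector W_a of equations (27.13a) and (27.13c)."""
    Psi = p["psi"] - s * p["w"] - s * p["w_tilde"] + s * s * p["h"]
    W = [wa + wta - s * ha - s * hta
         for wa, wta, ha, hta in zip(p["w_vec"], p["w_tilde_vec"],
                                     p["h_vec"], p["h_tilde_vec"])]
    return Psi, W

s = -0.8j  # an oscillating mode, chosen arbitrarily
potentials = {
    "psi": 0.3 + 0.1j, "w": -0.2, "w_tilde": 0.5j, "Phi": 0.7, "h": 0.1 - 0.4j,
    "w_vec": [0.2, -0.3j], "w_tilde_vec": [0.05, 0.6],
    "h_vec": [1.0j, 0.4], "h_tilde_vec": [-0.1, 0.2j],
}
shifted = gauge_transform(potentials, eps0=0.9 - 0.2j, eps=-1.3, eps_vec=[0.4j, 0.7], s=s)

Psi_before, W_before = invariants(potentials, s)
Psi_after, W_after = invariants(shifted, s)
change_Psi = abs(Psi_after - Psi_before)
change_W = max(abs(a - b) for a, b in zip(W_after, W_before))
print(change_Psi, change_W)
```

The individual potentials change by amounts of order unity, while $\Psi$ and $W_a$ change only by rounding error.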


Exercise 27.1. Classification of perturbations in arbitrary dimensions. Classify and enumerate
general relativistic perturbations in $N$ spacetime dimensions.

Solution. In $N$ spacetime dimensions, there are $N-2$ transverse directions. In $N$ spacetime dimensions, the
vierbein perturbations $\varphi_{mn}$, equations (27.6), decompose into: 5 scalars $\psi$, $w$, $\tilde{w}$, $\Phi$, $h$; 4 vectors $w_a$, $\tilde{w}_a$, $h_a$,
$\tilde{h}_a$; 1 transverse antisymmetric tensor $\tilde{h}_{ab}$ (which for $N = 4$ reduces to a scalar $\varepsilon_{abc} \nabla_c \tilde{h}$); and 1 transverse
traceless symmetric tensor $h_{ab}$; for a total of $5 + 4(N-2) + \frac{1}{2}(N-2)(N-3) + \frac{1}{2}N(N-3) = N^2$ degrees of
freedom. Coordinate transformations, equation (27.8), decompose into 2 scalars $\epsilon_0$, $\epsilon$, and 1 vector $\epsilon_a$, a total
of $2 + (N-2) = N$ degrees of freedom, leaving 3 scalars, 3 vectors, 1 antisymmetric tensor, and 1 symmetric
tensor. Tetrad (Lorentz) transformations remove a further 1 scalar, 2 vectors, and 1 transverse antisymmetric
tensor, a total of $1 + 2(N-2) + \frac{1}{2}(N-2)(N-3) = \frac{1}{2}N(N-1)$ degrees of freedom, leaving as physical
degrees of freedom 2 scalars $\Psi$ and $\Phi$, 1 vector $W_a$, and 1 transverse traceless symmetric tensor $h_{ab}$, a total
of $2 + (N-2) + \frac{1}{2}N(N-3) = \frac{1}{2}N(N-1)$ degrees of freedom. The transverse traceless symmetric tensor
$h_{ab}$ carries propagating gravitational waves, §27.13. Gravitational waves have $\frac{1}{2}N(N-3)$ degrees of freedom,
and exist only in spacetime dimensions $N \geq 4$.
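
The counting in this solution is simple enough to tabulate. The following sketch, with an illustrative function name `count_modes`, evaluates the same sums for a range of $N$ and confirms that the coordinate, Lorentz, and physical degrees of freedom add up to $N^2$.

```python
def count_modes(N):
    """Degrees of freedom of vierbein perturbations in N spacetime dimensions."""
    t = N - 2                       # number of transverse directions
    antisym = t * (t - 1) // 2      # transverse antisymmetric tensor
    sym_tt = N * (N - 3) // 2       # transverse traceless symmetric tensor
    total = 5 + 4 * t + antisym + sym_tt
    coordinate = 2 + t              # 2 scalars and 1 vector of the shift
    lorentz = 1 + 2 * t + antisym   # 1 scalar, 2 vectors, 1 antisymmetric tensor
    physical = 2 + t + sym_tt       # 2 scalars, 1 vector, 1 symmetric tensor
    return {"total": total, "coordinate": coordinate, "lorentz": lorentz,
            "physical": physical, "gravitational_waves": sym_tt}

table = {N: count_modes(N) for N in range(3, 12)}
for N, row in table.items():
    print(N, row)
```

For $N = 4$ the table gives 16 in total, split as 4 coordinate, 6 Lorentz, and 6 physical degrees of freedom, of which 2 are gravitational waves; for $N = 3$ the gravitational wave count is zero.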


27.2 Metric, tetrad connections, and Einstein and Weyl tensors 


This section gives expressions in a completely general gauge for perturbed quantities in flat background 
Minkowski space. 


27.2.1 Metric 


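The unperturbed metric $\overset{0}{g}_{\mu\nu}$ is the Minkowski metric, equation (27.1). The perturbation $\overset{1}{g}_{\mu\nu}$ of the coordinate
metric is, from equation (26.6),

$$ \overset{1}{g}_{tt} = - \, \underbrace{2 \psi}_{\rm scalar} , \tag{27.14a} $$

$$ \overset{1}{g}_{ta} = - \, \underbrace{\nabla_a ( w + \tilde{w} )}_{\rm scalar} - \underbrace{( w_a + \tilde{w}_a )}_{\rm vector} , \tag{27.14b} $$

$$ \overset{1}{g}_{ab} = - \, \underbrace{2 \delta_{ab} \, \Phi}_{\rm scalar} - \underbrace{2 \nabla_a \nabla_b h}_{\rm scalar} - \underbrace{\nabla_a ( h_b + \tilde{h}_b )}_{\rm vector} - \underbrace{\nabla_b ( h_a + \tilde{h}_a )}_{\rm vector} - \underbrace{2 h_{ab}}_{\rm tensor} . \tag{27.14c} $$

The coordinate metric is tetrad gauge-invariant, but not coordinate gauge-invariant.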


27.2.2 Tetrad-frame connections 


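The tetrad-frame connections $\Gamma_{lmn}$ can be calculated from the usual formula (11.54). The unperturbed
tetrad connections $\overset{0}{\Gamma}_{lmn}$ all vanish in the flat background. The perturbations $\overset{1}{\Gamma}_{lmn}$ of the tetrad connections
are

$$ \overset{1}{\Gamma}_{a00} = \underbrace{\nabla_a ( \psi - \dot{\tilde{w}} )}_{\rm scalar} - \underbrace{\dot{\tilde{w}}_a}_{\rm vector} , \tag{27.15a} $$

$$ \overset{1}{\Gamma}_{0ab} = \underbrace{\delta_{ab} \, \dot{\Phi}}_{\rm scalar} - \underbrace{\nabla_a \nabla_b ( w - \dot{h} )}_{\rm scalar} - \underbrace{\tfrac{1}{2} ( \nabla_a W_b + \nabla_b W_a )}_{\rm vector} + \underbrace{\nabla_b \tilde{w}_a}_{\rm vector} + \underbrace{\dot{h}_{ab}}_{\rm tensor} , \tag{27.15b} $$

$$ \overset{1}{\Gamma}_{ab0} = \underbrace{\tfrac{1}{2} ( \nabla_a W_b - \nabla_b W_a )}_{\rm vector} - \frac{\partial}{\partial t} ( \underbrace{\varepsilon_{abc} \nabla_c \tilde{h}}_{\rm scalar} - \underbrace{\nabla_a \tilde{h}_b + \nabla_b \tilde{h}_a}_{\rm vector} ) , \tag{27.15c} $$

$$ \overset{1}{\Gamma}_{abc} = \underbrace{( \delta_{bc} \nabla_a - \delta_{ac} \nabla_b ) \Phi}_{\rm scalar} - \nabla_c ( \underbrace{\varepsilon_{abd} \nabla_d \tilde{h}}_{\rm scalar} - \underbrace{\nabla_a \tilde{h}_b + \nabla_b \tilde{h}_a}_{\rm vector} ) + \underbrace{\nabla_a h_{bc} - \nabla_b h_{ac}}_{\rm tensor} . \tag{27.15d} $$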
scalar scalar vector tensor 


The perturbations of the tetrad connections are all coordinate gauge-invariant, as is evident from the fact that
they depend only on, and on all 12 of, the coordinate gauge-invariant combinations (27.12). The coordinate
gauge-invariance of the tetrad connections follows more fundamentally from the fact that any quantity that
vanishes in the unperturbed background is coordinate gauge-invariant. According to the rule established in
§26.7, the change in a quantity under an infinitesimal coordinate gauge transformation equals minus its Lie
derivative $\mathcal{L}_\epsilon$ with respect to the infinitesimal coordinate shift $\boldsymbol{\epsilon}$. Any quantity that vanishes in the unperturbed
background has, to linear order, vanishing Lie derivative, and is therefore coordinate gauge-invariant.

However, the perturbations $\overset{1}{\Gamma}_{lmn}$ of the tetrad connections are not tetrad gauge-invariant, as is evident
from the fact that they (all) depend on antisymmetric parts of the vierbein perturbations $\varphi_{mn}$.


27.2.3 Tetrad-frame Einstein tensor 


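The tetrad-frame Einstein tensor $G_{mn}$ in perturbed Minkowski space follows from the usual formulae (11.61),
(11.78), and (11.80). The unperturbed Einstein tensor $\overset{0}{G}_{mn}$ vanishes identically. The perturbations $\overset{1}{G}_{mn}$ of
the tetrad-frame Einstein tensor are

$$ \overset{1}{G}_{00} = \underbrace{2 \nabla^2 \Phi}_{\rm scalar} , \tag{27.16a} $$

$$ \overset{1}{G}_{0a} = \underbrace{2 \nabla_a \dot{\Phi}}_{\rm scalar} + \underbrace{\tfrac{1}{2} \nabla^2 W_a}_{\rm vector} , \tag{27.16b} $$

$$ \overset{1}{G}_{ab} = \underbrace{2 \delta_{ab} \, \ddot{\Phi}}_{\rm scalar} - \underbrace{( \nabla_a \nabla_b - \delta_{ab} \nabla^2 ) ( \Psi - \Phi )}_{\rm scalar} + \underbrace{\tfrac{1}{2} ( \nabla_a \dot{W}_b + \nabla_b \dot{W}_a )}_{\rm vector} + \underbrace{\Box h_{ab}}_{\rm tensor} , \tag{27.16c} $$

where $\Box$ is the d’Alembertian, the 4-dimensional wave operator

$$ \Box \equiv \nabla_m \nabla^m = - \frac{\partial^2}{\partial t^2} + \nabla^2 . \tag{27.17} $$

All the perturbations $\overset{1}{G}_{mn}$ of the Einstein tensor are both coordinate and tetrad gauge-invariant, as follows
from the fact that the expressions (27.16) depend only on the coordinate and tetrad gauge-invariant potentials
$\Psi$, $\Phi$, $W_a$, and $h_{ab}$. The property that the perturbations of the Einstein tensor are coordinate and tetrad
gauge-invariant is a feature of flat (Minkowski) background spacetime, and does not persist to more general
spacetimes, such as the Friedmann-Lemaître-Robertson-Walker spacetime.

In a frame with the wavevector $\boldsymbol{k}$ taken along the $z$-axis, so that $\nabla_x = \nabla_y = 0$, the perturbations of the
Einstein tensor are

$$ \overset{1}{G}_{mn} = \begin{pmatrix} 2 \nabla^2 \Phi & \tfrac{1}{2} \nabla^2 W_x & \tfrac{1}{2} \nabla^2 W_y & 2 \nabla_z \dot{\Phi} \\ \tfrac{1}{2} \nabla^2 W_x & 2 \ddot{\Phi} + \nabla^2 ( \Psi - \Phi ) + \Box h_+ & \Box h_\times & \tfrac{1}{2} \nabla_z \dot{W}_x \\ \tfrac{1}{2} \nabla^2 W_y & \Box h_\times & 2 \ddot{\Phi} + \nabla^2 ( \Psi - \Phi ) - \Box h_+ & \tfrac{1}{2} \nabla_z \dot{W}_y \\ 2 \nabla_z \dot{\Phi} & \tfrac{1}{2} \nabla_z \dot{W}_x & \tfrac{1}{2} \nabla_z \dot{W}_y & 2 \ddot{\Phi} \end{pmatrix} , \tag{27.18} $$

where $h_+$ and $h_\times$ are the two linear polarizations of gravitational waves, discussed further in §27.13,

$$ h_+ \equiv h_{xx} = - h_{yy} , \quad h_\times \equiv h_{xy} = h_{yx} . \tag{27.19} $$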
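The tetrad-frame complexified Weyl tensor is

$$ \begin{aligned} \overset{1}{\tilde{C}}_{0a0b} = {} & \underbrace{\tfrac{1}{4} ( \nabla_a \nabla_b - \tfrac{1}{3} \delta_{ab} \nabla^2 ) ( \Psi + \Phi )}_{\rm scalar} \\ & + \underbrace{\tfrac{1}{8} \left[ - ( \nabla_a \dot{W}_b + \nabla_b \dot{W}_a ) + i ( \varepsilon_{acd} \nabla_b + \varepsilon_{bcd} \nabla_a ) \nabla_c W_d \right]}_{\rm vector} \\ & + \underbrace{\tfrac{1}{4} \left[ \ddot{h}_{ab} - \varepsilon_{acd} \varepsilon_{bef} \nabla_c \nabla_e h_{df} - i ( \varepsilon_{acd} \nabla_c \dot{h}_{bd} + \varepsilon_{bcd} \nabla_c \dot{h}_{ad} ) \right]}_{\rm tensor} . \end{aligned} \tag{27.20} $$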

Like the tetrad-frame Einstein tensor, the tetrad-frame Weyl tensor is both coordinate and tetrad
gauge-invariant, depending only on the coordinate and tetrad gauge-invariant potentials $\Psi$, $\Phi$, $W_a$, and $h_{ab}$.


27.3 Spin components of the Einstein tensor 


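Scalar, vector, and tensor perturbations correspond respectively to perturbations of spin 0, 1, and 2. An object
has spin $s$ if it is unchanged by a rotation of $2\pi/s$ about a prescribed direction. In perturbed Minkowski
space, the prescribed direction is the direction of the wavevector $\boldsymbol{k}$ in the Fourier decomposition of the modes.
The spin components may be projected out by working in a spin tetrad, §38.1.

In a frame where the wavevector $\boldsymbol{k}$ is taken along the $z$-axis, the spin components of the perturbations
$\overset{1}{G}_{mn}$ of the Einstein tensor (27.16) are

$$ \overset{1}{G}_{00} = \underbrace{2 \nabla^2 \Phi}_{\rm spin\text{-}0} , \quad \overset{1}{G}_{0z} = \underbrace{2 \nabla_z \dot{\Phi}}_{\rm spin\text{-}0} , \quad \overset{1}{G}_{zz} = \underbrace{2 \ddot{\Phi}}_{\rm spin\text{-}0} , \tag{27.21a} $$

$$ \overset{1}{G}_{+-} - \overset{1}{G}_{zz} = \underbrace{\nabla^2 ( \Psi - \Phi )}_{\rm spin\text{-}0} , \tag{27.21b} $$

$$ \overset{1}{G}_{0\pm} = \underbrace{\tfrac{1}{2} \nabla^2 W_\pm}_{\rm spin\text{-}\pm 1} , \quad \overset{1}{G}_{z\pm} = \underbrace{\tfrac{1}{2} \nabla_z \dot{W}_\pm}_{\rm spin\text{-}\pm 1} , \tag{27.21c} $$

$$ \overset{1}{G}_{\pm\pm} = \underbrace{\Box h_{\pm\pm}}_{\rm spin\text{-}\pm 2} , \tag{27.21d} $$

where $W_\pm$ are the spin $\pm 1$ components of the vector perturbation $W_a$,

$$ W_\pm = \tfrac{1}{\sqrt{2}} ( W_x \pm i W_y ) , \tag{27.22} $$

and $h_{\pm\pm}$ are the spin $\pm 2$ components of the tensor perturbation $h_{ab}$,

$$ h_{\pm\pm} = h_{xx} \pm i h_{xy} = h_+ \pm i h_\times . \tag{27.23} $$

The spin $+2$ and $-2$ components $h_{++}$ and $h_{--}$ of the tensor perturbation are called the right- and
left-handed circular polarizations. The spin $+2$ and $-2$ circular polarizations $h_{++}$ and $h_{--}$ transform as $e^{-2i\chi}$
and $e^{2i\chi}$ under a right-handed rotation by angle $\chi$ about the $z$-axis, while the linear polarizations $h_+$ and
$h_\times$ transform as $\cos 2\chi$ and $-\sin 2\chi$.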
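
The spin-2 behaviour of the circular polarizations can be verified directly from the transformation law of a tensor. The sketch below assumes a passive rotation, that is, the x and y axes are rotated right-handedly by $\chi$ about z while the tensor itself is held fixed; that reading of the rotation, and the function name `rotate_axes`, are assumptions of this illustration.

```python
import cmath
import math

def rotate_axes(h, chi):
    """Components of a 2x2 transverse tensor in axes rotated by chi about z.

    Rows of R are the new basis vectors expressed in the old basis, so that
    h'_{ab} = R_a^i R_b^j h_{ij}.
    """
    c, s = math.cos(chi), math.sin(chi)
    R = [[c, s], [-s, c]]
    return [[sum(R[a][i] * R[b][j] * h[i][j] for i in range(2) for j in range(2))
             for b in range(2)] for a in range(2)]

def circular(h):
    """Spin +2 and -2 components h_{++} and h_{--}, equation (27.23)."""
    return h[0][0] + 1j * h[0][1], h[0][0] - 1j * h[0][1]

h_plus, h_cross, chi = 0.7, -0.2, 0.3
h = [[h_plus, h_cross], [h_cross, -h_plus]]  # traceless symmetric, equation (27.19)
h_rot = rotate_axes(h, chi)

pp, mm = circular(h)
pp_rot, mm_rot = circular(h_rot)
err_pp = abs(pp_rot - cmath.exp(-2j * chi) * pp)   # spin +2: phase e^{-2 i chi}
err_mm = abs(mm_rot - cmath.exp(+2j * chi) * mm)   # spin -2: phase e^{+2 i chi}

# A pure plus polarization acquires components cos(2 chi) and -sin(2 chi).
pure = rotate_axes([[1.0, 0.0], [0.0, -1.0]], chi)
print(err_pp, err_mm, pure[0][0], pure[0][1])
```

A rotation by $\chi = \pi$ returns every component to itself, which is the statement that the tensor mode has spin 2.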


27.4 Too many Einstein equations?


The Einstein equations are as usual (units $c = G = 1$; in the remainder of this Chapter, perturbation
overscripts 1 on the Einstein and energy-momentum tensors are dropped for brevity, which is fine because
the unperturbed tensors vanish identically in the Minkowski background)

$$ G_{mn} = 8\pi T_{mn} . \tag{27.24} $$

There are 10 Einstein equations, but the Einstein tensor (27.16) depends on only 6 independent potentials: the
two scalars $\Psi$ and $\Phi$, the vector $W_a$, and the tensor $h_{ab}$. The system of Einstein equations is thus overcomplete.
Why? The answer is that 4 of the Einstein equations enforce conservation of energy-momentum, and can
therefore be considered as governing the evolution of the energy-momentum as opposed to being equations
for the gravitational potentials. For example, the form of equations (27.16a) and (27.16b) for $G_{00}$ and $G_{0a}$
enforces conservation of energy

$$ \nabla^m G_{m0} = 0 , \tag{27.25} $$

while the form of equations (27.16b) and (27.16c) for $G_{0a}$ and $G_{ab}$ enforces conservation of momentum

$$ \nabla^m G_{ma} = 0 . \tag{27.26} $$

Normally, the equations governing the evolution of the energy-momentum $T_X^{mn}$ of each species $X$ of
mass-energy would be set up so as to ensure overall conservation of energy-momentum. If this is done, then
the conservation equations (27.25) and (27.26) can be regarded as redundant. Since equations (27.25) and
(27.26) are equations for the time evolution of $G_{00}$ and $G_{0a}$, one might think that the Einstein equations
for $G_{00}$ and $G_{0a}$ would become redundant, but this is not quite true. In fact the Einstein equations for
$G_{00}$ and $G_{0a}$ impose constraints that must be satisfied on the initial spatial hypersurface. Conservation
of energy-momentum guarantees that those constraints will continue to be satisfied on subsequent spatial
hypersurfaces, but still the initial conditions must be arranged to satisfy the constraints. Because the Einstein
equations for $G_{00}$ and $G_{0a}$ must be satisfied as constraints on the initial conditions, but thereafter can be
ignored, the equations are called constraint equations. The Einstein equation for $G_{00}$ is called the energy
constraint, or Hamiltonian constraint. The Einstein equations for $G_{0a}$ are called the momentum constraints.
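
That the form of the Einstein tensor (27.16) enforces the conservation laws (27.25) and (27.26) identically, for any values of the potentials, can be checked mode by mode. The sketch below assumes a single Fourier mode with wavevector of magnitude `k` along z, so that $\nabla_z \rightarrow -ik$ by convention (26.27), and a time dependence $e^{st}$ so that an overdot becomes multiplication by `s`; the function names are illustrative. The matrix it fills in is the one written out in equation (27.18).

```python
def einstein_tensor(Psi, Phi, W, h_plus, h_cross, s, k):
    """Perturbed Einstein tensor (27.18) for one mode; indices ordered 0, x, y, z."""
    D = -1j * k            # gradient along z
    lap = D * D            # Laplacian
    box = -s * s + lap     # d'Alembertian, equation (27.17)
    G = [[0j] * 4 for _ in range(4)]
    G[0][0] = 2 * lap * Phi
    G[0][1] = G[1][0] = 0.5 * lap * W[0]
    G[0][2] = G[2][0] = 0.5 * lap * W[1]
    G[0][3] = G[3][0] = 2 * D * s * Phi
    diagonal = 2 * s * s * Phi + lap * (Psi - Phi)
    G[1][1] = diagonal + box * h_plus
    G[2][2] = diagonal - box * h_plus
    G[1][2] = G[2][1] = box * h_cross
    G[1][3] = G[3][1] = 0.5 * D * s * W[0]
    G[2][3] = G[3][2] = 0.5 * D * s * W[1]
    G[3][3] = 2 * s * s * Phi
    return G

def divergence(G, s, k):
    """nabla^m G_{mn} = -d/dt G_{0n} + nabla_z G_{zn}, since nabla_x = nabla_y = 0."""
    D = -1j * k
    return [-s * G[0][n] + D * G[3][n] for n in range(4)]

# Arbitrary amplitudes: no field equation is imposed on the potentials.
G = einstein_tensor(Psi=0.4 - 0.3j, Phi=1.1 + 0.2j, W=(0.5j, -0.8),
                    h_plus=0.9, h_cross=-0.6j, s=0.7 - 1.3j, k=2.5)
div = divergence(G, s=0.7 - 1.3j, k=2.5)
max_divergence = max(abs(d) for d in div)
scale = max(abs(G[m][n]) for m in range(4) for n in range(4))
print(max_divergence, scale)
```

The divergence vanishes to rounding error even though the components of $G_{mn}$ are of order unity, which shows that 4 of the 10 Einstein equations carry no independent information about the potentials.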


27.5 Action at a distance? 


The tensor component of the Einstein equations shows that, in a vacuum $T_{mn} = 0$, the tensor perturbations
$h_{ab}$ propagate at the speed of light, satisfying the wave equation

$$ \Box h_{ab} = 0 . \tag{27.27} $$

The tensor perturbations represent propagating gravitational waves.

It is to be expected that scalar and vector perturbations would also propagate at the speed of light, yet
this is not obvious from the form of the Einstein tensor (27.16). Specifically, there are 4 components of the
Einstein tensor (27.16) that apparently depend only on spatial derivatives, not on time derivatives. The 4
corresponding Einstein equations are

$$ \nabla^2 \Phi = 4\pi \underbrace{T_{00}}_{\rm scalar} , \tag{27.28a} $$

$$ \nabla^2 W_a = 16\pi \underbrace{T_{0a}}_{\rm vector} , \tag{27.28b} $$

$$ \nabla^2 ( \Psi - \Phi ) = - \, 8\pi \underbrace{Q_{ab} T_{ab}}_{\rm scalar} , \tag{27.28c} $$

where $Q_{ab}$ in equation (27.28c) is the quadrupole operator defined below, equation (27.102). These conditions
must be satisfied everywhere at every instant of time, giving the impression that signals are travelling
instantaneously from place to place.


27.6 Comparison to electromagnetism 


The previous two sections §27.4 and §27.5 brought up two issues: 

1. There are 10 Einstein equations, but only 6 independent gauge-invariant potentials $\Psi$, $\Phi$, $W_a$, and $h_{ab}$.
The additional 4 Einstein equations serve to enforce conservation of energy-momentum.

2. Only 2 of the gauge-invariant potentials, the tensor potentials $h_{ab}$, satisfy causal wave equations. The
remaining 4 gauge-invariant potentials $\Psi$, $\Phi$, and $W_a$ satisfy equations (27.28) that depend on the
instantaneous distribution of energy-momentum throughout space, on the face of it violating causality.

These facts may seem surprising, but in fact the equations of electromagnetism have a similar structure, as 
will now be shown. In this section, the spacetime is assumed for simplicity to be flat Minkowski space. The 
discussion in this section is based in part on the exposition by Bertschinger (1993). 

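In accordance with the usual procedure, the electromagnetic field may be defined in terms of an
electromagnetic 4-potential $A^m$, whose time and spatial parts constitute the scalar potential $\phi$ and the vector
potential $\mathbf{A}$:

\[ A^m \equiv \{ \phi , \mathbf{A} \} . \tag{27.29} \]

In flat (Minkowski) space, the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ are defined in terms of the potentials $\phi$
and $\mathbf{A}$ by

\begin{align}
\mathbf{E} &\equiv - \nabla \phi - \frac{\partial \mathbf{A}}{\partial t} , \tag{27.30a} \\
\mathbf{B} &\equiv \nabla \times \mathbf{A} . \tag{27.30b}
\end{align}

Given their definition (27.30), the electric and magnetic fields automatically satisfy the two source-free
Maxwell's equations

\begin{align}
\nabla \cdot \mathbf{B} &= 0 , \tag{27.31a} \\
\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} &= 0 . \tag{27.31b}
\end{align}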
The remaining two Maxwell's equations, the sourced ones, are

\begin{align}
\nabla \cdot \mathbf{E} &= 4\pi q , \tag{27.32a} \\
\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} &= 4\pi \mathbf{j} , \tag{27.32b}
\end{align}

where $q$ and $\mathbf{j}$ are the electric charge and current density, the time and space components of the electric
4-current density $j^m$

\[ j^m \equiv \{ q , \mathbf{j} \} . \tag{27.33} \]

The electromagnetic potentials $\phi$ and $\mathbf{A}$ are not unique, but rather are defined only up to a gauge
transformation by some arbitrary gauge field $\theta$

\[ \phi \rightarrow \phi - \frac{\partial \theta}{\partial t} , \quad \mathbf{A} \rightarrow \mathbf{A} + \nabla \theta . \tag{27.34} \]

The gauge transformation (27.34) evidently leaves the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$, equations (27.30),
invariant.
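The invariance of $\mathbf{E}$ and $\mathbf{B}$ under the gauge transformation (27.34) can be checked numerically. The following is a minimal sketch, not part of the original text: the functions `phi`, `A`, and `theta` are arbitrary hypothetical test fields, and derivatives are approximated by central differences with an assumed step `H`.

```python
import math

H = 1e-4  # finite-difference step, an arbitrary choice for this sketch

def phi(t, x, y, z):
    """Hypothetical scalar potential."""
    return math.sin(x - t) * math.cos(y) + z * t

def A(t, x, y, z):
    """Hypothetical vector potential (A_x, A_y, A_z)."""
    return (y * math.cos(z + t), math.sin(x * y) - t, x * z * math.exp(-t * t))

def theta(t, x, y, z):
    """Hypothetical gauge field."""
    return math.sin(x + 2.0 * y - z) * math.cos(t)

def d(f, i, p):
    """Central difference of f along coordinate i of the point p = (t, x, y, z)."""
    lo, hi = list(p), list(p)
    lo[i] -= H
    hi[i] += H
    return (f(*hi) - f(*lo)) / (2.0 * H)

def fields(phi_f, A_f, p):
    """E = -grad(phi) - dA/dt and B = curl(A), as in equations (27.30)."""
    a = [lambda *q, k=k: A_f(*q)[k] for k in range(3)]
    E = tuple(-d(phi_f, k + 1, p) - d(a[k], 0, p) for k in range(3))
    B = (d(a[2], 2, p) - d(a[1], 3, p),
         d(a[0], 3, p) - d(a[2], 1, p),
         d(a[1], 1, p) - d(a[0], 2, p))
    return E, B

def phi_gauged(*p):
    """phi - d(theta)/dt, equation (27.34)."""
    return phi(*p) - d(theta, 0, p)

def A_gauged(*p):
    """A + grad(theta), equation (27.34)."""
    return tuple(A(*p)[k] + d(theta, k + 1, p) for k in range(3))

def max_field_change(p):
    """Largest change in any component of E or B under the gauge transformation."""
    E0, B0 = fields(phi, A, p)
    E1, B1 = fields(phi_gauged, A_gauged, p)
    return max(abs(u - v) for u, v in zip(E0 + B0, E1 + B1))

point = (0.3, 0.7, -0.2, 1.1)
print(max_field_change(point))  # tiny compared with the fields themselves
```

The residual is at the level of rounding error because mixed central differences commute, mirroring the cancellation of $\nabla \dot\theta$ in $\mathbf{E}$ and the vanishing of $\nabla \times \nabla \theta$ in $\mathbf{B}$.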


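Following the path of previous sections, §27.1 and thereafter, decompose the vector potential $\mathbf{A}$ into its
scalar and vector parts

\[ \mathbf{A} = \underbrace{\nabla A_\parallel}_{\rm scalar} + \underbrace{\mathbf{A}_\perp}_{\rm vector} , \tag{27.35} \]

in which the vector part by definition satisfies the transversality condition $\nabla \cdot \mathbf{A}_\perp = 0$. Under a gauge
transformation (27.34), the potentials transform as

\begin{align}
\phi &\rightarrow \phi - \frac{\partial \theta}{\partial t} , \tag{27.36a} \\
A_\parallel &\rightarrow A_\parallel + \theta , \tag{27.36b} \\
\mathbf{A}_\perp &\rightarrow \mathbf{A}_\perp . \tag{27.36c}
\end{align}

Eliminating the gauge field $\theta$ yields 3 gauge-invariant potentials, comprising 1 scalar $\Phi$, and 1 vector $\mathbf{A}_\perp$
containing 2 degrees of freedom:

\begin{align}
\underbrace{\Phi}_{\rm scalar} &\equiv \phi + \frac{\partial A_\parallel}{\partial t} , \tag{27.37a} \\
\underbrace{\mathbf{A}_\perp}_{\rm vector} & . \tag{27.37b}
\end{align}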


This shows that the electromagnetic field contains 3 independent degrees of freedom, consisting of 1 scalar 
and 1 vector. 


Concept question 27.2. Are gauge-invariant potentials Lorentz-invariant? The potentials $\Phi$ and
$\mathbf{A}_\perp$, equations (27.37), are by construction gauge-invariant, but is this construction Lorentz-invariant? Do
$\Phi$ and $\mathbf{A}_\perp$ constitute the components of a 4-vector? Answer. No.
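In terms of the gauge-invariant potentials $\Phi$ and $\mathbf{A}_\perp$, equations (27.37), the electric and magnetic fields
are

\begin{align}
\mathbf{E} &= - \nabla \Phi - \frac{\partial \mathbf{A}_\perp}{\partial t} , \tag{27.38a} \\
\mathbf{B} &= \nabla \times \mathbf{A}_\perp . \tag{27.38b}
\end{align}

The sourced Maxwell's equations (27.32) thus become, in terms of $\Phi$ and $\mathbf{A}_\perp$,

\begin{align}
\underbrace{- \nabla^2 \Phi}_{\rm scalar} &= \underbrace{4\pi q}_{\rm scalar} , \tag{27.39a} \\
\underbrace{\nabla \dot\Phi}_{\rm scalar} - \underbrace{\Box \mathbf{A}_\perp}_{\rm vector} &= \underbrace{4\pi \nabla j_\parallel}_{\rm scalar} + \underbrace{4\pi \mathbf{j}_\perp}_{\rm vector} , \tag{27.39b}
\end{align}

where $\nabla j_\parallel$ and $\mathbf{j}_\perp$ are the scalar and vector parts of the current density $\mathbf{j}$. Equations (27.39) bear a striking
similarity to the Einstein equations (27.16). Only the vector part $\mathbf{A}_\perp$ satisfies a wave equation,

\[ \Box \mathbf{A}_\perp = - 4\pi \mathbf{j}_\perp , \tag{27.40} \]

while the scalar part $\Phi$ satisfies an instantaneous equation (27.39a), along with $\nabla \dot\Phi = 4\pi \nabla j_\parallel$ from the
scalar part of (27.39b), that seemingly violates causality. And just as Einstein's equations (27.16) enforce
conservation of energy-momentum, so also Maxwell's equations (27.39) enforce conservation of electric charge,

\[ \frac{\partial q}{\partial t} + \nabla \cdot \mathbf{j} = 0 , \tag{27.41} \]

or in 4-dimensional form

\[ \nabla_m j^m = 0 . \tag{27.42} \]

The fact that only the vector part $\mathbf{A}_\perp$ satisfies a wave equation (27.40) reflects physically the fact that
electromagnetic waves are transverse, and they contain only two propagating degrees of freedom, the vector,
or spin $\pm 1$, components.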

Why do Maxwell's equations (27.39) have this structure? Although equation (27.40) appears to be a local
wave equation for the vector part $\mathbf{A}_\perp$ of the potential sourced by the vector part $\mathbf{j}_\perp$ of the current, in fact the
wave equation is non-local because the decomposition of the potential and current into scalar and vector parts
is non-local (it involves the solution of a Laplacian equation, eq. (26.23)). It is only the sum $\mathbf{j} = \nabla j_\parallel + \mathbf{j}_\perp$ of
the scalar and vector parts of the current density that is local. Therefore, the Maxwell's equation (27.39b)
must have a scalar part to go along with the vector part, such that the source on the right hand side, the
current density $\mathbf{j}$, is local. Given this Maxwell equation (27.39b), the Maxwell equation (27.39a) then serves
precisely to enforce conservation of electric charge, equation (27.41).

Just as it is possible to regard the Einstein equations (27.16a) and (27.16b) as constraint equations
whose continued satisfaction is guaranteed by conservation of energy-momentum, so also the Maxwell
equation (27.39a) for $\Phi$ can be regarded as a constraint equation whose continued satisfaction is guaranteed
by conservation of electric charge. For charge conservation (27.41) coupled with the spatial Maxwell
equation (27.39b) ensures that

\[ \frac{\partial}{\partial t} \left( 4\pi q + \nabla^2 \Phi \right) = 0 , \tag{27.43} \]

the solution of which, subject to the condition that $4\pi q + \nabla^2 \Phi = 0$ initially, is $4\pi q + \nabla^2 \Phi = 0$ at all times,
which is precisely the Maxwell equation (27.39a).
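The step from charge conservation to equation (27.43) can be spelled out as follows. This is a sketch that assumes only the spatial Maxwell equation (27.39b), charge conservation (27.41), the transversality of $\mathbf{A}_\perp$ and $\mathbf{j}_\perp$, and that spatial derivatives commute with the wave operator in flat space.

```latex
\begin{align}
\nabla \cdot \bigl( \nabla \dot\Phi - \Box \mathbf{A}_\perp \bigr)
  &= 4\pi \, \nabla \cdot \bigl( \nabla j_\parallel + \mathbf{j}_\perp \bigr)
  && \text{divergence of (27.39b)} \\
\nabla^2 \dot\Phi
  &= 4\pi \, \nabla \cdot \mathbf{j}
  && ( \nabla \cdot \mathbf{A}_\perp = 0 , \ \nabla \cdot \mathbf{j}_\perp = 0 ) \\
  &= - 4\pi \, \frac{\partial q}{\partial t}
  && \text{charge conservation (27.41)} \\
\Longrightarrow \quad
\frac{\partial}{\partial t} \bigl( 4\pi q + \nabla^2 \Phi \bigr) &= 0 .
\end{align}
```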

In a system of charges and electromagnetic fields, equations of motion for the charges in the electro- 
magnetic field must be adjoined to the (Maxwell) equations of motion for the electromagnetic field. If the 
equations of motion for the charges are arranged to conserve charge, as they should, then the scalar Maxwell 
equation (27.39a) determines the scalar potential $\Phi$ on the initial hypersurface of constant time, but can be
discarded thereafter as redundant. 


Concept question 27.3. What parts of Maxwell's equations can be discarded? Is it possible to
discard the scalar part of the spatial Maxwell equation (27.39b), rather than the scalar equation (27.39a) for
$\Phi$? Project out the scalar part of equation (27.39b) by taking its divergence,

\[ \nabla^2 \bigl( 4\pi j_\parallel - \dot\Phi \bigr) = 0 . \tag{27.44} \]

Argue that the Maxwell equation (27.39a), coupled with charge conservation (27.41), ensures that
equation (27.44) is true, subject to the boundary condition that the current $\mathbf{j}$ vanish sufficiently rapidly at spatial
infinity, in accordance with the decomposition theorem of §26.8.1.


Since only gauge-invariant quantities have physical significance, it is legitimate to impose any condition
on the gauge field $\theta$. A gauge in which the potentials $\phi$ and $\mathbf{A}$ individually satisfy wave equations is Lorenz
(not Lorentz!) gauge, which consists of the Lorentz-invariant condition

\[ \nabla_m A^m = 0 . \tag{27.45} \]

Under a gauge transformation (27.34), the left hand side of equation (27.45) transforms as

\[ \nabla_m A^m \rightarrow \nabla_m A^m + \Box \theta , \tag{27.46} \]

and the Lorenz gauge condition (27.45) can be accomplished as a particular solution of the wave equation
for the gauge field $\theta$. In terms of the potentials $\phi$ and $A_\parallel$, the Lorenz gauge condition (27.45) is

\[ \frac{\partial \phi}{\partial t} + \nabla^2 A_\parallel = 0 . \tag{27.47} \]

In Lorenz gauge, Maxwell's equations (27.39) become

\begin{align}
\Box \phi &= - 4\pi q , \tag{27.48a} \\
\Box \mathbf{A} &= - 4\pi \mathbf{j} , \tag{27.48b}
\end{align}

which are manifestly wave equations for the potentials $\phi$ and $\mathbf{A}$.
Does the fact that the potentials $\phi$ and $\mathbf{A}$ in one particular gauge, Lorenz gauge, satisfy wave equations
necessarily guarantee that the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ satisfy wave equations? Yes, because it
follows from the definitions (27.30) of $\mathbf{E}$ and $\mathbf{B}$ that if the potentials $\phi$ and $\mathbf{A}$ satisfy wave equations, then
so also must the fields $\mathbf{E}$ and $\mathbf{B}$ themselves; but the fields $\mathbf{E}$ and $\mathbf{B}$ are gauge-invariant, so if they satisfy
wave equations in one gauge, then they must satisfy the same wave equations in any gauge.

In electromagnetism, the most physical choice of gauge is one in which the potentials $\phi$ and $\mathbf{A}$ coincide
with the gauge-invariant potentials $\Phi$ and $\mathbf{A}_\perp$, equations (27.37). This gauge, known as Coulomb gauge,
is accomplished by setting

\[ A_\parallel = 0 , \tag{27.49} \]

or equivalently

\[ \nabla \cdot \mathbf{A} = 0 . \tag{27.50} \]

The gravitational analogue of this gauge is the Newtonian gauge discussed in the next section but one, §27.8.

Does the fact that in Lorenz gauge the potentials $\phi$ and $\mathbf{A}$ propagate at the speed of light (in the absence
of sources, $j^m = 0$) imply that the gauge-invariant potentials $\Phi$ and $\mathbf{A}_\perp$ propagate at the speed of light?
No. The gauge-invariant potentials $\Phi$ and $\mathbf{A}_\perp$, equations (27.37), are related to the Lorenz gauge potentials
$\phi$ and $\mathbf{A}$ by a non-local decomposition.


27.7 Harmonic gauge 


The fact that all locally measurable gravitational perturbations do propagate causally, at the speed of light
in the absence of sources, can be demonstrated by choosing a particular gauge, harmonic gauge,
equation (27.51), which can be considered an analogue of the Lorenz gauge of electromagnetism, equation (27.45).
In harmonic gauge, all 10 of the tetrad gauge-invariant (i.e. symmetric) combinations $\varphi_{mn} + \varphi_{nm}$ of the vierbein
perturbations satisfy wave equations (27.56), and therefore propagate causally. This does not imply that the
scalar, vector, and tensor components of the vierbein perturbations individually propagate causally, because
the decomposition into scalar, vector, and tensor modes is non-local. In particular, of the coordinate and
tetrad gauge-invariant potentials $\Psi$, $\Phi$, $W_a$, and $h_{ab}$ defined by equations (27.13), only the tensor
potential $h_{ab}$ propagates causally. The situation is entirely analogous to that of electromagnetism, §27.6, where
in Lorenz gauge the potentials $\phi$ and $\mathbf{A}$ propagate causally, equations (27.48), yet of the gauge-invariant
potentials $\Phi$ and $\mathbf{A}_\perp$ defined by equations (27.37), only the vector potential $\mathbf{A}_\perp$ propagates causally.

Harmonic gauge is the set of 4 coordinate conditions

\[ \nabla^m ( \varphi_{mn} + \varphi_{nm} ) - \nabla_n \varphi^m{}_m = 0 , \tag{27.51} \]


equivalent to the vanishing of Fock's (1957) harmonic function (17.185). The conditions (27.51) are arranged
in a form that is tetrad gauge-invariant (the conditions depend only on the symmetric part of $\varphi_{mn}$). The
quantities on the left hand side of equations (27.51) transform under a coordinate gauge transformation, in
accordance with (27.9), as

\[ \nabla^m ( \varphi_{mn} + \varphi_{nm} ) - \nabla_n \varphi^m{}_m \rightarrow \nabla^m ( \varphi_{mn} + \varphi_{nm} ) - \nabla_n \varphi^m{}_m + \Box \epsilon_n . \tag{27.52} \]

The change $\Box \epsilon_n$ resulting from the coordinate gauge transformation is the 4-dimensional wave operator
acting on the coordinate shift $\epsilon_n$. Indeed, the harmonic gauge conditions (27.51) follow uniquely from the
requirements (a) that the change produced by a coordinate gauge transformation be $\Box \epsilon_n$, as suggested by
the analogous electromagnetic transformation (27.46), and (b) that the conditions be tetrad gauge-invariant.
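The harmonic gauge conditions (27.51) can be accomplished as a particular solution of the wave equation for
the coordinate shift $\epsilon_n$. In terms of the potentials defined by equations (27.6) and (27.13), the 4 harmonic
gauge conditions (27.51) are

\begin{align}
\dot\Psi + 3 \dot\Phi - \Box ( w + \tilde w - \dot h ) &= 0 , \tag{27.53a} \\
W_a - \Box ( h_a + \tilde h_a ) &= 0 , \tag{27.53b} \\
- \Psi + \Phi - \Box h &= 0 , \tag{27.53c}
\end{align}

or equivalently

\begin{align}
\Box ( w + \tilde w ) &= 4 \dot\Phi , \tag{27.54a} \\
\Box ( h_a + \tilde h_a ) &= W_a , \tag{27.54b} \\
\Box h &= - \Psi + \Phi . \tag{27.54c}
\end{align}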
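The equivalence of the two forms of the harmonic gauge conditions can be checked in a few lines. The sketch below takes equation (27.53a) to read $\dot\Psi + 3\dot\Phi - \Box ( w + \tilde w - \dot h ) = 0$ and equation (27.53c) to read $-\Psi + \Phi - \Box h = 0$, and uses the fact that in flat space the time derivative commutes with the wave operator.

```latex
\begin{align}
\Box h &= - \Psi + \Phi
  && \text{from (27.53c)} \\
\Box \dot h &= - \dot\Psi + \dot\Phi
  && \text{time derivative of the line above} \\
\Box ( w + \tilde w ) &= \dot\Psi + 3 \dot\Phi + \Box \dot h
  && \text{from (27.53a)} \\
  &= 4 \dot\Phi .
\end{align}
```

The vector condition (27.54b) is simply (27.53b) rearranged.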


Substituting equations (27.54) into the Einstein tensor $G_{mn}$, equation (27.16), leads, after some calculation,
to the result that in harmonic gauge,

\[ G_{mn} = \tfrac{1}{2} \Box \left( \varphi_{mn} + \varphi_{nm} - \eta_{mn} \varphi^k{}_k \right) , \tag{27.55} \]

or equivalently

\[ R_{mn} = \tfrac{1}{2} \Box ( \varphi_{mn} + \varphi_{nm} ) , \tag{27.56} \]

where $R_{mn}$ is the Ricci tensor. Equation (27.56) shows that in harmonic gauge, all tetrad gauge-invariant
(i.e. symmetric) combinations $\varphi_{mn} + \varphi_{nm}$ of the vierbein potentials propagate causally, at the speed of light
in vacuo, $R_{mn} = 0$. Although the result (27.56) is true only in a particular gauge, harmonic gauge, it follows
that all quantities that are (coordinate and tetrad) gauge-invariant, and that can be constructed from the
vierbein potentials $\varphi_{mn}$ and their derivatives (and are therefore local), must also propagate at the speed of
light.

The 4 coordinate gauge conditions (27.51) still leave 6 tetrad gauge conditions to be chosen at will. A
natural choice, in the sense that it leads to the greatest simplification of the tetrad connections $\Gamma_{kmn}$,
equations (27.15), is the 6 tetrad gauge conditions

\[ \tilde w = \tilde w_a = \tilde h = \tilde h_a = 0 . \tag{27.57} \]

Exercise 27.4. Einstein tensor in harmonic gauge. Confirm equation (27.56).


27.8 Newtonian (Copernican) gauge 


If the unperturbed background is Minkowski space, then the most physical gauge is one in which the 6 
perturbations retained coincide with the 6 coordinate and tetrad gauge-invariant perturbations (27.13). This 
gauge is called Newtonian gauge. Because in Newtonian gauge the perturbations are precisely the physical 
perturbations, if the perturbations are physically weak (small), then the perturbations in Newtonian gauge 
will necessarily be small. 

I think Newtonian gauge should be called Copernican gauge. Even though the solar system is a highly 
non-linear system, from the perspective of general relativity it is a weakly perturbed gravitating system. 
Applied to the solar system, Newtonian gauge effectively keeps the coordinates aligned with the classical 
Sun-centred Copernican coordinate frame. By contrast, the coordinates of synchronous gauge (§27.9), which 
are chosen to follow freely-falling bodies, would quickly collapse or get wound up by orbital motions if applied 
to the solar system, and would cease to provide a useful description. 

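Newtonian (Copernican) gauge sets

\[ w = \tilde w = \tilde w_a = h = \tilde h = h_a = \tilde h_a = 0 , \tag{27.58} \]

so that the retained perturbations are the 6 coordinate and tetrad gauge-invariant perturbations (27.13)

\begin{align}
\underbrace{\psi}_{\rm scalar} &= \Psi , \tag{27.59a} \\
\underbrace{\Phi}_{\rm scalar} & , \tag{27.59b} \\
\underbrace{w_a}_{\rm vector} &= W_a , \tag{27.59c} \\
\underbrace{h_{ab}}_{\rm tensor} & . \tag{27.59d}
\end{align}

In matrix form, the vierbein perturbation in Newtonian gauge, in a frame where the wavevector $\mathbf{k}$ is along
the $z$-direction, is, from equation (27.7),

\[ \varphi_{mn} = \begin{pmatrix}
\Psi & W_x & W_y & 0 \\
0 & \Phi + h_{xx} & h_{xy} & 0 \\
0 & h_{xy} & \Phi - h_{xx} & 0 \\
0 & 0 & 0 & \Phi
\end{pmatrix} . \tag{27.60} \]

The Newtonian line-element is, in a form that keeps the Newtonian tetrad manifest,

\[ ds^2 = - \left[ (1 + \Psi) \, dt \right]^2 + \delta_{ab} \left[ (1 - \Phi) \, dx^a - h^a{}_c \, dx^c - W^a dt \right] \left[ (1 - \Phi) \, dx^b - h^b{}_d \, dx^d - W^b dt \right] , \tag{27.61} \]

which reduces to the Newtonian metric

\[ ds^2 = - (1 + 2 \Psi) \, dt^2 - 2 \, W_a \, dt \, dx^a + \left[ \delta_{ab} (1 - 2 \Phi) - 2 \, h_{ab} \right] dx^a dx^b . \tag{27.62} \]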


Since scalar, vector, and tensor perturbations evolve independently, it is legitimate to consider each in
isolation. For example, if one is interested only in scalar perturbations, then it is fine to keep only the
scalar potentials $\Psi$ and $\Phi$ non-zero. Furthermore, as discussed in §27.12, since the difference $\Psi - \Phi$ in
scalar potentials is sourced by anisotropic relativistic pressure, which is typically small, it is often a good
approximation to set $\Psi = \Phi$.

The tetrad-frame 4-velocity of a person at rest in the tetrad frame is by definition $u^m \equiv dx^m / d\tau =
\{1, 0, 0, 0\}$, and the corresponding coordinate 4-velocity $u^\mu$ is, in Newtonian gauge,

\[ u^\mu = e_0{}^\mu = \{ 1 - \Psi , W_a \} . \tag{27.63} \]

This shows that $W_a$ can be interpreted as a 3-velocity at which the tetrad frame is moving through the
coordinates. This is the "dragging of inertial frames" discussed in §27.11. The proper acceleration experienced
by a person at rest in the tetrad frame, with tetrad 4-velocity $u^m = \{1, 0, 0, 0\}$, is

\[ \frac{D u^a}{D \tau} = u^0 D_0 u^a = u^0 \left( \partial_0 u^a + \Gamma^a{}_{00} u^0 \right) = \Gamma^a{}_{00} = \nabla_a \Psi . \tag{27.64} \]

This shows that the "gravity," or minus the proper acceleration, experienced by a person at rest in the tetrad
frame is minus the gradient of the potential $\Psi$.


Concept question 27.5. Independent evolution of scalar, vector, and tensor modes. If the decom- 
position into scalar, vector, and tensor modes is non-local, how can it be legitimate to consider the evolution 
of the modes in isolation from each other? 


27.9 Synchronous gauge 


One of the earliest gauges used in general relativistic perturbation theory, and still (in its conformal version) 
widely used in cosmology, is synchronous gauge. As will be seen below, equations (27.71) and (27.72), 
synchronous gauge effectively chooses a coordinate system and tetrad that is attached to the locally inertial 
frames of freely falling observers. This is fine as long as the observers move only slightly from their initial 
positions, but the coordinate system will fail when the system evolves too far, even if, as in the solar system, 
the gravitational perturbations remain weak and therefore treatable in principle with perturbation theory. 

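Synchronous gauge sets the time components $\varphi_{mn}$ with $m = 0$ or $n = 0$ of the vierbein perturbations to
zero

\[ \psi = w = \tilde w = w_a = \tilde w_a = 0 , \tag{27.65} \]

and makes the additional tetrad gauge choices

\[ \tilde h = \tilde h_a = 0 , \tag{27.66} \]

with the result that the retained perturbations are the spatial perturbations

\[ \underbrace{\Phi}_{\rm scalar} , \quad \underbrace{h}_{\rm scalar} , \quad \underbrace{h_a}_{\rm vector} , \quad \underbrace{h_{ab}}_{\rm tensor} . \tag{27.67} \]

In terms of these spatial perturbations, the gauge-invariant perturbations (27.13) are

\begin{align}
\underbrace{\Psi}_{\rm scalar} &= \ddot h , \tag{27.68a} \\
\underbrace{\Phi}_{\rm scalar} & , \tag{27.68b} \\
\underbrace{W_a}_{\rm vector} &= - \dot h_a , \tag{27.68c} \\
\underbrace{h_{ab}}_{\rm tensor} & . \tag{27.68d}
\end{align}

The synchronous line-element is, in a form that keeps the synchronous tetrad manifest,

\[ ds^2 = - dt^2 + \delta_{ab} \left[ (1 - \Phi) \, dx^a - ( \nabla^a \nabla_c h + \nabla_c h^a + h^a{}_c ) \, dx^c \right] \left[ (1 - \Phi) \, dx^b - ( \nabla^b \nabla_d h + \nabla_d h^b + h^b{}_d ) \, dx^d \right] , \tag{27.69} \]

which reduces to the synchronous metric

\[ ds^2 = - dt^2 + \left[ (1 - 2 \Phi) \, \delta_{ab} - 2 \nabla_a \nabla_b h - \nabla_a h_b - \nabla_b h_a - 2 \, h_{ab} \right] dx^a dx^b . \tag{27.70} \]

In synchronous gauge, a person at rest in the tetrad frame has coordinate 4-velocity

\[ u^\mu = e_0{}^\mu = \{ 1, 0, 0, 0 \} , \tag{27.71} \]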


so that the tetrad rest frame coincides with the coordinate rest frame, and proper time in the rest frame
coincides with coordinate time, $\tau = t$. Moreover a person at rest in the tetrad frame is freely falling, which
follows from the fact that the acceleration experienced by a person at rest in the tetrad frame is zero,

\[ \frac{D u^a}{D \tau} = u^0 \left( \partial_0 u^a + \Gamma^a{}_{00} u^0 \right) = \Gamma^a{}_{00} = 0 , \tag{27.72} \]

in which $\partial_0 u^a = 0$ because the 4-velocity at rest in the tetrad frame is constant, $u^m = \{1, 0, 0, 0\}$, and
$\Gamma^a{}_{00} = 0$ from equations (27.15a) with the synchronous gauge choices (27.65) and (27.66). However, the
freely falling person's locally inertial frame is rotated relative to the tetrad frame. The cumulative rotation
is described by a rotor $R = e^{-\theta/2}$ generated by a bivector $\theta = \tfrac{1}{2} \theta_{ab} \, \gamma^a \wedge \gamma^b$ (the factor of $\tfrac{1}{2}$ would disappear
if the sum were over distinct pairs $ab$ of antisymmetric indices) that is the integral of the tetrad connection
$\Gamma_0 = \tfrac{1}{2} \Gamma_{ab0} \, \gamma^a \wedge \gamma^b$ over time, as follows from $\partial_0 a = - \tfrac{1}{2} [ \Gamma_0 , a ]$ for any multivector $a$, equation (15.15). From
equations (27.15c) and (27.68c) for $\Gamma_{ab0}$, the bivector $\theta_{ab}$ is

\[ \theta_{ab} = \int \Gamma_{ab0} \, d\tau = \tfrac{1}{2} ( \nabla_a h_b - \nabla_b h_a ) , \tag{27.73} \]

which is the curl of the vector potential $h_a$.


27.10 Newtonian potential 


The next few sections examine the physical meaning of each of the gauge-invariant potentials $\Psi$, $\Phi$, $W_a$, and
$h_{ab}$ by looking at the potentials at large distances produced by a finite body containing energy-momentum,
such as the Sun.

Einstein's equations $G_{mn} = 8\pi T_{mn}$ applied to the time-time component $G_{00}$ of the Einstein tensor,
equation (27.16a), imply Poisson's equation

\[ \nabla^2 \Phi = 4\pi \rho , \tag{27.74} \]

where $\rho$ is the mass-energy density

\[ \rho \equiv T^{00} . \tag{27.75} \]

The solution of Poisson's equation (27.74) is

\[ \Phi(\mathbf{x}) = - \int \frac{\rho(\mathbf{x}') \, d^3x'}{| \mathbf{x}' - \mathbf{x} |} . \tag{27.76} \]

Consider a finite body, for example the Sun, whose energy-momentum is confined within a certain region.
Define the mass $M$ of the body to be the integral of the mass-energy density $\rho$,

\[ M \equiv \int \rho(\mathbf{x}') \, d^3x' . \tag{27.77} \]

Equation (27.77) agrees with what the definition of the mass $M$ would be in the non-relativistic limit, and
as seen below, equation (27.80), it is what a distant observer would infer the mass of the body to be based
on its gravitational potential $\Phi$ far away. Thus equation (27.77) can be taken as the definition of the mass
of the body even when the energy-momentum is relativistic. Choose the origin of the coordinates to be at
the centre of mass, meaning that

\[ \int \mathbf{x}' \rho(\mathbf{x}') \, d^3x' = 0 . \tag{27.78} \]

Consider the potential $\Phi$ at a point $\mathbf{x}$ far outside the body. Expand the denominator of the integral on the
right hand side of equation (27.76) as a Taylor series in $1/x$ where $x \equiv |\mathbf{x}|$

\[ \frac{1}{| \mathbf{x}' - \mathbf{x} |} = \frac{1}{x} \sum_{\ell = 0}^{\infty} \left( \frac{x'}{x} \right)^{\ell} P_\ell ( \hat{\mathbf{x}} \cdot \hat{\mathbf{x}}' ) = \frac{1}{x} + \frac{\hat{\mathbf{x}} \cdot \mathbf{x}'}{x^2} + \cdots . \tag{27.79} \]

The monopole term integrates to the mass (27.77), and the dipole term vanishes by the centre-of-mass
condition (27.78), so that

\[ \Phi(\mathbf{x}) = - \frac{M}{x} + O(x^{-3}) . \tag{27.80} \]

Equation (27.80) shows that the potential far from a body goes as $\Phi = - M/x$, reproducing the usual
Newtonian formula.
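The far-field behaviour (27.80) can be illustrated numerically. The sketch below is not from the text: it assumes a hypothetical "body" made of two point masses (in units $G = 1$) placed so that the centre of mass is at the origin, and compares the exact potential with the monopole term $-M/x$.

```python
import math

# Hypothetical two-point-mass "body" (G = 1); positions chosen so that the
# centre of mass sits at the origin, as in equation (27.78).
MASSES = [1.0, 2.0]
POSITIONS = [(1.0, 0.0, 0.0), (-0.5, 0.0, 0.0)]
M = sum(MASSES)

def potential(x):
    """Exact potential Phi(x) = -sum_i m_i / |x_i - x|, cf. equation (27.76)."""
    return -sum(m / math.dist(p, x) for m, p in zip(MASSES, POSITIONS))

def residual(x):
    """Difference between the exact potential and the monopole term -M/|x|."""
    return potential(x) + M / math.hypot(*x)

for r in (10.0, 100.0, 1000.0):
    x = (r, 0.0, 0.0)
    # residual * r**3 settles toward a constant if the error is O(r**-3)
    print(r, residual(x) * r**3)
```

For this configuration the scaled residual approaches the quadrupole coefficient, about $-1.5$ on the axis through the two masses, confirming that no $x^{-2}$ dipole term survives.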
27.11 Dragging of inertial frames


In Newtonian gauge, the vector potential $\mathbf{W} \equiv W_a$ is the velocity at which the locally inertial tetrad frame
moves through the coordinates, equation (27.63). This is called the dragging of inertial frames. As shown
below, a body of angular momentum $\mathbf{L}$ drags frames around it with an angular velocity that goes to $2 \mathbf{L} / x^3$
at large distances $x$.

Einstein's equations applied to the vector part of the time-space component $G_{0a}$ of the Einstein tensor,
equation (27.16b), imply

\[ \nabla^2 \mathbf{W} = - 16\pi \mathbf{f} , \tag{27.81} \]

where $\mathbf{W} \equiv W_a$ is the gauge-invariant vector potential, and $\mathbf{f}$ is the vector part of the energy flux $T^{0a}$

\[ \mathbf{f} \equiv f_a \equiv \underbrace{T^{0a}}_{\rm vector} = - \underbrace{T_{0a}}_{\rm vector} . \tag{27.82} \]

The solution of equation (27.81) is

\[ \mathbf{W}(\mathbf{x}) = 4 \int \frac{\mathbf{f}(\mathbf{x}') \, d^3x'}{| \mathbf{x}' - \mathbf{x} |} . \tag{27.83} \]

As in the previous section, §27.10, consider a finite body, such as the Sun, whose energy-momentum is
confined within a certain region. Work in the rest frame of the body, defined to be the frame where the
energy flux $\mathbf{f}$ integrated over the body is zero,

\[ \int \mathbf{f}(\mathbf{x}') \, d^3x' = 0 . \tag{27.84} \]

Define the angular momentum $\mathbf{L}$ of the body to be

\[ \mathbf{L} \equiv \int \mathbf{x}' \times \mathbf{f}(\mathbf{x}') \, d^3x' . \tag{27.85} \]


Equation (27.85) agrees with what the definition of angular momentum would be in the non-relativistic limit,
where the mass-energy flux of a mass density $\rho$ moving at velocity $\mathbf{v}$ is $\mathbf{f} = \rho \mathbf{v}$. As will be seen below, the
angular momentum (27.85) is what a distant observer would infer the angular momentum of the body to be
based on the potential $\mathbf{W}$ far away, and equation (27.85) can be taken to be the definition of the angular
momentum of the body even when the energy-momentum is relativistic. As will be proven momentarily,
equation (27.86), the integral $\int x'_a f_b(\mathbf{x}') \, d^3x'$ is antisymmetric in $ab$. To show this, write $f_b = \varepsilon_{bcd} \nabla_c \phi_d$ for
some potential $\phi_d$, which is valid because $f_b$ is the vector (curl) part of the energy flux. Then

\[ \int x'_a f_b \, d^3x' = \int x'_a \, \varepsilon_{bcd} \nabla'_c \phi_d \, d^3x' = - \int \varepsilon_{bcd} \, \phi_d \, \nabla'_c x'_a \, d^3x' = \int \varepsilon_{abd} \, \phi_d \, d^3x' , \tag{27.86} \]

where the third expression follows from the second by integration by parts, the surface term vanishing
because of the assumption that the energy-momentum of the body is confined within a certain region.
Taylor expanding equation (27.83) using equation (27.79) gives

\begin{align}
\mathbf{W}(\mathbf{x}) &= \frac{4}{x} \int \mathbf{f}(\mathbf{x}') \, d^3x' + \frac{4}{x^2} \int ( \hat{\mathbf{x}} \cdot \mathbf{x}' ) \, \mathbf{f}(\mathbf{x}') \, d^3x' + O(x^{-3}) \\
&= \frac{2}{x^2} \int \left[ ( \hat{\mathbf{x}} \cdot \mathbf{x}' ) \, \mathbf{f}(\mathbf{x}') - ( \hat{\mathbf{x}} \cdot \mathbf{f}(\mathbf{x}') ) \, \mathbf{x}' \right] d^3x' + O(x^{-3}) \\
&= \frac{2}{x^2} \, \mathbf{L} \times \hat{\mathbf{x}} + O(x^{-3}) , \tag{27.87}
\end{align}

where the first integral on the right hand side of the first line of equation (27.87) vanishes because the frame
is the rest frame of the body, equation (27.84), and the second integral on the right hand side of the first line
equals the first integral on the second line thanks to the antisymmetry of $\int x'_a f_b(\mathbf{x}') \, d^3x'$, equation (27.86).
The vector potential $\mathbf{W} \equiv W_a$ points in the direction of rotation, right-handedly about the axis of angular
momentum $\mathbf{L}$. Equation (27.87) says that a body of angular momentum $\mathbf{L}$ drags frames around it at angular
velocity $\boldsymbol{\Omega}$ at large distances $x$

\[ \mathbf{W} = \boldsymbol{\Omega} \times \mathbf{x} , \quad \boldsymbol{\Omega} = \frac{2 \mathbf{L}}{x^3} . \tag{27.88} \]
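The far-field result (27.87) can be checked against a direct sum for a simple source. The sketch below is illustrative only: it assumes a hypothetical ring of `N` equally spaced point elements with tangential flux $\mathbf{f}_i = m \mathbf{v}_i$ (units $G = c = 1$), for which the total flux vanishes and $\sum x_a f_b$ is antisymmetric, and it replaces the integrals (27.83) and (27.85) by sums.

```python
import math

# Hypothetical rotating ring of N point "flux elements" f_i = m * v_i (G = c = 1).
N, RADIUS, M_EACH, SPEED = 8, 1.0, 1.0, 0.01
ANGLES = [2.0 * math.pi * i / N for i in range(N)]
POS = [(RADIUS * math.cos(a), RADIUS * math.sin(a), 0.0) for a in ANGLES]
FLUX = [(-M_EACH * SPEED * math.sin(a), M_EACH * SPEED * math.cos(a), 0.0)
        for a in ANGLES]

def cross(a, b):
    """Ordinary 3-vector cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum():
    """L = sum_i x_i cross f_i, the discrete analogue of equation (27.85)."""
    terms = [cross(p, f) for p, f in zip(POS, FLUX)]
    return tuple(sum(t[k] for t in terms) for k in range(3))

def w_exact(x):
    """W(x) = 4 sum_i f_i / |x_i - x|, the discrete analogue of equation (27.83)."""
    return tuple(4.0 * sum(f[k] / math.dist(p, x) for p, f in zip(POS, FLUX))
                 for k in range(3))

def w_far(x):
    """Leading far-field term (2 / x^2) L cross x_hat, equation (27.87)."""
    r = math.hypot(*x)
    l_cross_xhat = cross(angular_momentum(), tuple(c / r for c in x))
    return tuple(2.0 * c / r**2 for c in l_cross_xhat)

x = (60.0, 0.0, 80.0)          # a field point 100 ring radii away
print(w_exact(x), w_far(x))    # the two should agree closely
```

The agreement improves as the field point moves farther away, consistent with the stated remainder in equation (27.87).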


Exercise 27.6. Gravity Probe B and the geodetic and frame-dragging precession of gyroscopes.
The purpose of Gravity Probe B was to measure the predicted general relativistic precession of a gyroscope
in the gravitational field of the Earth. Consider a gyroscope that is in free fall in a spacecraft in orbit around
the Earth. In the gyro rest frame, the spin 4-vector $\sigma^m$ of the gyro has only spatial components

\[ \sigma^m \equiv \{ 0 , \sigma^a \} . \tag{27.89} \]

If the gyroscope is moving at 4-velocity $u^m$ relative to the tetrad (Earth) frame, then the components $s^m$ of
the spin vector in the tetrad frame are related to those $\sigma^m$ in the gyro frame by a Lorentz boost at 4-velocity
$-u^m$ (early alphabet indices $a$, $b$, ... signify spatial components):

\[ \{ s^0 , s^a \} = \left\{ \sigma_b u^b , \ \sigma^a + \frac{\sigma_b u^b u^a}{1 + u^0} \right\} . \tag{27.90} \]

Conversely, the components $\sigma^a$ of the spin vector in the gyro frame are related to those $s^m$ in the tetrad
frame by

\[ \sigma^a = s^a - \frac{s^0 u^a}{1 + u^0} . \tag{27.91} \]
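The boost relations (27.90) and (27.91) can be sanity-checked numerically. The sketch below is illustrative: the spin and velocity values are arbitrary hypothetical numbers, and it assumes units $c = 1$ with $u^0 = \sqrt{1 + |\mathbf{u}|^2}$.

```python
import math

def boost_spin(sigma, u):
    """Tetrad-frame spin {s^0, s^a} from gyro-frame spin sigma^a, equation (27.90).

    u is the spatial part u^a of the 4-velocity; u^0 = sqrt(1 + |u|^2) with c = 1.
    """
    u0 = math.sqrt(1.0 + sum(c * c for c in u))
    sigma_dot_u = sum(a * b for a, b in zip(sigma, u))
    s = tuple(a + sigma_dot_u * b / (1.0 + u0) for a, b in zip(sigma, u))
    return sigma_dot_u, s

def unboost_spin(s0, s, u):
    """Gyro-frame spin sigma^a from tetrad-frame components, equation (27.91)."""
    u0 = math.sqrt(1.0 + sum(c * c for c in u))
    return tuple(a - s0 * b / (1.0 + u0) for a, b in zip(s, u))

sigma = (0.2, -0.5, 0.8)      # hypothetical gyro-frame spin
u = (1.5, 0.3, -0.7)          # hypothetical spatial part of a relativistic 4-velocity
s0, s = boost_spin(sigma, u)
u0 = math.sqrt(1.0 + sum(c * c for c in u))

# The spin 4-vector stays orthogonal to the 4-velocity: -s^0 u^0 + s.u = 0
print(-s0 * u0 + sum(a * b for a, b in zip(s, u)))
# The round trip recovers the gyro-frame spin
print(unboost_spin(s0, s, u))
```

Orthogonality of $s^m$ to $u^m$ and preservation of the invariant length $-(s^0)^2 + s_a s^a = \sigma_a \sigma^a$ are what one expects of a Lorentz boost.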


The gyro is in free-fall in orbit about the Earth, so its 4-velocity $u^m$ and 4-spin $s^m$ satisfy the geodesic
equations of motion

\[ \frac{d u^k}{d \tau} + \Gamma^k{}_{mn} u^m u^n = 0 , \quad \frac{d s^k}{d \tau} + \Gamma^k{}_{mn} s^m u^n = 0 . \tag{27.92} \]
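1. Spin equation. Show that

\[ \frac{d \sigma_a}{d \tau} = \sigma^b \left[ - \Gamma_{abc} u^c - \Gamma_{ab0} u^0 + \frac{u^c}{1 + u^0} \left( \Gamma_{0ac} u_b - \Gamma_{0bc} u_a \right) + \frac{u^0}{1 + u^0} \left( \Gamma_{0a0} u_b - \Gamma_{0b0} u_a \right) \right] . \tag{27.93} \]

[Hint: The first step is to convert $\sigma^a$ to $s^a$ and $u^a$, using equation (27.91). Then apply the geodesic
equations (27.92). Then convert $s^a$ back to $\sigma^a$ using equation (27.90).]

2. Spin precession. Gravitational fields in the solar system are weak, so perturbation theory in Minkowski
space is valid. The tetrad connections $\Gamma_{kmn}$ in Newtonian gauge are, from equations (27.15),

\begin{align}
\Gamma_{0a0} &= - \nabla_a \Psi , \tag{27.94a} \\
\Gamma_{0ab} &= \delta_{ab} \dot\Phi + \dot h_{ab} - \tfrac{1}{2} ( \nabla_a W_b + \nabla_b W_a ) , \tag{27.94b} \\
\Gamma_{ab0} &= \tfrac{1}{2} ( \nabla_a W_b - \nabla_b W_a ) , \tag{27.94c} \\
\Gamma_{abc} &= ( \delta_{bc} \nabla_a - \delta_{ac} \nabla_b ) \Phi + \nabla_a h_{bc} - \nabla_b h_{ac} . \tag{27.94d}
\end{align}

Show from equation (27.93) that the spin $\boldsymbol{\sigma} \equiv \sigma^a$ of a freely-falling gyroscope moving at 3-velocity
$\mathbf{v} \equiv \mathbf{u} / u^0$ in a weak gravitational field evolves as (the proper time derivative $d/d\tau$ in equation (27.93)
can be converted to the coordinate time derivative $d/dt$ by dividing by $u^0 = dt/d\tau$)

\[ \frac{d \boldsymbol{\sigma}}{dt} = \boldsymbol{\sigma} \times \left[ \mathbf{v} \times \nabla \Phi + \frac{u^0}{1 + u^0} \, \mathbf{v} \times \nabla \Psi - \frac{1}{2} \nabla \times \mathbf{W} + \frac{(u^0)^2}{1 + u^0} \, \mathbf{v} \times \left( \tfrac{1}{2} v^c ( \nabla W_c + \nabla_c \mathbf{W} ) - v^c \dot{\mathbf{h}}_c \right) - v^c \, \nabla \times \mathbf{h}_c \right] , \tag{27.95} \]

where the vector of vectors $\mathbf{h}_c$ is shorthand for the tensor potential, $\mathbf{h}_c \equiv h_{ac}$. Conclude that at
non-relativistic velocities, $|\mathbf{u}| \ll u^0 \approx 1$, and for $\Psi = \Phi$ and $h_{ab} = 0$, equation (27.95) reduces to

\[ \frac{d \boldsymbol{\sigma}}{dt} = \boldsymbol{\sigma} \times \left( \frac{3}{2} \, \mathbf{v} \times \nabla \Phi - \frac{1}{2} \nabla \times \mathbf{W} \right) . \tag{27.96} \]

By comparing your equation (27.96) to the equation of motion of a 3-vector rotating at angular velocity
$\boldsymbol{\omega}$,

\[ \frac{d \boldsymbol{\sigma}}{dt} = \boldsymbol{\omega} \times \boldsymbol{\sigma} , \tag{27.97} \]

deduce the angular velocity $\boldsymbol{\omega}$ with which the spin $\boldsymbol{\sigma}$ precesses. The term depending on $\Phi$ is the geodetic,
or de Sitter (de Sitter, 1916), precession, while the term depending on $\mathbf{W}$ is the frame-dragging, or
Lense-Thirring (Thirring, 1918; Lense and Thirring, 1918), precession. [Hint: Recall the 3-vector formula
$\mathbf{a} \times ( \mathbf{b} \times \mathbf{c} ) = ( \mathbf{a} \cdot \mathbf{c} ) \mathbf{b} - ( \mathbf{a} \cdot \mathbf{b} ) \mathbf{c}$. If the object is non-relativistic, then $|\mathbf{u}| \ll u^0 \approx 1$.]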

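3. Angular velocities. A body of mass $M$ and angular momentum $\mathbf{L}$ produces scalar and vector
perturbations $\Phi$ and $\mathbf{W}$ at spatial position $\mathbf{x}$ of, equations (27.80) and (27.87),

\[ \Phi(\mathbf{x}) = - \frac{M}{x} , \quad \mathbf{W}(\mathbf{x}) = \frac{2}{x^2} \, \mathbf{L} \times \hat{\mathbf{x}} . \tag{27.98} \]

Show that for a circular orbit right-handed about direction $\mathbf{n}$, so that $\mathbf{v} = v \, ( \mathbf{n} \times \hat{\mathbf{x}} )$, the geodetic/de
Sitter precession is, with units restored,

\[ \boldsymbol{\omega}_{\rm dS} = \frac{3 (GM)^{3/2}}{2 c^2 x^{5/2}} \, \mathbf{n} , \tag{27.99} \]

while the frame-dragging/Lense-Thirring precession is

\[ \boldsymbol{\omega}_{\rm LT} = \frac{G}{c^2 x^3} \left[ - \mathbf{L} + 3 \hat{\mathbf{x}} ( \hat{\mathbf{x}} \cdot \mathbf{L} ) \right] . \tag{27.100} \]

[Hint: You will need to use the relation between velocity $v$ and potential $\Phi$ in a circular orbit.]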

4. Orbit. What is the orbit-averaged angular velocity for frame-dragging precession in the cases of (i) an 
equatorial circular orbit, (ii) a polar circular orbit? Compare the directions of the geodetic and frame- 
dragging precessions in the two cases. Gravity Probe B occupied a polar orbit. Why was that a good 
strategy? 

5. Gravity Probe B. Estimate the angular velocity of the geodetic and frame-dragging precessions for
Gravity Probe B. Express your answer in arcseconds per year. [Hint: The GPB fact sheet at
https://einstein.stanford.edu/content/fact_sheet/GPB_FactSheet-0405.pdf gives the semi-major axis of GPB's
orbit as 7027.4 km. The IAU 2009 system of astronomical constants (Luzum et al., 2009) gives $GM =
3.9860044 \times 10^{14} \, {\rm m^3 \, s^{-2}}$ for the Earth. The Earth fact sheet at
https://nssdc.gsfc.nasa.gov/planetary/factsheet/earthfact.html gives needed information about the Earth,
including its moment of inertia.]

6. Quadrupole precession. There is also a purely Newtonian precession that is produced by plain old 
Newtonian gravity on an object with a quadrupole moment. If you wanted to test the geodetic and frame- 
dragging effects with a gyroscope in orbit around the Earth, what would you do to avoid contamination 
by Newtonian quadrupole precession? 
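The arithmetic of part 5 can be organised as in the following sketch, offered as one possible way to set up the estimate rather than as a definitive solution. It uses $GM$ and the orbital radius quoted in the exercise; the Earth radius, moment-of-inertia factor, and rotation period are assumed round values that should be checked against the cited fact sheets. The frame-dragging rate uses the orbit average of equation (27.100) for a circular polar orbit, on the assumption that the average of $\hat{\mathbf{x}} ( \hat{\mathbf{x}} \cdot \mathbf{L} )$ around such an orbit is $\mathbf{L}/2$.

```python
import math

# Input values: GM and the orbital radius are quoted in part 5 of the exercise;
# the remaining Earth parameters are assumed round values that should be
# checked against the fact sheets cited there.
GM = 3.9860044e14            # m^3 s^-2
R_ORBIT = 7.0274e6           # m, semi-major axis of the GPB orbit
C = 2.99792458e8             # m s^-1
R_EARTH = 6.371e6            # m (assumed mean radius)
INERTIA_FACTOR = 0.33        # I / (M R^2) (assumed)
SIDEREAL_DAY = 86164.1       # s (assumed)

RAD_PER_S_TO_ARCSEC_PER_YR = (180.0 / math.pi) * 3600.0 * 365.25 * 86400.0

def geodetic_rate():
    """|omega_dS| = 3 (GM)^(3/2) / (2 c^2 r^(5/2)), equation (27.99), in rad/s."""
    return 1.5 * GM**1.5 / (C**2 * R_ORBIT**2.5)

def frame_dragging_rate_polar():
    """Orbit average of equation (27.100) for a circular polar orbit, in rad/s.

    Averaging x_hat (x_hat . L) around a polar orbit gives L / 2, so the
    bracket averages to L / 2 and the rate is G L / (2 c^2 r^3).
    """
    spin_rate = 2.0 * math.pi / SIDEREAL_DAY
    gl = INERTIA_FACTOR * GM * R_EARTH**2 * spin_rate   # G times L
    return gl / (2.0 * C**2 * R_ORBIT**3)

print(geodetic_rate() * RAD_PER_S_TO_ARCSEC_PER_YR)              # several arcsec/yr
print(frame_dragging_rate_polar() * RAD_PER_S_TO_ARCSEC_PER_YR)  # tens of milliarcsec/yr
```

The two rates differ by more than two orders of magnitude, which is why the relative orientation of the two precessions (part 4) matters for separating them.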


27.12 Quadrupole pressure 
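Einstein's equations applied to the part of the Einstein tensor (27.16c) involving $\Psi - \Phi$ imply

\[ \nabla^2 ( \Psi - \Phi ) = - 8\pi Q_{ab} T_{ab} , \tag{27.101} \]

where $Q_{ab}$ is the quadrupole operator (an integro-differential operator) defined by

\[ Q_{ab} \equiv \tfrac{3}{2} \nabla_a \nabla_b \nabla^{-2} - \tfrac{1}{2} \delta_{ab} , \tag{27.102} \]

with $\nabla^{-2}$ the inverse spatial Laplacian operator. In Fourier space, the quadrupole operator is

\[ Q_{ab} = \tfrac{3}{2} \hat k_a \hat k_b - \tfrac{1}{2} \delta_{ab} . \tag{27.103} \]

The quadrupole operator $Q_{ab}$ yields zero when acting on $\delta_{ab}$ (that is, $Q_{ab}$ is traceless), and the Laplacian
operator $\nabla^2$ when acting on $\nabla_a \nabla_b$

\[ Q_{ab} \delta_{ab} = 0 , \quad Q_{ab} \nabla_a \nabla_b = \nabla^2 . \tag{27.104} \]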
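The two properties (27.104) are easy to confirm in Fourier space, where $\nabla_a \rightarrow i k_a$ so that $Q_{ab} \nabla_a \nabla_b = \nabla^2$ becomes $Q_{ab} k_a k_b = k^2$. The sketch below uses an arbitrary hypothetical wavevector and the Fourier-space form (27.103).

```python
def quadrupole_operator(k):
    """Fourier-space quadrupole operator Q_ab = (3/2) k_a k_b / k^2 - (1/2) delta_ab."""
    k2 = sum(c * c for c in k)
    return [[1.5 * k[a] * k[b] / k2 - (0.5 if a == b else 0.0) for b in range(3)]
            for a in range(3)]

k = (0.3, -1.2, 2.0)                     # hypothetical wavevector
Q = quadrupole_operator(k)

trace = sum(Q[a][a] for a in range(3))   # Q_ab delta_ab
qkk = sum(Q[a][b] * k[a] * k[b] for a in range(3) for b in range(3))
k2 = sum(c * c for c in k)

print(trace)      # vanishes up to rounding: Q_ab is traceless
print(qkk, k2)    # equal up to rounding: Q_ab k_a k_b = k^2
```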
The solution of equation (27.101) is

\[ \Psi - \Phi = - \int \left[ \frac{3}{2} \, \frac{( x_a - x'_a )( x_b - x'_b )}{| \mathbf{x} - \mathbf{x}' |^2} - \frac{1}{2} \delta_{ab} \right] \frac{T_{ab}(\mathbf{x}') \, d^3x'}{| \mathbf{x} - \mathbf{x}' |} . \tag{27.105} \]

Taylor expanding equation (27.105) using equation (27.79) yields $\Psi - \Phi$ at large distance in the $x$-direction
from a finite body,

\[ \Psi - \Phi = - \frac{1}{x} \int \left[ T_{xx} - \tfrac{1}{2} ( T_{yy} + T_{zz} ) \right] d^3x' + O(x^{-2}) . \tag{27.106} \]

Equation (27.101) shows that the source of the difference $\Psi - \Phi$ between the two scalar potentials is the
quadrupole pressure. Since the quadrupole pressure is small if either there are no relativistic sources, or any
relativistic sources are isotropic, it is often a good approximation to set $\Psi = \Phi$. An exception is where there
is a significant anisotropic relativistic component. For example, the energy-momentum tensor of a static
electric field is relativistic and anisotropic.

One situation where the difference between $\Psi$ and $\Phi$ is appreciable is the case of freely-streaming photons
(and neutrinos) at around the time of recombination in cosmology. The 2008 analysis of the CMB by the
WMAP team claims to detect a non-zero value of $\Psi - \Phi$ from a slight shift in the third acoustic peak.


Exercise 27.7. Scalar potentials outside a spherical body. Argue that the traceless part of the spatial
energy-momentum tensor of a spherically symmetric distribution must take the form

\[ T_{ab}(\mathbf{r}) = \left( \hat r_a \hat r_b - \tfrac{1}{3} \delta_{ab} \right) \left( p(r) - p_\perp(r) \right) , \tag{27.107} \]

where $p(r)$ and $p_\perp(r)$ are the radial and transverse pressures at radius $r$. From equation (27.105), show that
$\Psi - \Phi$ at radial distance $x$ from the centre of a spherically symmetric distribution is

\[ \Psi(x) - \Phi(x) = - 4\pi \int_x^\infty ( r^2 - x^2 ) \left( p(r) - p_\perp(r) \right) \frac{dr}{r} . \tag{27.108} \]

Notice that the integral is over $r > x$, that is, only energy-momentum outside radius $x$ produces non-vanishing
$\Psi - \Phi$. Show that if the only source of energy-momentum outside the body is an electric charge $Q$, for which
$-p = p_\perp = Q^2 / r^4$, then

\[ \Psi(x) - \Phi(x) = \frac{2\pi Q^2}{x^2} . \tag{27.109} \]
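The closed form (27.109) can be cross-checked by numerical quadrature. The sketch below takes equation (27.108) to read $\Psi - \Phi = -4\pi \int_x^\infty ( r^2 - x^2 ) ( p - p_\perp ) \, dr / r$; the helper `psi_minus_phi`, the midpoint rule, and the substitution $r = x/s$ are choices made for this illustration, and the values of `Q` and `x` are arbitrary.

```python
import math

def psi_minus_phi(x, delta_p, n=20000):
    """Midpoint-rule estimate of equation (27.108),
    Psi - Phi = -4 pi * integral_x^inf (r^2 - x^2) delta_p(r) dr / r,
    where delta_p(r) = p(r) - p_perp(r).  The substitution r = x / s maps
    the infinite range onto 0 < s <= 1.
    """
    total = 0.0
    for i in range(n):
        s = (i + 0.5) / n
        r = x / s
        dr_ds = x / (s * s)          # |dr/ds|
        total += (r * r - x * x) * delta_p(r) / r * dr_ds / n
    return -4.0 * math.pi * total

Q = 0.7   # hypothetical charge
x = 3.0   # hypothetical radius outside the body

# -p = p_perp = Q^2 / r^4, so p - p_perp = -2 Q^2 / r^4
numeric = psi_minus_phi(x, lambda r: -2.0 * Q * Q / r**4)
closed_form = 2.0 * math.pi * Q * Q / (x * x)   # equation (27.109)
print(numeric, closed_form)
```

Passing a different `delta_p` gives a quick way to explore other anisotropic pressure profiles outside the body.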


27.13 Gravitational waves 


The tensor perturbations $h_{ab}$ describe propagating gravitational waves. The two independent components of
the tensor perturbations describe two polarizations. The two components are commonly designated $h_+$ and
$h_\times$, equations (27.19). Gravitational waves induce a quadrupole tidal oscillation transverse to the direction
of propagation, and the subscripts $+$ and $\times$ represent the shape of the quadrupole oscillation, as illustrated
by Figure 27.1. The $h_+$ polarization varies as $\cos 2\chi$ under a right-handed rotation by angle $\chi$ about the
direction of propagation (the $z$-direction), while the $h_\times$ polarization varies as $- \sin 2\chi$.

Figure 27.1 The two polarizations of gravitational waves. The (top) polarization $h_+$ varies as $\cos 2\chi$ under a
right-handed rotation by angle $\chi$ about the direction of propagation (into the paper), while the (bottom)
polarization $h_\times$ varies as $- \sin 2\chi$. A gravitational wave causes a system of freely falling test masses to oscillate
relative to a grid of points a fixed proper distance apart.

Einstein’s equations applied to the tensor component of the spatial Einstein tensor (27.16c) imply that 
gravitational waves are sourced by the tensor component of the energy-momentum 


$$ \Box h_{ab} = 8\pi\, T_{ab} \Bigr|_{\rm tensor} . \tag{27.110} $$


The solution of the wave equation (27.110) can be obtained from the Green’s function of the d’Alembertian
wave operator $\Box$ defined by equation (27.17). The Green’s function is by definition the solution of the wave
equation with a delta-function source. There are retarded solutions, which propagate into the future along 
the future light cone, and advanced solutions, which propagate into the past along the past light cone. 
In the present case, the solutions of interest are the retarded solutions, since these represent gravitational 
waves emitted by a source. Because of the time and space translation symmetry of the d’Alembertian in flat 
(Minkowski) space, the delta-function source of the Green’s function can without loss of generality be taken 
at the origin $t = \boldsymbol{x} = 0$. Thus the Green’s function $F$ is the solution of

$$ \Box F = \delta_D^4(x) , \tag{27.111} $$

where $\delta_D^4(x) \equiv \delta_D(t)\, \delta_D^3(\boldsymbol{x})$ is the 4-dimensional Dirac delta-function. The solution of equation (27.111) subject
to retarded boundary conditions is (a standard exercise in mathematics) the retarded Green’s function

$$ F = - \frac{\delta_D(t - |\boldsymbol{x}|)\, \Theta(t)}{4\pi |\boldsymbol{x}|} , \tag{27.112} $$


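where $\Theta(t)$ is the Heaviside function, $\Theta(t) = 0$ for $t < 0$ and $\Theta(t) = 1$ for $t > 0$. The solution of the
sourced gravitational wave equation (27.110) is thus

$$ h_{ab}(t, \boldsymbol{x}) = - 2 \int \frac{T_{ab}(t', \boldsymbol{x}')\, d^3x'}{|\boldsymbol{x}' - \boldsymbol{x}|} , \tag{27.113} $$

where $t'$ is the retarded time

$$ t' \equiv t - |\boldsymbol{x}' - \boldsymbol{x}| , \tag{27.114} $$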


which lies on the past light cone of the observer, and is the time at which the source emitted the signal. The 
solution (27.113) resembles the solution of Poisson’s equation, except that the source is evaluated along the 
past light cone of the observer. 

As in §§27.10 and 27.11, consider a finite body, whose energy-momentum is confined within a certain 
region, and which is a source of gravitational waves. The Hulse-Taylor binary pulsar, Exercise 27.9, is a fine 
example. Far from the body, the leading order contribution to the tensor potential $h_{ab}$ is, from the multipole
expansion (27.79),

$$ h_{ab}(t, \boldsymbol{x}) = - \frac{2}{x} \int T_{ab}(t', \boldsymbol{x}')\, d^3x' \biggr|_{\rm tensor} . \tag{27.115} $$

The integral (27.115) is hard to solve in general, but there is a simple solution for gravitational waves 
whose wavelengths are large compared to the size of the body. To obtain this solution, first consider that 
conservation of energy-momentum implies that 


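$$ \frac{\partial^2 T^{00}}{\partial t^2} - \nabla_a \nabla_b T^{ab} = \frac{\partial}{\partial t} \left( \frac{\partial T^{00}}{\partial t} + \nabla_a T^{0a} \right) - \nabla_b \left( \frac{\partial T^{0b}}{\partial t} + \nabla_a T^{ab} \right) = 0 . \tag{27.116} $$

Multiply by $x^a x^b$ and integrate

$$ \int x^a x^b\, \frac{\partial^2 T^{00}}{\partial t^2}\, d^3x = \int x^a x^b\, \nabla_c \nabla_d T^{cd}\, d^3x = 2 \int T^{ab}\, d^3x , \tag{27.117} $$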


where the third expression follows from the second by a double integration by parts. For wavelengths that 
are long compared to the size of the body, the first expression of equations (27.117) is 


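$$ \int x^a x^b\, \frac{\partial^2 T^{00}}{\partial t^2}\, d^3x \approx \frac{\partial^2}{\partial t^2} \int x^a x^b\, T^{00}\, d^3x = \frac{\partial^2 I_{ab}}{\partial t^2} , \tag{27.118} $$

where $I_{ab}$ is the second moment of the mass

$$ I_{ab} \equiv \int x^a x^b\, T^{00}\, d^3x . \tag{27.119} $$
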
The tensor (spin 2) part of the energy-momentum is trace-free. The trace-free part $Ɨ_{ab}$ of the second moment
$I_{ab}$ is the quadrupole moment of the mass distribution (this definition is conventional, but differs by a factor
of 2/3 from what is called the quadrupole moment in spherical harmonics)

$$ Ɨ_{ab} \equiv I_{ab} - \tfrac{1}{3} \delta_{ab}\, I^c_c = \int \left( x^a x^b - \tfrac{1}{3} \delta_{ab}\, x^2 \right) T^{00}\, d^3x . \tag{27.120} $$


Substituting the last expression of equations (27.117) into equation (27.115) gives the quadrupole formula 
for gravitational radiation at wavelengths long compared to the size of the emitting body 


$$ h_{ab}(t, \boldsymbol{x}) = - \frac{1}{x}\, \ddot{Ɨ}_{ab}(t - x) \biggr|_{\rm tensor} . \tag{27.121} $$


Equation (27.121) is valid for long wavelength modes observed at distances $x$ far from the source of
gravitational radiation. The right hand side is evaluated at retarded time $t - x$: the observer is looking at the
source as it used to be at time $t - x$.

If the gravitational wave is moving in the $z$-direction, then the tensor components of the quadrupole
moment $Ɨ_{ab}$ are

$$ Ɨ_+ = \tfrac{1}{2} \left( I_{xx} - I_{yy} \right) , \quad Ɨ_\times = \tfrac{1}{2} \left( I_{xy} + I_{yx} \right) . \tag{27.122} $$
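
As a concrete illustration of equations (27.119), (27.120), and (27.122), the following minimal Python sketch evaluates the second moment and its trace-free part for a set of point masses, for which the integral over $T^{00}$ reduces to a sum over masses. The function names and the point-mass idealization are assumptions made here for illustration; they are not part of the text.

```python
def second_moment(masses, positions):
    """I_ab = sum_X M_X x^a x^b, the point-mass form of equation (27.119)."""
    moment = [[0.0] * 3 for _ in range(3)]
    for mass, pos in zip(masses, positions):
        for a in range(3):
            for b in range(3):
                moment[a][b] += mass * pos[a] * pos[b]
    return moment

def trace_free(moment):
    """Quadrupole moment: subtract one third of the trace, equation (27.120)."""
    trace = sum(moment[a][a] for a in range(3))
    return [[moment[a][b] - (trace / 3.0 if a == b else 0.0) for b in range(3)]
            for a in range(3)]

def plus_cross(moment):
    """Tensor components for a wave moving in the z-direction, equation (27.122)."""
    plus = 0.5 * (moment[0][0] - moment[1][1])
    cross = 0.5 * (moment[0][1] + moment[1][0])
    return plus, cross

# Hypothetical example: two unit masses on the x-axis at x = +1 and x = -1.
I = second_moment([1.0, 1.0], [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
Q = trace_free(I)
print(plus_cross(I), plus_cross(Q))
```

The $+$ and $\times$ components come out the same whether they are computed from $I_{ab}$ or from its trace-free part, because the trace cancels in the difference $I_{xx} - I_{yy}$ and does not enter the off-diagonal components.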


Concept question 27.8. Units of the gravitational quadrupole radiation formula. Restore units 
to the quadrupole formula (27.121) for gravitational radiation. Answer: 
$$ h_{ab}(t, \boldsymbol{x}) = - \frac{G}{c^4 x}\, \ddot{Ɨ}_{ab}\!\left( t - \frac{x}{c} \right) . \tag{27.123} $$

The gravitational wave equation (27.27) in empty space appears to describe gravitational waves propagating
in a region where the energy-momentum tensor $T_{mn}$ is zero. However, gravitational waves do carry
energy-momentum, just as do other kinds of waves, such as electromagnetic waves. The energy-momentum
is quadratic in the tensor perturbation $h_{ab}$, and so vanishes to linear order.

To determine the energy-momentum in gravitational waves, calculate the Einstein tensor $G_{mn}$ to second
order, imposing the vacuum conditions that the unperturbed and linear parts of the Einstein tensor vanish

$$ \overset{0}{G}_{mn} = \overset{1}{G}_{mn} = 0 . \tag{27.124} $$


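The parts of the second-order perturbation that depend on the tensor perturbation $h_{ab}$ are, in a frame where
the wavevector $\boldsymbol{k}$ is along the $z$-axis,

$$ \overset{2}{G}_{00} = - \dot{h}_{ab}\, \dot{h}^{ab} + \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 , \tag{27.125a} $$
$$ \overset{2}{G}_{0z} = - \dot{h}_{ab}\, \nabla_z h^{ab} + \frac{1}{2} \frac{\partial}{\partial t} \nabla_z h^2 , \tag{27.125b} $$
$$ \overset{2}{G}_{zz} = - (\nabla_z h_{ab})(\nabla_z h^{ab}) + \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 , \tag{27.125c} $$

where

$$ h^2 \equiv h_{ab} h^{ab} = 2 \left( h_+^2 + h_\times^2 \right) = 2 h_{++} h_{--} . \tag{27.126} $$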


Since the Einstein tensor vanishes to linear order, equations (27.124), the Lie derivative of the linear order
Einstein tensor is zero, and consequently the quadratic order expressions (27.125) are coordinate
gauge-invariant. They are also tetrad gauge-invariant since they depend only on the (coordinate and) tetrad
gauge-invariant perturbation $h_{ab}$. The rightmost set of terms on the right hand side of each of equations (27.125) are
total derivatives (with respect to either time $t$ or space $z$). These terms yield surface terms when integrated
over a region, and tend to average to zero when integrated over a region much larger than a wavelength. On
the other hand, the leftmost set of terms on the right hand side of each of equations (27.125) do not average
to zero; for example, the terms for $\overset{2}{G}_{00}$ and $\overset{2}{G}_{zz}$ are negative everywhere, being minus a sum of squares. A
negative energy density? The interpretation is that these terms are to be taken over to the right hand side
of the Einstein equations, and re-interpreted as the energy-momentum $T^{\rm gw}_{mn}$ in gravitational waves


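$$ T^{\rm gw}_{00} = \frac{1}{8\pi} \left[ \dot{h}_{ab}\, \dot{h}^{ab} - \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 \right] , \tag{27.127a} $$
$$ T^{\rm gw}_{0z} = \frac{1}{8\pi} \left[ \dot{h}_{ab}\, \nabla_z h^{ab} - \frac{1}{2} \frac{\partial}{\partial t} \nabla_z h^2 \right] , \tag{27.127b} $$
$$ T^{\rm gw}_{zz} = \frac{1}{8\pi} \left[ (\nabla_z h_{ab})(\nabla_z h^{ab}) - \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 \right] . \tag{27.127c} $$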


The terms involving total derivatives, although they vanish when averaged over a region larger than many
wavelengths, ensure that the energy-momentum $T^{\rm gw}_{mn}$ in gravitational waves satisfies conservation of
energy-momentum in the flat background space

$$ \nabla^m T^{\rm gw}_{mn} = 0 . \tag{27.128} $$

Averaged over a region larger than many wavelengths, the energy-momentum in gravitational waves is

$$ \langle T^{\rm gw}_{mn} \rangle = \frac{1}{8\pi}\, (\nabla_m h_{ab})(\nabla_n h^{ab}) . \tag{27.129} $$

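Equation (27.129) may also be written explicitly as a sum over the two linear or circular polarizations

$$ \langle T^{\rm gw}_{mn} \rangle = \frac{1}{4\pi} \left[ (\nabla_m h_+)(\nabla_n h_+) + (\nabla_m h_\times)(\nabla_n h_\times) \right] = \frac{1}{8\pi} \left[ (\nabla_m h_{++})(\nabla_n h_{--}) + (\nabla_n h_{++})(\nabla_m h_{--}) \right] . \tag{27.130} $$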


Exercise 27.9. Hulse-Taylor binary. 
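1. Quadrupole moment. Consider a pair of masses $M_1$ and $M_2$ in circular orbit, with position vectors $\boldsymbol{r}_1$
and $\boldsymbol{r}_2$ relative to their center of mass. Argue that the quadrupole moment $Ɨ_{ab}$ of the mass distribution
defined by

$$ Ɨ_{ab} \equiv \sum_{{\rm masses}\ X} M_X \left( r_{X,a}\, r_{X,b} - \tfrac{1}{3} \delta_{ab}\, r_X^2 \right) \tag{27.131} $$

is

$$ Ɨ_{ab} = m r^2 \left( \hat{r}_a \hat{r}_b - \tfrac{1}{3} \delta_{ab} \right) , \tag{27.132} $$

where $\boldsymbol{r} \equiv r \hat{\boldsymbol{r}} \equiv \boldsymbol{r}_2 - \boldsymbol{r}_1$ is the orbital separation, and $m$ is the reduced mass

$$ m \equiv \frac{M_1 M_2}{M} , \quad M \equiv M_1 + M_2 . \tag{27.133} $$

[Hint: Assume for simplicity that the orbit is described by classical Newtonian mechanics.]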

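2. Tensor components. Suppose that the orbital plane is inclined at inclination angle $\iota$ to the line-of-sight.
Choose the observer’s locally inertial frame so that the $z$-axis $\hat{\boldsymbol{z}}$ is the line-of-sight direction from
the center of mass of the binary to the observer, and the $x$-axis $\hat{\boldsymbol{x}}$ points in the plane of the orbit. Argue
that the orbital separation $\boldsymbol{r}$ is

$$ \boldsymbol{r} = r \left[ \left( \hat{\boldsymbol{z}} \cos\iota - \hat{\boldsymbol{y}} \sin\iota \right) \cos\omega t + \hat{\boldsymbol{x}} \sin\omega t \right] \tag{27.134} $$

where $\omega$ is the orbital frequency. Deduce that the tensor components of the quadrupole moment are

$$ Ɨ_+ = \tfrac{1}{2} \left( Ɨ_{xx} - Ɨ_{yy} \right) = \tfrac{1}{4} m r^2 \left[ \cos^2\!\iota - \left( 1 + \sin^2\!\iota \right) \cos 2\omega t \right] , \tag{27.135a} $$
$$ Ɨ_\times = Ɨ_{xy} = - \tfrac{1}{2} m r^2 \sin\iota\, \sin 2\omega t . \tag{27.135b} $$

[Hint: Recall the trigonometric formulae $\cos^2\!\phi = \tfrac{1}{2}(1 + \cos 2\phi)$ and $\sin^2\!\phi = \tfrac{1}{2}(1 - \cos 2\phi)$.]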
3. Tensor perturbation. Deduce the tensor perturbations $h_+$ and $h_\times$ at large distance $z$ from the orbiting
masses from the quadrupole formula

$$ h_{ab} = - \frac{1}{z}\, \ddot{Ɨ}_{ab}(t - z) . \tag{27.136} $$

Notice that $t - z$ is the retarded time: an observer at distance $z$ is looking at the orbiting masses as they
used to be at time $t - z$.
4. Energy momentum in gravitational waves. The energy-momentum $T^{\rm gw}_{mn}$ in gravitational waves is
given by the quadrupole formula

$$ 4\pi\, T^{\rm gw}_{mn} = (\nabla_m h_+)(\nabla_n h_+) + (\nabla_m h_\times)(\nabla_n h_\times) , \tag{27.137} $$

where $\nabla_m = \{ \partial/\partial t , \partial/\partial x^a \}$. Show that the non-vanishing components of the gravitational wave
energy-momentum tensor are

$$ T^{\rm gw}_{00} = - T^{\rm gw}_{0z} = T^{\rm gw}_{zz} = \frac{m^2 r^4 \omega^6}{2\pi z^2} \left( 1 + 6 \sin^2\!\iota + \sin^4\!\iota - \cos^4\!\iota\, \cos 4\omega(t - z) \right) . \tag{27.138} $$

[Hint: The quadrupole formula is valid for large $z$, so you need keep only the leading term in powers of $1/z$.]


5. Energy flux in gravitational waves. The energy loss $\dot{E}$ by gravitational waves is given by the integral
of the energy flux over all directions (note that energy flux is $T^{0z}$ with raised indices, and there is a
minus sign from $T^{0z} = - T_{0z}$),

$$ \dot{E} = - \int_{-\pi/2}^{\pi/2} T^{\rm gw}_{0z}\, 2\pi z^2 \cos\iota\, d\iota . \tag{27.139} $$

Show that (with units of $c$ and $G$ restored)

$$ \dot{E} = \frac{32}{5} \frac{G m^2 r^4 \omega^6}{c^5} . \tag{27.140} $$


6. Rate of change of orbital frequency. If the orbit of the binary is described adequately by a Keplerian
orbit, then the orbital energy $E$ is

$$ E = - \frac{G M m}{2 r} \tag{27.141} $$

and the radius $r$ and angular frequency $\omega$ are related by Kepler’s third law

$$ \omega^2 = \frac{G M}{r^3} . \tag{27.142} $$

The orbital period $P$ is related to the angular frequency $\omega$ by

$$ P = \frac{2\pi}{\omega} . \tag{27.143} $$

Conclude that

$$ \frac{\dot{P}}{P} = - \frac{\dot{\omega}}{\omega} = - \frac{3}{2} \frac{(- \dot{E})}{E} = - \frac{96}{5} \frac{(G m)(G M)^{2/3} \omega^{8/3}}{c^5} , \tag{27.144} $$

the minus sign in the third expression coming from the fact that the orbit is losing energy.


7. Hulse-Taylor binary. The so-called binary pulsar PSR B1913+16 discovered by Hulse and Taylor
(1975) consists of two neutron stars, one a pulsar, in orbit. The masses of the pulsar and its companion
are measured from the orbital motion to be (Weisberg and Taylor, 2005)

$$ M_1 = 1.4414\, M_\odot , \quad M_2 = 1.3867\, M_\odot . \tag{27.145} $$

The orbital period is

$$ P = 0.322997448930 \ {\rm day} . \tag{27.146} $$

What is the predicted general relativistic rate of change $\dot{P}$ of the period, in dimensionless units (or
s/s, if you prefer)? [Hint: The heliocentric gravitational constant is $G M_\odot = 1.3271244 \times 10^{20} \ {\rm m}^3\, {\rm s}^{-2}$
according to the IAU 2009 system of astronomical constants at
https://link.springer.com/article/10.1007%2Fs10569-011-9352-4.]
8. Eccentricity correction. Actually PSR B1913+16 has a substantial eccentricity,

$$ e = 0.6171338 . \tag{27.147} $$

The correct general relativistic formula including the effects of eccentricity is equation (27.144) multiplied
by a function $f(e)$ of the eccentricity

$$ \frac{\dot{P}}{P} = - \frac{96}{5} \frac{(G m)(G M)^{2/3} \omega^{8/3}}{c^5}\, f(e) , \tag{27.148} $$

with

$$ f(e) \equiv \left( 1 + \frac{73}{24} e^2 + \frac{37}{96} e^4 \right) \left( 1 - e^2 \right)^{-7/2} . \tag{27.149} $$

Compare the eccentricity-corrected predicted numerical result for $\dot{P}$ with the measured value

$$ \dot{P} = - 2.4184 \times 10^{-12} . \tag{27.150} $$
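
The numerical parts of Exercise 27.9 can be sketched in a few lines of Python. This is a minimal sketch, not a worked solution: it assumes the angular dependence $1 + 6\sin^2\iota + \sin^4\iota$ of the time-averaged flux from part 4, the circular-orbit rate $\dot{P}/P = -\frac{96}{5} (Gm)(GM)^{2/3} \omega^{8/3}/c^5$, and the eccentricity factor $f(e) = (1 + \frac{73}{24}e^2 + \frac{37}{96}e^4)(1 - e^2)^{-7/2}$. The function and constant names are hypothetical, and the value of $c$ is the SI definition rather than a number taken from the text.

```python
import math

GM_SUN = 1.3271244e20   # m^3 s^-2, heliocentric gravitational constant from the hint
C = 2.99792458e8        # m s^-1, speed of light (SI definition; not given in the text)
DAY = 86400.0           # s

def angular_factor(steps=20000):
    """Midpoint-rule integral of (1 + 6 sin^2 i + sin^4 i) cos i over -pi/2 < i < pi/2."""
    width = math.pi / steps
    total = 0.0
    for n in range(steps):
        inc = -math.pi / 2 + (n + 0.5) * width
        s2 = math.sin(inc) ** 2
        total += (1 + 6 * s2 + s2 * s2) * math.cos(inc) * width
    return total

def pdot_circular(m1_sun, m2_sun, period_days):
    """Dimensionless dP/dt for a circular orbit: the rate (27.144) times P."""
    big_m = m1_sun + m2_sun                 # total mass in solar masses
    reduced = m1_sun * m2_sun / big_m       # reduced mass in solar masses
    period = period_days * DAY
    omega = 2 * math.pi / period
    rate = (-(96 / 5) * (reduced * GM_SUN) * (big_m * GM_SUN) ** (2 / 3)
            * omega ** (8 / 3) / C ** 5)
    return rate * period

def eccentricity_factor(e):
    """f(e) of equation (27.149)."""
    return (1 + (73 / 24) * e ** 2 + (37 / 96) * e ** 4) / (1 - e ** 2) ** 3.5

pdot_circ = pdot_circular(1.4414, 1.3867, 0.322997448930)
pdot_ecc = pdot_circ * eccentricity_factor(0.6171338)
print(angular_factor(), pdot_circ, pdot_ecc)
```

The first number printed should be close to 32/5, the coefficient that appears in the energy loss rate. The eccentricity factor raises the circular-orbit estimate by roughly an order of magnitude, and the corrected value should land within about a percent of the measured $\dot{P}$ quoted in equation (27.150).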


Exercise 27.10. Will you be torn apart when two black holes merge? The book “Death from the 
Skies” by Phil Plait (the Bad Astronomer) contains a Chapter “Seven ways a black hole can kill you.” One 
of the ways, says Phil, is to stand near a pair of merging black holes, and be torn apart by the tidal forces 
from the gravitational waves. Is it true? 
1. Tidal forces. For a gravitational wave propagating in the $z$-direction in empty space, the non-zero
components of the Riemann tensor of the perturbed Minkowski space are

$$ R_{0x0x} = - R_{0y0y} = - R_{0xzx} = R_{0yzy} = R_{zxzx} = - R_{zyzy} = \ddot{h}_+ , \tag{27.151a} $$
$$ R_{0x0y} = - R_{0xzy} = - R_{0yzx} = R_{zxzy} = \ddot{h}_\times . \tag{27.151b} $$

From the expression (27.136) for $h_{ab}$ that you derived in Exercise 27.9, and from the equation of geodesic
deviation

$$ \frac{D^2 \delta\xi_m}{D\tau^2} + R_{klmn}\, \delta\xi^k u^l u^n = 0 \tag{27.152} $$

deduce the tidal forces on a person moving non-relativistically. [Hint: If a person is moving
non-relativistically, it is legitimate to take the person’s 4-velocity to be $u^m = \{1, 0, 0, 0\}$. Why?]

2. Comment. What is your advice to Phil Plait? [Hint: What you need here is rough estimates. Consider
both supermassive and stellar-sized black holes. To make things sensible, you should require that you,
the observer, be (a) outside the horizon, and (b) outside the point at which the static tidal force of the
black hole would tear you apart even without gravitational waves. You may find it convenient to define
the mass $M_g$ of a black hole whose tidal force at the horizon is 1 gee per metre


1 


E (27.153) 


g 


which you figured out in Exercise 11.10.] 


Concept Questions 


1. Why do the wavelengths of perturbations in cosmology expand with the Universe, whereas perturbations
in Minkowski space do not expand?

2. What does power spectrum mean?

3. Why is the power spectrum a good way to characterize the amplitude of fluctuations?

4. Why is the power spectrum of fluctuations of the Cosmic Microwave Background (CMB) plotted as a
function of harmonic number?

5. What causes the acoustic peaks in the power spectrum of fluctuations of the CMB?

6. Are there acoustic peaks in the power spectrum of matter (galaxies) today?

7. What sets the scale of the first peak in the power spectrum of the CMB? [What sets the physical scale?
Then what sets the angular scale?]

8. The odd peaks (including the first peak) in the CMB power spectrum are compression peaks, while the
even peaks are rarefaction peaks. Why does a rarefaction produce a peak, not a trough?

9. Why is the first peak the most prominent? Why do higher peaks generally get progressively weaker?

10. The third peak is about as strong as the second peak. Why?

11. The matter power spectrum reaches a maximum at a scale that is slightly larger than the scale of the
first baryonic acoustic peak. Why?

12. The physical density of species $x$ at the time of recombination is proportional to $\Omega_x h^2$ where $\Omega_x$ is the
ratio of the actual to critical density of species $x$ at the present time, and $h \equiv H_0 / (100 \ {\rm km}\, {\rm s}^{-1}\, {\rm Mpc}^{-1})$ is
the present-day Hubble constant. Explain.

13. How does changing the baryon density $\Omega_b h^2$ affect the CMB power spectrum?

14. How does changing the non-baryonic cold dark matter density $\Omega_c h^2$, without changing the baryon
density $\Omega_b h^2$, affect the CMB power spectrum?

15. What effects do neutrinos have on perturbations?

16. How does changing the curvature $\Omega_k$ affect the CMB power spectrum?

17. How does changing the dark energy $\Omega_\Lambda$ affect the CMB power spectrum?
An overview of cosmological perturbations


Undoubtedly the preeminent application of general relativistic perturbation theory is to cosmology. Fluctu- 
ations in the temperature and polarization of the Cosmic Microwave Background (CMB) provide an obser- 
vational window on the Universe at 400,000 years old that, coupled with other astronomical observations, 
has yielded impressively precise measurements of cosmological parameters. 

The theory of cosmological perturbations is based principally on general relativistic perturbation theory 
coupled to the physics of 5 species of energy-momentum: photons, baryons, non-baryonic cold dark matter, 
neutrinos, and dark energy. 

Dark energy was not important at the time of recombination, where the CMB that we see comes from,
but it is important today. If dark energy has a vacuum equation of state, $p = -\rho$, then dark energy does
not cluster (vacuum energy density is a constant), but it affects the evolution of the cosmic scale factor,
and thereby does affect the clustering of baryons and dark matter today. Moreover the evolution of the
gravitational potential along the line of sight to the CMB does affect the observed power spectrum of the
CMB, the so-called integrated Sachs-Wolfe effect.

1. Inflationary initial conditions. The theory of inflation has been remarkably successful in accounting 
for many aspects of observational cosmology, even though a fundamental understanding of the inflaton 
scalar field that supposedly drove inflation is missing. The current paradigm holds that primordial fluc- 
tuations were generated by vacuum quantum fluctuations in the inflaton field at the time of inflation. 
The theory makes the generic predictions that the gravitational potentials generated by vacuum fluctu- 
ations were (a) Gaussian, (b) adiabatic (meaning that all species of mass-energy fluctuated together, 
as opposed to in opposition to each other), and (c) scale-free, or rather almost scale-free (the fact that 
inflation came to an end modifies slightly the scale-free character). The three predictions fit the observed 
power spectrum of the CMB astonishingly well. 


2. Comoving Fourier modes. The spatial homogeneity of the Friedmann-Lemaître-Robertson-Walker
background spacetime means that its perturbations are characterized by Fourier modes of constant
comoving wavevector. Each Fourier mode generated by inflation evolved independently, and its wavelength
expanded with the Universe.


3. Scalar, vector, tensor modes. Spatial isotropy on top of spatial homogeneity means that the
perturbations comprised independently evolving scalar, vector, and tensor modes. Scalar modes dominate the
fluctuations of the CMB, and caused the clustering of matter today. Vector modes are usually assumed
to vanish, because there is no mechanism to generate the rotation that sources vector modes, and the
expansion of the Universe tends to redshift away any vector modes that might have been present. Inflation
generates gravitational waves, which then propagate essentially freely to the present time. Gravitational
waves leave an observational imprint in the “B” (magnetic $(-)^{\ell+1}$ parity) mode of polarization of the
CMB, whereas scalar modes produce only an “E” (electric $(-)^{\ell}$ parity) mode of polarization.


4. Power spectrum. The primary quantity measurable from observations is the power spectrum, which
is the variance of fluctuations of the CMB or of matter (as traced by galaxies, galaxy clusters, the
Lyman alpha forest, peculiar velocities, weak lensing, or 21 centimetre observations at high redshift).
The statistics of a Gaussian field are completely characterized by its mean and variance. The mean
characterizes the unperturbed background, while the variance characterizes the fluctuations. For a
3-dimensional statistically homogeneous and isotropic field, the variance of Fourier modes $\delta_{\boldsymbol{k}}$ defines the
power spectrum $P(k)$,

$$ \langle \delta_{\boldsymbol{k}}\, \delta_{\boldsymbol{k}'} \rangle = 1_{\boldsymbol{k}\boldsymbol{k}'}\, P(k) , \tag{28.1} $$

where $1_{\boldsymbol{k}\boldsymbol{k}'}$ is the unit matrix in the Hilbert space of Fourier modes,

$$ 1_{\boldsymbol{k}\boldsymbol{k}'} \equiv (2\pi)^3\, \delta_D^3(\boldsymbol{k} + \boldsymbol{k}') . \tag{28.2} $$

The “momentum-conserving” Dirac delta-function in equation (28.2) is a consequence of statistical
spatial translation symmetry. Isotropy implies that the power spectrum $P(k)$ is a function only of the
magnitude $k \equiv |\boldsymbol{k}|$ of the wavevector. For a statistically rotation-invariant field projected on the sky,
such as the CMB, the variance of spherical harmonic modes $\Theta_{\ell m} \equiv \delta T_{\ell m} / T$ defines the power spectrum
$C_\ell$,

$$ \langle \Theta_{\ell m}\, \Theta_{\ell' m'} \rangle = 1_{\ell m , \ell' m'}\, C_\ell \tag{28.3} $$

where $1_{\ell m , \ell' m'}$ is the unit matrix in the Hilbert space of spherical harmonics (distinguish the three
usages of $\delta$ in this paragraph: $\delta$ meaning fluctuation, $\delta_D$ meaning Dirac delta-function, and $\delta$ meaning
Kronecker delta, as in the following equation),

$$ 1_{\ell m , \ell' m'} \equiv \delta_{\ell \ell'}\, \delta_{m , -m'} . \tag{28.4} $$

Again, the “angular momentum-preserving” condition (28.4) that $\ell = \ell'$ and $m + m' = 0$ is a consequence
of rotational symmetry. The same rotational symmetry implies that the power spectrum $C_\ell$ is a function
only of the harmonic number $\ell$, not of the directional harmonic number $m$.


5. Reheating. Early Universe inflation evidently came to an end. It is presumed that the vacuum energy
released by the decay of the inflaton field, an event called reheating, somehow efficiently produced the 
matter and radiation fields that we see today. After reheating, the Universe was dominated by relativistic 
fields, collectively called “radiation.” Reheating changed the evolution of the cosmic scale factor from 
acceleration to deceleration, but is presumed not to have generated additional fluctuations. 
6. Photon-baryon fluid and the sound horizon. Photon-electron (Thomson) scattering kept photons
and baryons tightly coupled to each other, so that they behaved like a relativistic fluid. As long as the
radiation density exceeded the baryon density, which remained true up to near the time of recombination,
the speed of sound in the photon-baryon fluid was $\sqrt{p/\rho} \approx \sqrt{1/3}$ of the speed of light, §30.6.
Fluctuations with wavelengths outside the sound horizon grew by gravity. As time went by, the sound
horizon expanded in comoving radius, and fluctuations thereby came inside the sound horizon. Once
inside the sound horizon, sound waves could propagate, which tended to decrease the gravitational
potential. However, each individual sound wave itself continued to oscillate, its oscillation amplitude $\delta T / T$
relative to the background temperature $T$ remaining approximately constant, Fig. 30.7, at least well
before recombination, when damping is unimportant (point 11 below). The suppression of the potential
at small scales is responsible for the turnover in the observed power spectrum of matter fluctuations
today from large to small scales, Fig. 30.15.


7. Acoustic peaks in the power spectrum. The oscillations of the photon-baryon fluid produced the
characteristic pattern of peaks and troughs in the CMB power spectrum observed today. The same
peaks and troughs occur in the matter power spectrum, but are much less prominent, at a level of about
10% as opposed to the order unity oscillations observed in the CMB power spectrum. For adiabatic
fluctuations, the amplitude of the temperature fluctuations follows a pattern $\sim -\cos(k \eta_s)$ where $\eta_s$
is the comoving sound horizon, Fig. 30.7. The $n$’th peak occurs at a wavenumber $k$ where $k \eta_s \approx n \pi$.
In the observed CMB power spectrum, the relevant value of the sound horizon $\eta_s$ is its value $\eta_{s,{\rm rec}}$ at
recombination. Thus the wavenumber $k$ of the first peak of the observed CMB power spectrum occurs
where $k \eta_{s,{\rm rec}} \approx \pi$. Two competing forces cause a mode to evolve: a gravitational force that amplifies
compression, and a restoring pressure force that counteracts compression, §32.10. When a mode enters
the sound horizon for the first time, the compressing gravitational force beats the restoring pressure
force, so the first thing that happens is that the mode compresses further. Consequently the first peak
is a compression peak. This sets the subsequent pattern: odd peaks are compression peaks, while even
peaks are rarefaction peaks. The observed temperature fluctuations of the CMB are produced by a
combination of intrinsic temperature fluctuations, Doppler shifts, and gravitational redshifting out of
potential wells. The Doppler shift produced by the velocity of a perturbation is 90° out of phase with the
temperature fluctuation, and so tends to fill in the troughs in the power spectrum of the temperature
fluctuation. This is the main reason that the observed CMB power spectrum remains above zero at all
scales.


8. Logarithmic growth of matter fluctuations. Non-baryonic cold dark matter interacts weakly except
by gravity, and is needed to explain the observed clustering of matter in the Universe today in spite of the
small amplitude of temperature fluctuations in the CMB. The adjective “cold” refers to the requirement
that the dark matter became non-relativistic ($p = 0$) at some early time. If the dark matter is both
non-baryonic and cold, then it did not participate in the oscillations of the photon-baryon fluid. During
the radiation-dominated phase prior to matter-radiation equality, dark matter fluctuations inside
the sound horizon grow logarithmically, Fig. 30.10. The logarithmic growth translates into a logarithmic
increase in the amplitude of matter fluctuations at small scales, and is a characteristic signature of
non-baryonic cold dark matter. Unfortunately this signature is not readily discernible in the power spectrum
of matter today, because of nonlinear clustering.


9. Epoch of matter-radiation equality. The density of non-relativistic matter decreases more slowly
than the density of relativistic radiation. There came a point where the matter density equaled the
radiation density, an epoch called matter-radiation equality, after which the matter density exceeded
the radiation density. The observed ratio of the density of matter and radiation (CMB) today requires
that matter-radiation equality occurred at a redshift of $z_{\rm eq} \approx 3400$, a factor of 3 higher in redshift
than recombination at $z_{\rm rec} \approx 1100$. After matter-radiation equality, dark matter perturbations grew
more rapidly, linearly instead of just logarithmically with cosmic scale factor. A larger dark matter
density causes matter-radiation equality to occur earlier. The sound horizon at matter-radiation equality
corresponds to a scale roughly around the 2.5’th peak in the CMB power spectrum. For adiabatic
fluctuations, the way that the temperature and gravitational perturbations interact when a mode first
enters the sound horizon means that the temperature oscillation is 5 times larger for modes that enter the
horizon well into the radiation-dominated epoch versus well into the matter-dominated epoch, Fig. 32.3.
The effect enhances the amplitude of observed CMB peaks higher than 2.5 relative to those lower
than 2.5. The observed relative strength of the 3rd versus the 2nd peak of the CMB power spectrum
provides a measurement of the redshift of matter-radiation equality, and direct evidence for the presence
of non-baryonic cold dark matter.


10. Sound speed. The density of baryons decreased more slowly than the density of radiation, so that at
around recombination the baryon density was becoming comparable to the radiation density. The sound
speed $\sqrt{p/\rho}$ depends on the ratio of pressure $p$, which was essentially entirely that of the photons, to the
density $\rho$, which was produced by both photons and baryons. The sound speed consequently decreased
below $\sqrt{1/3}$, §32.4. Increasing the baryon-to-photon ratio at recombination has several observational
effects on the acoustic peaks of the CMB power spectrum, making it a prime measurable parameter from
the CMB. First, an increased baryon fraction increases the gravitational forcing (baryon loading), which
enhances the compression (odd) peaks while reducing the rarefaction (even) peaks. Second, increasing
the baryon fraction reduces the sound speed, which: (a) decreases the amplitude of the radiation velocity
relative to the radiation density, so increasing the prominence of the peaks; and (b) reduces the oscillation
frequency of the photon-baryon fluid, which shifts the peaks to larger scales. The reduced sound speed
also causes an adiabatic reduction of the amplitudes of all modes by the square root of the sound speed,
but this effect is degenerate with an overall reduction in the initial amplitudes of modes produced by
inflation.


11. Electron-photon scattering. Prior to recombination, photons are coupled to the baryonic plasma mainly
by nonrelativistic electron-photon (Thomson) scattering. The finite mean free path to scattering damps
oscillations of the photon-baryon fluid. As recombination approaches, the mean free path grows longer, and
the damping becomes greater, Fig. 32.3. Damping by Thomson scattering is responsible for the decline in
the CMB power spectrum at smaller scales.
12. Recombination. As the temperature cooled below about 3,000 K, electrons combined with hydrogen
and helium nuclei into neutral atoms, Fig. 31.4. This drastically reduced the amount of photon-electron
scattering, releasing the CMB to propagate almost freely. At the same time, the baryons were released
from the photons. Without radiation pressure to support them, fluctuations in the baryons began to
grow like the dark matter fluctuations.


13. Neutrinos. Probably all three species of neutrino have mass less than 0.2 eV and were therefore
relativistic up to and at the time of recombination, equation (10.111). Each of the 3 species of neutrino
had an abundance comparable to that of photons, and therefore made an important contribution to the
relativistic background and its fluctuations. Unlike photons, neutrinos streamed freely, without
scattering, Fig. 33.2. The relativistic free-streaming of neutrinos provided the main source of the quadrupole
pressure that produces a non-vanishing difference $\Psi - \Phi$ between the scalar potentials, Fig. 33.4.
However, the neutrino quadrupole pressure was still only ~ 10% of the neutrino monopole pressure. To the
extent that the neutrino quadrupole pressure can be approximated as negligible, the neutrinos and their
fluctuations can be treated the same as photons.


14. CMB fluctuations. The CMB fluctuations seen on the sky today represent a projection of fluctuations
on a thin but finite shell at a redshift of about 1100, Fig. 34.1, corresponding to an age of the Universe of
about 400,000 yr. The temperature, and the degrees of polarization in two different directions, provide
3 independent observables at each point on the sky. The isotropy of the unperturbed radiation means
that it is most natural to measure the fluctuations in spherical harmonics, which are the eigenmodes of
the rotation operator. Similarly, it is natural to measure the CMB polarization in spin harmonics.


15. Matter fluctuations. After recombination, perturbations in the non-baryonic and baryonic matter
grew by gravity, essentially unaffected any longer by photon pressure, Fig. 32.3. If one or more of the 
neutrino types had a mass small enough to be relativistic but large enough to contribute appreciable 
density, then its relativistic streaming could have suppressed power in matter fluctuations at small scales, 
but observations show no evidence of such suppression, which places an upper limit of about an eV on 
the mass of the most massive neutrino. The matter power spectrum measured from the clustering of 
galaxies contains acoustic oscillations like the CMB power spectrum, but because the non-baryonic dark 
matter dominates the baryons, the oscillations are much smaller. 


Integrated Sachs-Wolfe effect. Variations in the gravitational potential along the line of sight to 
the CMB affect the CMB power spectrum at large scales. This is called the integrated Sachs-Wolfe 
(ISW) effect, §34.2.2. If matter dominates the background, then the gravitational potential $\Phi$ has the
property that it remains constant in time for linear fluctuations, and there is no ISW effect. In practice, 
ISW effects are produced by at least three distinct causes. First, an early-time ISW effect is produced 
by the fact that the Universe at recombination still has an appreciable component of radiation, and is 
not yet wholly matter-dominated. Second, a late-time ISW effect is produced either by curvature or 
by a cosmological constant. Third, a non-linear ISW effect is produced by non-linear evolution of the
potential.

Cosmological perturbations in a flat FLRW background


For simplicity, this book considers only a flat (not closed or open) Friedmann-Lemaître-Robertson-Walker
(FLRW) background. The comoving Hubble distance at recombination was much smaller than today, and
consequently the cosmological density $\Omega$ was much closer to 1 at recombination than it is today. Since
observations indicate that the Universe today is within 1% of being spatially flat (Aghanim et al., 2018), it 
is an excellent approximation to treat the Universe at the time of recombination as being spatially flat. 

With some modifications arising from cosmological expansion, perturbation theory on a flat FLRW back- 
ground is quite similar to perturbation theory in flat (Minkowski) space, Chapter 27. 

The strategy is to start in a completely general gauge, and to discover how the conformal Newtonian 
(Copernican) gauge, which is used in subsequent Chapters, emerges naturally as that gauge in which the 
perturbations are precisely the physical perturbations. 


29.1 Unperturbed line-element 


It is convenient to choose the coordinate system $x^\mu \equiv \{x^0, x^1, x^2, x^3\} \equiv \{\eta, x, y, z\}$ to consist of conformal
time $\eta$ together with comoving Cartesian coordinates $\boldsymbol{x} \equiv x^\alpha \equiv \{x, y, z\}$. The coordinate metric of the
unperturbed background flat FLRW geometry is then

$$ ds^2 = a(\eta)^2 \left( - d\eta^2 + dx^2 + dy^2 + dz^2 \right) , \tag{29.1} $$

where $a(\eta)$ is the cosmic scale factor. The unperturbed coordinate metric is thus the conformal Minkowski
metric

$$ \overset{0}{g}_{\mu\nu} = a^2 \, \eta_{\mu\nu} . \tag{29.2} $$


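The tetrad is taken to be orthonormal, with the unperturbed tetrad axes $\boldsymbol{\gamma}_m \equiv \{\boldsymbol{\gamma}_0, \boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \boldsymbol{\gamma}_3\}$ being aligned
with the unperturbed coordinate axes $\overset{0}{\boldsymbol{e}}_\mu \equiv \{\overset{0}{\boldsymbol{e}}_0, \overset{0}{\boldsymbol{e}}_1, \overset{0}{\boldsymbol{e}}_2, \overset{0}{\boldsymbol{e}}_3\}$ so that the unperturbed vierbein and inverse
vierbein are respectively $a$ and $1/a$ times the unit matrix,

$$ \overset{0}{e}{}^m{}_\mu = a \, \delta^m_\mu , \quad \overset{0}{e}{}_m{}^\mu = \frac{1}{a} \, \delta_m^\mu . \tag{29.3} $$

Let $\nabla_a$ denote spatial derivatives with respect to comoving spatial coordinates,

$$ \nabla_a \equiv \delta_a^\alpha \frac{\partial}{\partial x^\alpha} = \frac{\partial}{\partial x^a} , \tag{29.4} $$

which should be distinguished from the directed derivatives $\partial_a \equiv e_a{}^\mu \, \partial / \partial x^\mu \approx (1/a) \, \partial / \partial x^a$. Because the
background FLRW geometry is spatially homogeneous, comoving spatial gradients $\nabla_a$ are of first order,
and can be treated as spatial vectors whose tetrad-frame components can be raised and lowered with the
Euclidean metric. Further, let overdot denote partial differentiation with respect to conformal time $\eta$,

$$ \text{overdot} \equiv \frac{\partial}{\partial \eta} , \tag{29.5} $$

so that for example $\dot{a} \equiv da/d\eta$. The Hubble parameter $H$ in the unperturbed background is

$$ H = \frac{\dot{a}}{a^2} . \tag{29.6} $$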
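As a quick numerical sanity check of equation (29.6), the following sketch compares $\dot{a}/a^2$ with the cosmic-time form $(1/a) \, da/dt$, using $dt = a \, d\eta$. The power law $a(\eta) = \eta^2$ (a matter-dominated background with arbitrary normalization) and all function names are assumptions made here for illustration only.

```python
# Illustrative check that H = (da/deta) / a^2 agrees with H = (da/dt) / a.
# Assumption (not from the text): a(eta) = eta^2, so that cosmic time
# t = integral of a d(eta) = eta^3 / 3.

def a_of_eta(eta):
    return eta ** 2

def t_of_eta(eta):
    return eta ** 3 / 3.0

def hubble_conformal(eta, step=1e-6):
    """H = (da/deta) / a^2, with da/deta from a central difference."""
    adot = (a_of_eta(eta + step) - a_of_eta(eta - step)) / (2.0 * step)
    return adot / a_of_eta(eta) ** 2

def hubble_cosmic(eta, step=1e-6):
    """H = (da/dt) / a, with da/dt formed as a ratio of small differences."""
    da = a_of_eta(eta + step) - a_of_eta(eta - step)
    dt = t_of_eta(eta + step) - t_of_eta(eta - step)
    return (da / dt) / a_of_eta(eta)

if __name__ == "__main__":
    for eta in (0.5, 1.0, 2.0):
        print(eta, hubble_conformal(eta), hubble_cosmic(eta))
```

For this particular background both expressions reduce to $2/\eta^3$, so the two printed columns should agree to within the finite-difference error.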


29.2 Comoving Fourier modes 


Since the unperturbed Friedmann-Lemaître-Robertson-Walker spacetime is spatially homogeneous and isotropic,
it is natural to work in comoving Fourier modes. Comoving Fourier modes have the key property that
they evolve independently of each other, as long as perturbations remain linear. Equations in Fourier space
are obtained by replacing the comoving spatial gradient $\nabla_a$ by $-i$ times the comoving wavevector $k_a$ (the
choice of sign is the standard convention in cosmology),

$$ \nabla_a \rightarrow - i k_a . \tag{29.7} $$

By this means, the spatial derivatives become algebraic, so that the partial differential equations governing
the evolution of perturbations become ordinary differential equations.

In what follows, the comoving spatial gradient $\nabla_a$ will be used interchangeably with $-i k_a$, whichever is
most convenient.
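The replacement (29.7) can be illustrated numerically in one dimension. The sketch below assumes that a single comoving mode has spatial dependence $e^{-i k x}$, which is the mode shape implied by the sign convention in (29.7); the function names are hypothetical.

```python
import cmath

def plane_wave(k, x):
    """Single comoving Fourier mode exp(-i k x) in one dimension (assumed form)."""
    return cmath.exp(-1j * k * x)

def gradient(f, x, step=1e-5):
    """Central-difference approximation to df/dx."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

def gradient_over_mode(k, x):
    """Ratio (d/dx mode) / mode, which should equal -i k at every x."""
    mode = lambda y: plane_wave(k, y)
    return gradient(mode, x) / plane_wave(k, x)

if __name__ == "__main__":
    for x in (0.0, 0.3, 1.7):
        print(x, gradient_over_mode(2.0, x))
```

The ratio is independent of position, which is the sense in which the spatial derivative becomes algebraic when acting on a Fourier mode.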


29.3 Classification of vierbein perturbations 


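The definition (26.1) of the vierbein perturbations $\varphi_{mn}$ implies that the perturbed inverse vierbein in the
perturbed FLRW spacetime is

$$ e_m{}^\mu = \frac{1}{a} \left( \delta_m^n + \varphi_m{}^n \right) \delta_n^\mu = \left( \eta_{mn} + \varphi_{mn} \right) \overset{0}{e}{}^{n\mu} , \tag{29.8} $$

while the perturbed vierbein is

$$ e^m{}_\mu = a \left( \delta^m_n - \varphi_n{}^m \right) \delta^n_\mu = \left( \eta^{mn} - \varphi^{nm} \right) \overset{0}{e}_{n\mu} . \tag{29.9} $$

The covariant tetrad-frame components $\varphi_{mn}$ of the vierbein perturbation of the FLRW geometry decompose
in much the same way as in the flat Minkowski case into 6 scalars, 4 vectors, and 1 tensor, a total of
$6 + 4 \times 2 + 1 \times 2 = 16$ degrees of freedom (the following equations are essentially the same as those (27.6)
for the flat Minkowski background),

$$ \varphi_{00} = \underbrace{\psi}_{\rm scalar} , \tag{29.10a} $$

$$ \varphi_{0a} = \underbrace{\nabla_a w}_{\rm scalar} + \underbrace{w_a}_{\rm vector} , \tag{29.10b} $$

$$ \varphi_{a0} = \underbrace{\nabla_a \tilde{w}}_{\rm scalar} + \underbrace{\tilde{w}_a}_{\rm vector} , \tag{29.10c} $$

$$ \varphi_{ab} = \underbrace{\delta_{ab} \, \phi}_{\rm scalar} + \underbrace{\nabla_a \nabla_b h}_{\rm scalar} + \underbrace{\varepsilon_{abc} \nabla_c \tilde{h}}_{\rm scalar} + \underbrace{\nabla_a h_b}_{\rm vector} + \underbrace{\nabla_b \tilde{h}_a}_{\rm vector} + \underbrace{h_{ab}}_{\rm tensor} . \tag{29.10d} $$

The 4 covariant tetrad-frame components $\epsilon_m$ of the coordinate shift of the coordinate gauge transformation
(26.9) similarly decompose into 2 scalars and 1 vector (2 degrees of freedom) (the following equation is
essentially the same as that (27.8) for the flat Minkowski background),

$$ \epsilon_m = \{ \underbrace{\epsilon_0}_{\rm scalar} , \ \underbrace{\nabla_a \epsilon}_{\rm scalar} + \underbrace{\epsilon_a}_{\rm vector} \} . \tag{29.11} $$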


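The vierbein perturbations $\varphi_{mn}$ transform under a coordinate gauge transformation (26.9) as, equation (26.20),

$$ \varphi_{mn} \rightarrow \varphi_{mn} + \frac{1}{a} \left[ \delta_m^\mu \frac{\partial \epsilon_n}{\partial x^\mu} - \frac{\dot{a}}{a} \left( \delta_m^0 \, \epsilon_n + \eta_{mn} \, \epsilon_0 \right) \right] , \tag{29.12} $$

with vanishing contribution from the unperturbed tetrad-frame connection, equation (29.23), since the latter
is symmetric whereas equation (26.20) depends on an antisymmetric combination of connections. The
individual components of the vierbein perturbations transform under a coordinate gauge transformation as

$$ \varphi_{00} \rightarrow \underbrace{\psi + \frac{1}{a} \frac{\partial \epsilon_0}{\partial \eta}}_{\rm scalar} , \tag{29.13a} $$

$$ \varphi_{0a} \rightarrow \underbrace{\nabla_a \left[ w + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon \right]}_{\rm scalar} + \underbrace{w_a + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon_a}_{\rm vector} , \tag{29.13b} $$

$$ \varphi_{a0} \rightarrow \underbrace{\nabla_a \left( \tilde{w} + \frac{1}{a} \epsilon_0 \right)}_{\rm scalar} + \underbrace{\tilde{w}_a}_{\rm vector} , \tag{29.13c} $$

$$ \varphi_{ab} \rightarrow \underbrace{\delta_{ab} \left( \phi - \frac{\dot{a}}{a^2} \epsilon_0 \right)}_{\rm scalar} + \underbrace{\nabla_a \nabla_b \left( h + \frac{1}{a} \epsilon \right)}_{\rm scalar} + \underbrace{\varepsilon_{abc} \nabla_c \tilde{h}}_{\rm scalar} + \underbrace{\nabla_a \left( h_b + \frac{1}{a} \epsilon_b \right)}_{\rm vector} + \underbrace{\nabla_b \tilde{h}_a}_{\rm vector} + \underbrace{h_{ab}}_{\rm tensor} , \tag{29.13d} $$

or equivalently

$$ \psi \rightarrow \psi + \frac{1}{a} \frac{\partial \epsilon_0}{\partial \eta} , \tag{29.14a} $$

$$ w \rightarrow w + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon , \quad w_a \rightarrow w_a + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon_a , \tag{29.14b} $$

$$ \tilde{w} \rightarrow \tilde{w} + \frac{1}{a} \epsilon_0 , \quad \tilde{w}_a \rightarrow \tilde{w}_a , \tag{29.14c} $$

$$ \phi \rightarrow \phi - \frac{\dot{a}}{a^2} \epsilon_0 , \quad h \rightarrow h + \frac{1}{a} \epsilon , \quad \tilde{h} \rightarrow \tilde{h} , \quad h_a \rightarrow h_a + \frac{1}{a} \epsilon_a , \quad \tilde{h}_a \rightarrow \tilde{h}_a , \quad h_{ab} \rightarrow h_{ab} . \tag{29.14d} $$

Eliminating the coordinate shift $\epsilon_m$ from the transformations (29.14) yields 12 coordinate gauge-invariant
combinations of the perturbations,

$$ \underbrace{\psi - \left( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \right) \tilde{w}}_{\rm scalar} , \quad \underbrace{w - \dot{h}}_{\rm scalar} , \quad \underbrace{w_a - \dot{h}_a}_{\rm vector} , \quad \underbrace{\tilde{w}_a}_{\rm vector} , \quad \underbrace{\phi + \frac{\dot{a}}{a} \tilde{w}}_{\rm scalar} , \quad \underbrace{\tilde{h}}_{\rm scalar} , \quad \underbrace{\tilde{h}_a}_{\rm vector} , \quad \underbrace{h_{ab}}_{\rm tensor} . \tag{29.15} $$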


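Six combinations of these coordinate gauge-invariant perturbations depend only on the symmetric part
$\varphi_{mn} + \varphi_{nm}$ of the vierbein perturbations, and are therefore tetrad gauge-invariant as well as coordinate
gauge-invariant. These 6 coordinate and tetrad gauge-invariant perturbations comprise 2 scalars, 1 vector,
and 1 tensor,

$$ \underbrace{\Psi}_{\rm scalar} \equiv \psi - \left( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \right) ( w + \tilde{w} - \dot{h} ) , \tag{29.16a} $$

$$ \underbrace{\Phi}_{\rm scalar} \equiv \phi + \frac{\dot{a}}{a} ( w + \tilde{w} - \dot{h} ) , \tag{29.16b} $$

$$ \underbrace{W_a}_{\rm vector} \equiv w_a + \tilde{w}_a - \dot{h}_a - \dot{\tilde{h}}_a , \tag{29.16c} $$

$$ \underbrace{h_{ab}}_{\rm tensor} . \tag{29.16d} $$

The coordinate and tetrad gauge-invariant perturbations (29.16) reduce to those (27.13) in Minkowski space
when the cosmic scale factor does not change, $\dot{a} = 0$.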
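The coordinate gauge invariance of the scalars $\Psi$ and $\Phi$ can be checked numerically. The sketch below applies the scalar parts of the transformation (29.14) to arbitrary smooth amplitudes and confirms that the combinations (29.16a) and (29.16b) do not change, whereas $\psi$ on its own does. The background $a(\eta) = \eta^2$, the test functions, and every name in the code are arbitrary choices made here for illustration; time derivatives are taken by finite differences, so the residuals vanish only to within truncation error.

```python
import math

STEP = 1e-4  # finite-difference step in conformal time

def ddeta(f):
    """Central-difference derivative with respect to conformal time eta."""
    return lambda eta: (f(eta + STEP) - f(eta - STEP)) / (2.0 * STEP)

def a(eta):
    """Hypothetical background scale factor (any smooth a would do)."""
    return eta ** 2

adot = ddeta(a)

def gauge_invariants(psi, phi, w, wt, h):
    """Psi and Phi of equations (29.16a) and (29.16b) as functions of eta."""
    hdot = ddeta(h)
    s = lambda eta: w(eta) + wt(eta) - hdot(eta)   # w + w-tilde - hdot
    sdot = ddeta(s)
    Psi = lambda eta: psi(eta) - sdot(eta) - adot(eta) / a(eta) * s(eta)
    Phi = lambda eta: phi(eta) + adot(eta) / a(eta) * s(eta)
    return Psi, Phi

def gauge_transform(psi, phi, w, wt, h, eps0, eps):
    """Scalar parts of the coordinate gauge transformation (29.14)."""
    eps0dot, epsdot = ddeta(eps0), ddeta(eps)
    psi2 = lambda eta: psi(eta) + eps0dot(eta) / a(eta)
    phi2 = lambda eta: phi(eta) - adot(eta) / a(eta) ** 2 * eps0(eta)
    w2 = lambda eta: w(eta) + (epsdot(eta) - adot(eta) / a(eta) * eps(eta)) / a(eta)
    wt2 = lambda eta: wt(eta) + eps0(eta) / a(eta)
    h2 = lambda eta: h(eta) + eps(eta) / a(eta)
    return psi2, phi2, w2, wt2, h2

def invariance_residuals(eta):
    """Changes in (Psi, Phi, psi) at time eta under an arbitrary coordinate shift."""
    fields = (math.sin, math.cos, lambda x: x ** 2,
              lambda x: math.exp(-x), lambda x: math.sin(3.0 * x))
    eps0, eps = math.sin, lambda x: math.cos(2.0 * x)
    shifted = gauge_transform(*fields, eps0, eps)
    Psi, Phi = gauge_invariants(*fields)
    Psi2, Phi2 = gauge_invariants(*shifted)
    return (Psi2(eta) - Psi(eta), Phi2(eta) - Phi(eta),
            shifted[0](eta) - fields[0](eta))

if __name__ == "__main__":
    print(invariance_residuals(1.3))
```

The first two residuals should be zero to within finite-difference error, while the third, the change in $\psi$ alone, is of order the shift itself.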


29.4 Residual global gauge freedoms 


There are residual global gauge freedoms associated with (a) uncertainty in the cosmic scale factor $a(\eta)$ in
the background FLRW geometry, and (b) addition of spatially uniform but time-dependent contributions
to vierbein components that are spatial gradients in equations (29.13), namely $w$, $\tilde{w}$, $h$, $\tilde{h}$, $h_a$, and $\tilde{h}_a$.
The freedoms are global in the sense that they are spatially uniform functions of time $\eta$. The global gauge
freedoms mean that the scalar and vector perturbations $\Psi$, $\Phi$, and $W_a$ are gauge-invariant only up to the
addition of spatially uniform functions of time. The tensor perturbation $h_{ab}$ remains fully gauge-invariant.

To illustrate the global gauge freedoms, consider the line-element

$$ ds^2 = a(\eta)^2 \left\{ - [ 1 + \Psi(\eta) ]^2 \, d\eta^2 + [ 1 - \Phi(\eta) ]^2 \, \delta_{ab} \, dx^a dx^b \right\} , \tag{29.17} $$

in which $\Psi(\eta)$ and $\Phi(\eta)$ are functions only of conformal time $\eta$. A rescaling of the cosmic scale factor $a$,
together with a coordinate transformation of conformal time $\eta$,

$$ a \rightarrow a' = a ( 1 - \Phi ) , \tag{29.18a} $$

$$ d\eta \rightarrow d\eta' = \left( \frac{1 + \Psi}{1 - \Phi} \right) d\eta , \tag{29.18b} $$

brings the line-element (29.17) to FLRW form,

$$ ds^2 = a'(\eta')^2 \left( - d\eta'^2 + \delta_{ab} \, dx^a dx^b \right) . \tag{29.19} $$

The rescaling (29.18a) of the cosmic scale factor $a$ is distinct from any coordinate transformation, and constitutes
an additional global gauge freedom over and above the coordinate and tetrad gauge freedoms discussed
in §29.3. The transformation (29.18b) of the time coordinate is allowed because $\Psi$ and $\Phi$ are functions only
of time. The argument in §29.3 that $\Psi$ and $\Phi$ are gauge-invariant is spoiled because in the particular case
that the time coordinate shift $\epsilon_0$ is a function only of time $\eta$, the change in the perturbation $\tilde{w}$ is decoupled
from the change in $\epsilon_0$, because $\tilde{w}$ and $\epsilon_0$ appear only inside a spatial gradient in the transformation (29.13c).
The freedom to adjust $\tilde{w}$ by an amount depending only on time propagates into a freedom to adjust $\Psi$ and
$\Phi$, equations (29.16a) and (29.16b). More generally, the combination $w + \tilde{w} - \dot{h}$ upon which both $\Psi$ and $\Phi$
depend can be adjusted by adjusting any of $w$, $\tilde{w}$, or $h$ by an amount depending only on time, since all these
perturbations appear inside spatial gradients in equations (29.13). Similarly, the vector perturbation $W_a$,
equation (29.16c), can be adjusted by an amount depending only on time by adjusting either of $h_a$ or $\tilde{h}_a$.

Physically, the residual global gauge freedom in the scalar perturbations $\Psi$ and $\Phi$ reflects the impossibility
of distinguishing a perturbation of the mean from the mean. Any perturbation of the mean can be absorbed
into an adjustment of the parameters of the unperturbed background.

To what does the residual global gauge freedom in the vector perturbation $W_a$ correspond? Physically,
$W_a$ represents the velocity of dragging of the tetrad frame through the coordinates. A spatially uniform $W_a$
corresponds to a uniform velocity of the entire Universe, which is observationally undetectable.

Modes whose wavelengths are larger than the horizon size of an observer look spatially uniform to the
observer. The observer cannot distinguish such modes from a change in the parameters of the background
FLRW geometry. Thus an observer cannot measure the amplitudes $\Psi$, $\Phi$, $W_a$, or $h_{ab}$ of modes outside their
horizon.

Of course, an observer can measure modes that were outside the horizon of an earlier observer. For example, 
astronomers on Earth today can and do measure in both the CMB and in galaxy clustering “superhorizon” 
modes that were outside the horizon of an observer at the time of recombination. 

The residual global gauge freedoms mean that the intrinsic monopole mode of the observed CMB is unmeasurable,
being indistinguishable from a rescaling of the temperature of the FLRW background. Moreover
the intrinsic CMB dipole is unmeasurable, being indistinguishable from an adjustment of the rest frame of
the FLRW background.

In the remainder of this book, the perturbations $\Psi$, $\Phi$, $W_a$, and $h_{ab}$ will be referred to as gauge-invariant
on the understanding that this refers to modes that are measurable by (within the horizon of) the observer.


Concept question 29.1. Global curvature as a perturbation? The usual FLRW metric contains a
curvature constant $\kappa$ in addition to a cosmic scale factor $a$. Can curvature $\kappa$, if small, be treated as a
perturbation to a flat FLRW geometry, and if so, how? Does the curvature perturbation represent a residual
global gauge freedom? Answer. Yes, $\kappa$, if small, can be treated as a perturbation. The isotropic (Poincaré)
form of the FLRW line-element, equation (10.26), takes the form

$$ ds^2 = a(\eta)^2 \left( - d\eta^2 + \frac{\delta_{ab} \, dx^a dx^b}{\left( 1 + \frac{1}{4} \kappa x^2 \right)^2} \right) , \tag{29.20} $$

where $x^2 \equiv \sum_a x_a^2$ is the square of the comoving radial distance from the origin. If the curvature radius
$|\kappa|^{-1/2}$ is much larger than the horizon distance, $\sqrt{|\kappa|} \, \eta \ll 1$, then the curvature looks like a perturbation
proportional to the square of the comoving distance,

$$ \Phi = \tfrac{1}{4} \kappa x^2 . \tag{29.21} $$

Is this a residual global gauge freedom? Equation (29.21) states that only the sum $\frac{1}{4} \kappa x^2 + \Phi$ is gauge-invariant,
so yes there is a residual global gauge freedom associated with the ambiguity between $\kappa$ and $\Phi$.
In pre-1998 days when astronomers were measuring $\Omega_m \approx 0.3$ and only the reckless contemplated non-zero
$\Omega_\Lambda$, it was necessary to consider that Nature might have chosen a substantial curvature $\Omega_k \approx 0.7$, in which
case $\kappa$ was decidedly non-zero (and negative), certainly not a perturbation. Post dark-energy, observations
are stubbornly consistent with zero curvature. Occam’s razor would then prefer the simpler of two models
that fit the data, a flat background geometry $\kappa = 0$.


Concept question 29.2. Can the Universe at large rotate? Is it possible for a Universe to rotate
globally? What would be the observable signature, if any? Answer. Yes, the Universe could rotate globally.
Gauge-invariant rotational modes are described by the gauge-invariant vector gravitational potential $W_a$.
A non-vanishing vector gravitational potential would drive non-vanishing unpolarized and polarized vector
photon fluctuations $\Theta_{\ell, \pm 1}$ with $\ell \geq 1$ and ${}_2\Theta_{\ell, \pm 1}$ with $\ell \geq 2$. Unfortunately there is no clean observational
signal of such modes, because the observed CMB on the sky mixes scalar, vector, and tensor modes with
the same $\ell$ (this is the sum over $m$ in equation (36.31)). Vector modes are expected to be overwhelmed by
scalar modes in the unpolarized and E-mode polarized CMB, and by tensor modes in the B-mode polarized
CMB. The reason for the dominance of scalar and tensor over vector modes is that whereas scalar and
tensor gravitational potentials remain approximately constant for modes outside the horizon, the vector
gravitational potentials $W_a$ tend to redshift to zero as the Universe expands, equation (29.51). Thus vector
perturbations are usually negligible in standard cosmological models, §35.11.

29.5 Metric, tetrad connections, and Einstein tensor


This section gives expressions in a completely general gauge for perturbed quantities in the flat
Friedmann-Lemaître-Robertson-Walker background geometry.


29.5.1 Metric 


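The unperturbed metric is the FLRW metric (29.2). The perturbation $\overset{1}{g}_{\mu\nu}$ to the coordinate metric is,
equation (26.6),

$$ \overset{1}{g}_{00} = - a^2 \, \underbrace{2 \psi}_{\rm scalar} , \tag{29.22a} $$

$$ \overset{1}{g}_{0a} = - a^2 \bigl[ \underbrace{\nabla_a ( w + \tilde{w} )}_{\rm scalar} + \underbrace{( w_a + \tilde{w}_a )}_{\rm vector} \bigr] , \tag{29.22b} $$

$$ \overset{1}{g}_{ab} = - a^2 \bigl[ \underbrace{2 \phi \, \delta_{ab}}_{\rm scalar} + \underbrace{2 \nabla_a \nabla_b h}_{\rm scalar} + \underbrace{\nabla_a ( h_b + \tilde{h}_b ) + \nabla_b ( h_a + \tilde{h}_a )}_{\rm vector} + \underbrace{2 h_{ab}}_{\rm tensor} \bigr] . \tag{29.22c} $$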


The coordinate metric is tetrad gauge-invariant, but not coordinate gauge-invariant. 


29.5.2 Tetrad-frame connections 


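The tetrad-frame connections $\Gamma_{kmn}$ are obtained from the usual formula (11.54). The non-vanishing
unperturbed tetrad-frame connections are

$$ \overset{0}{\Gamma}_{0ab} = - \frac{\dot{a}}{a^2} \, \delta_{ab} . \tag{29.23} $$

The perturbations $\overset{1}{\Gamma}_{kmn}$ to the tetrad-frame connections are

$$ \overset{1}{\Gamma}_{0a0} = \frac{1}{a} \Bigl[ \underbrace{- \nabla_a \Bigl( \psi - \Bigl( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \Bigr) \tilde{w} \Bigr)}_{\rm scalar} + \underbrace{\Bigl( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \Bigr) \tilde{w}_a}_{\rm vector} \Bigr] , \tag{29.24a} $$

$$ \overset{1}{\Gamma}_{0ab} = \frac{1}{a} \Bigl[ \underbrace{F \, \delta_{ab}}_{\rm scalar} - \underbrace{\nabla_a \nabla_b ( w - \dot{h} )}_{\rm scalar} - \underbrace{\tfrac{1}{2} ( \nabla_a W_b + \nabla_b W_a )}_{\rm vector} + \underbrace{\nabla_b \tilde{w}_a}_{\rm vector} + \underbrace{\dot{h}_{ab}}_{\rm tensor} \Bigr] , \tag{29.24b} $$

$$ \overset{1}{\Gamma}_{ab0} = \frac{1}{a} \Bigl[ \underbrace{\tfrac{1}{2} ( \nabla_a W_b - \nabla_b W_a )}_{\rm vector} - \frac{\partial}{\partial \eta} \bigl( \underbrace{\varepsilon_{abc} \nabla_c \tilde{h}}_{\rm scalar} - \underbrace{\nabla_a \tilde{h}_b + \nabla_b \tilde{h}_a}_{\rm vector} \bigr) \Bigr] , \tag{29.24c} $$

$$ \overset{1}{\Gamma}_{abc} = \frac{1}{a} \Bigl\{ \underbrace{( \delta_{bc} \nabla_a - \delta_{ac} \nabla_b ) \Bigl( \phi + \frac{\dot{a}}{a} \tilde{w} \Bigr)}_{\rm scalar} - \underbrace{\frac{\dot{a}}{a} ( \delta_{ac} \delta_{bd} - \delta_{bc} \delta_{ad} ) \tilde{w}_d}_{\rm vector} - \nabla_c \bigl( \underbrace{\varepsilon_{abd} \nabla_d \tilde{h}}_{\rm scalar} - \underbrace{\nabla_a \tilde{h}_b + \nabla_b \tilde{h}_a}_{\rm vector} \bigr) + \underbrace{\nabla_a h_{bc} - \nabla_b h_{ac}}_{\rm tensor} \Bigr\} , \tag{29.24d} $$

where $F$ is defined by

$$ F \equiv \frac{\dot{a}}{a} \psi + \dot{\phi} . \tag{29.25} $$

Equations (29.24) show that the perturbations $\overset{1}{\Gamma}_{kmn}$ of the tetrad-frame connections depend on all 12 of the
coordinate gauge-invariant potentials (29.15). The only non-coordinate-gauge-invariant dependence of the
tetrad-frame connections is on $F$ defined by equation (29.25). The quantity $F$ transforms under a coordinate
gauge transformation (26.9) as, from equations (29.14),

$$ F \rightarrow F - \epsilon_0 \frac{d}{d\eta} \frac{\dot{a}}{a^2} . \tag{29.26} $$

Thus perturbations $\overset{1}{\Gamma}_{0a0}$, $\overset{1}{\Gamma}_{0ab}$ with $a \neq b$, $\overset{1}{\Gamma}_{ab0}$, and $\overset{1}{\Gamma}_{abc}$ are coordinate gauge-invariant, while the
transformation (29.26) of $F$ implies that $\overset{1}{\Gamma}_{0ab}$ with $a = b$ transforms under an infinitesimal coordinate
transformation (26.9) as

$$ \overset{1}{\Gamma}_{0ab} \rightarrow \overset{1}{\Gamma}_{0ab} - \frac{\epsilon_0}{a} \frac{d}{d\eta} \frac{\dot{a}}{a^2} \, \delta_{ab} . \tag{29.27} $$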
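The transformation (29.26) of $F$ can be verified in a few lines from the definition $F \equiv (\dot{a}/a) \psi + \dot{\phi}$ together with the transformations $\psi \rightarrow \psi + (1/a) \, \partial \epsilon_0 / \partial \eta$ and $\phi \rightarrow \phi - (\dot{a}/a^2) \epsilon_0$ of equations (29.14). The intermediate steps below are spelled out here as a worked check:

```latex
\begin{align*}
F \;\rightarrow\; F'
  &= \frac{\dot a}{a} \left( \psi + \frac{1}{a} \frac{\partial \epsilon_0}{\partial \eta} \right)
   + \frac{\partial}{\partial \eta} \left( \phi - \frac{\dot a}{a^2} \, \epsilon_0 \right) \\
  &= F + \frac{\dot a}{a^2} \frac{\partial \epsilon_0}{\partial \eta}
       - \frac{\dot a}{a^2} \frac{\partial \epsilon_0}{\partial \eta}
       - \epsilon_0 \frac{d}{d\eta} \frac{\dot a}{a^2} \\
  &= F - \epsilon_0 \frac{d}{d\eta} \frac{\dot a}{a^2} .
\end{align*}
```

The terms involving the derivative of $\epsilon_0$ cancel, leaving a change proportional to $\epsilon_0$ itself. Multiplying by the factor $\delta_{ab} / a$ with which $F$ enters $\overset{1}{\Gamma}_{0ab}$ gives the transformation (29.27).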


The transformation of the tetrad-frame connections under coordinate transformations can be checked
another way. According to the rule established in §26.7, the change in a quantity under an infinitesimal
coordinate gauge transformation equals minus its Lie derivative $\mathcal{L}_\epsilon$ with respect to the infinitesimal
coordinate shift $\epsilon$. Any quantity that vanishes in the unperturbed background has, to linear order, vanishing
Lie derivative, so is coordinate gauge-invariant. Thus the perturbations $\overset{1}{\Gamma}_{0a0}$, $\overset{1}{\Gamma}_{ab0}$, and $\overset{1}{\Gamma}_{abc}$ are coordinate
gauge-invariant, confirming the previous conclusion. The only tetrad-frame connections that are finite in the
unperturbed background, and are therefore not coordinate gauge-invariant, are $\Gamma_{0ab}$. Although tetrad-frame
connections are generically not tetrad-frame tensors, the unperturbed connection $\overset{0}{\Gamma}_{0ab} = - (\dot{a}/a^2) \delta_{ab}$,
equation (29.23), is a tetrad-frame tensor, because the spatial unit matrix $\delta_{ab}$ can be expressed as the tensor
$u_m u_n + \eta_{mn}$, where $u_m$ is the tetrad-frame 4-velocity of the Lorentz-transformed tetrad frame relative to the
rest tetrad frame. The tetrad-frame connections $\Gamma_{0ab}$ transform as

$$ \overset{1}{\Gamma}_{0ab} \rightarrow \overset{1}{\Gamma}_{0ab} - \mathcal{L}_\epsilon \overset{0}{\Gamma}_{0ab} , \quad \mathcal{L}_\epsilon \overset{0}{\Gamma}_{0ab} = \epsilon^k \partial_k \overset{0}{\Gamma}_{0ab} = - \frac{\epsilon^0}{a} \frac{d}{d\eta} \frac{\dot{a}}{a^2} \, \delta_{ab} = \frac{\epsilon_0}{a} \frac{d}{d\eta} \frac{\dot{a}}{a^2} \, \delta_{ab} , \tag{29.28} $$


in agreement with the transformation (29.27). 


29.5.3 Tetrad-frame Einstein tensor 


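The tetrad-frame Einstein tensor $G_{mn}$ follows from the usual formulae (11.61), (11.78), and (11.80). The
unperturbed tetrad-frame Einstein tensor $\overset{0}{G}_{mn}$ is (equations (29.29) differ from equations (10.29) because
the time coordinate here is the conformal time $\eta$, not the cosmic time $t$)

$$ \overset{0}{G}_{00} = 3 \frac{\dot{a}^2}{a^4} , \tag{29.29a} $$

$$ \overset{0}{G}_{0a} = 0 , \tag{29.29b} $$

$$ \overset{0}{G}_{ab} = \left( - 2 \frac{\ddot{a}}{a^3} + \frac{\dot{a}^2}{a^4} \right) \delta_{ab} . \tag{29.29c} $$

The perturbation $\overset{1}{G}_{mn}$ of the tetrad-frame Einstein tensor is

$$ \overset{1}{G}_{00} = \frac{1}{a^2} \Bigl[ \underbrace{- 6 \frac{\dot{a}}{a} F + 2 \nabla^2 \Phi}_{\rm scalar} \Bigr] , \tag{29.30a} $$

$$ \overset{1}{G}_{0a} = \frac{1}{a^2} \Bigl[ \underbrace{2 \nabla_a \Bigl( F + \Bigl( \frac{\ddot{a}}{a} - 2 \frac{\dot{a}^2}{a^2} \Bigr) \tilde{w} \Bigr)}_{\rm scalar} + \underbrace{\tfrac{1}{2} \nabla^2 W_a + 2 \Bigl( \frac{\ddot{a}}{a} - 2 \frac{\dot{a}^2}{a^2} \Bigr) \tilde{w}_a}_{\rm vector} \Bigr] , \tag{29.30b} $$

$$ \overset{1}{G}_{ab} = \frac{1}{a^2} \Bigl\{ \underbrace{\Bigl[ 2 \Bigl( \frac{\partial}{\partial \eta} + 2 \frac{\dot{a}}{a} \Bigr) F + 2 \Bigl( \frac{\ddot{a}}{a} - 2 \frac{\dot{a}^2}{a^2} \Bigr) \psi \Bigr] \delta_{ab} - ( \nabla_a \nabla_b - \delta_{ab} \nabla^2 ) ( \Psi - \Phi )}_{\rm scalar} + \underbrace{\tfrac{1}{2} \Bigl( \frac{\partial}{\partial \eta} + 2 \frac{\dot{a}}{a} \Bigr) ( \nabla_a W_b + \nabla_b W_a )}_{\rm vector} - \underbrace{\Bigl( \frac{\partial^2}{\partial \eta^2} + 2 \frac{\dot{a}}{a} \frac{\partial}{\partial \eta} - \nabla^2 \Bigr) h_{ab}}_{\rm tensor} \Bigr\} . \tag{29.30c} $$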


According to the rule established in §26.7, the variation of the Einstein tensor under a coordinate
transformation equals minus its Lie derivative,

$$ \overset{1}{G}_{mn} \rightarrow \overset{1}{G}_{mn} - \mathcal{L}_\epsilon \overset{0}{G}_{mn} . \tag{29.31} $$

Consequently, as with the tetrad-frame connections, the tetrad-frame Einstein components that vanish in
the background, namely the off-diagonal components $G_{mn}$ with $m \neq n$, are coordinate gauge-invariant, while
the components that are finite in the background, namely the diagonal components $G_{mn}$ with $m = n$, are
not coordinate gauge-invariant. The variations of the non-coordinate-gauge-invariant Einstein components
under an infinitesimal coordinate transformation (26.9) are

$$ \mathcal{L}_\epsilon \overset{0}{G}_{00} = \epsilon^k \partial_k \overset{0}{G}_{00} = - \frac{\epsilon_0}{a} \frac{d}{d\eta} \frac{3 \dot{a}^2}{a^4} , \tag{29.32a} $$

$$ \mathcal{L}_\epsilon \overset{0}{G}_{ab} = \epsilon^k \partial_k \overset{0}{G}_{ab} = \frac{\epsilon_0}{a} \frac{d}{d\eta} \left( \frac{2 \ddot{a}}{a^3} - \frac{\dot{a}^2}{a^4} \right) \delta_{ab} . \tag{29.32b} $$

It can be checked that the same transformations of the tetrad-frame Einstein components under a coordinate
transformation follow from the expressions (29.30) for the perturbed Einstein components and the coordinate
transformations (29.14) of the potentials.

The time-time and space-space perturbations $\overset{1}{G}_{00}$ and $\overset{1}{G}_{ab}$ are tetrad gauge-invariant, as follows from the
fact that these components depend only on symmetric combinations of the vierbein potentials. However, the
time-space perturbations $\overset{1}{G}_{0a}$ are not tetrad gauge-invariant, as is evident from the fact that equation (29.30b)
involves the non-tetrad-gauge-invariant perturbations $\tilde{w}$ and $\tilde{w}_a$. Physically, under a tetrad boost by a velocity
$v$ of linear order, the time-space components $G_{0a}$ change by first order $v$, but $G_{00}$ and $G_{ab}$ change only to
second order $v^2$. Thus to linear order, only $G_{0a}$ changes under a tetrad boost. Note that $G_{0a}$ changes under
a tetrad boost ($\tilde{w}$ and $\tilde{w}_a$), but not under a tetrad rotation ($\tilde{h}$ and $\tilde{h}_a$).

29.6 Gauge choices


Since only the 6 tetrad and coordinate gauge-invariant potentials $\Psi$, $\Phi$, $W_a$, and $h_{ab}$ have physical significance,
it is legitimate to choose a particular gauge, a set of conditions on the non-gauge-invariant potentials,
arranged to simplify the equations, or to bring out some physical aspect.

This book for the most part uses the conformal Newtonian gauge, §29.8, which is constructed so as to 
retain only physical perturbations. 


29.7 ADM gauge choices 


The ADM (3+1) formalism, Chapter 17, chooses the tetrad time axis $\boldsymbol{\gamma}_0$ to be orthogonal to hypersurfaces
of constant time, $\eta$ = constant, equivalent to requiring that the tetrad time axis be orthogonal to each of
the spatial coordinate axes, $\boldsymbol{\gamma}_0 \cdot \boldsymbol{e}_\alpha = 0$, equation (17.2). The ADM choice is equivalent to setting

$$ \tilde{w} = \tilde{w}_a = 0 . \tag{29.33} $$

The ADM choice simplifies the tetrad-frame connections (29.24) and the time-space component $G_{0a}$ of the
tetrad-frame Einstein tensor, equation (29.30b). The ADM lapse $\alpha$ and shift $\beta^a$ are

$$ \alpha = a ( 1 + \psi ) , \quad \beta^a = \nabla_a w + w_a . \tag{29.34} $$

Another gauge choice that significantly simplifies the tetrad connections (29.24), though does not affect
the Einstein tensor (29.30), is

$$ \tilde{h} = \tilde{h}_a = 0 . \tag{29.35} $$

If the wavevector $\boldsymbol{k}$ is taken along the coordinate $z$-direction, then the gauge choice $\tilde{h}_a = 0$ is equivalent to
choosing the tetrad 3-axis ($z$-axis) $\boldsymbol{\gamma}_3$ to be orthogonal to the coordinate $x$ and $y$-axes, $\boldsymbol{\gamma}_3 \cdot \boldsymbol{e}_x = \boldsymbol{\gamma}_3 \cdot \boldsymbol{e}_y = 0$. The
gauge choice $\tilde{h} = 0$ is equivalent to rotating the tetrad axes about the 3-axis ($z$-axis) so that $\boldsymbol{\gamma}_1 \cdot \boldsymbol{e}_y = \boldsymbol{\gamma}_2 \cdot \boldsymbol{e}_x$.


29.8 Conformal Newtonian (Copernican) gauge 


The most physical gauge is one in which the 6 perturbations retained coincide with the 6 coordinate and tetrad 
gauge-invariant perturbations (29.16). This gauge is called conformal Newtonian gauge, analogously to 
the Newtonian gauge of Minkowski space, §27.8. Because in conformal Newtonian gauge the perturbations are 
precisely the physical perturbations, if the perturbations are physically weak (small), then the perturbations 
in conformal Newtonian gauge will necessarily be small. 

I think conformal Newtonian gauge should be called conformal Copernican gauge, for the same reason 
that Newtonian gauge should be called Copernican gauge, §27.8. Dynamically, collapsed systems such as 
galaxies or solar systems are highly nonlinear systems, but gravitationally they are weakly perturbed systems.
Conformal Newtonian (Copernican) gauge keeps the coordinates aligned with the unperturbed FLRW
comoving coordinates even in highly nonlinear systems. Conformal Newtonian gauge breaks down only in
gravitationally nonlinear systems such as black holes.

Conformal Newtonian (Copernican) gauge in an FLRW background makes the same gauge choices as
Newtonian gauge in a Minkowski background, equation (27.58),

$$ w = \tilde{w} = \tilde{w}_a = h = \tilde{h} = h_a = \tilde{h}_a = 0 , \tag{29.36} $$

so that the retained perturbations are the 6 coordinate and tetrad gauge-invariant perturbations (29.16),

$$ \underbrace{\Psi}_{\rm scalar} = \psi , \tag{29.37a} $$

$$ \underbrace{\Phi}_{\rm scalar} = \phi , \tag{29.37b} $$

$$ \underbrace{W_a}_{\rm vector} = w_a , \tag{29.37c} $$

$$ \underbrace{h_{ab}}_{\rm tensor} . \tag{29.37d} $$


In conformal Newtonian gauge, the quantity $F$ defined by equation (29.25) becomes the coordinate and
tetrad gauge-invariant quantity

$$ F \equiv \frac{\dot{a}}{a} \Psi + \dot{\Phi} . \tag{29.38} $$

The conformal Newtonian metric is

$$ ds^2 = a^2 \left\{ - ( 1 + 2 \Psi ) \, d\eta^2 - 2 W_a \, d\eta \, dx^a + [ \delta_{ab} ( 1 - 2 \Phi ) - 2 h_{ab} ] \, dx^a dx^b \right\} . \tag{29.39} $$

Various tetrad-frame connections $\Gamma_{kmn}$, equations (29.24), define the acceleration $\kappa_a \equiv \Gamma_{a00}$
and extrinsic curvature $K_{ab} \equiv \Gamma_{a0b} = - \Gamma_{0ab}$. The trace, antisymmetric, and traceless symmetric parts of the
extrinsic curvature define the expansion, vorticity, and shear, equations (18.16), which play a key role in the
Raychaudhuri equations, §18.2. Also relevant is the precession $\Gamma_{ab0} = - \Gamma_{ba0}$ (not to be confused with the
vorticity). In conformal Newtonian gauge, the acceleration, expansion, vorticity, shear, and precession are

$$ \text{acceleration} \quad \kappa_a \equiv \Gamma_{a00} = \frac{1}{a} \underbrace{\nabla_a \Psi}_{\rm scalar} , \tag{29.40a} $$

$$ \text{expansion} \quad \vartheta \equiv \Gamma^a{}_{0a} = \frac{3}{a} \left( \frac{\dot{a}}{a} - F \right) , \tag{29.40b} $$

$$ \text{vorticity} \quad \varpi_{ab} \equiv - \Gamma_{0[ab]} = 0 , \tag{29.40c} $$

$$ \text{shear} \quad \sigma_{ab} \equiv - \Gamma_{0(ab)} = \frac{1}{2a} ( \nabla_a W_b + \nabla_b W_a ) , \tag{29.40d} $$

$$ \text{precession} \quad \Gamma_{ab0} = \frac{1}{2a} ( \nabla_a W_b - \nabla_b W_a ) . \tag{29.40e} $$

29.8.1 Conformal Newtonian gauge: energy-momentum conservation


The unperturbed components $\overset{0}{T}{}^{mn}$ of the tetrad-frame energy-momentum comprise the energy density $\bar{\rho}(\eta)$
and isotropic pressure $\bar{p}(\eta)$ of the FLRW background,

$$ \overset{0}{T}{}^{00} \equiv \bar{\rho} , \tag{29.41a} $$

$$ \overset{0}{T}{}^{0a} = 0 , \tag{29.41b} $$

$$ \overset{0}{T}{}^{ab} \equiv \bar{p} \, \delta_{ab} . \tag{29.41c} $$

The perturbed components $T^{mn}$ of the tetrad-frame energy-momentum are the energy density $\rho$, the energy
flux $f_a$, and the pressure $p_{ab}$,

$$ T^{00} \equiv \rho = \bar{\rho} + \delta\rho , \tag{29.42a} $$

$$ T^{0a} \equiv f_a , \tag{29.42b} $$

$$ T^{ab} \equiv p_{ab} = \bar{p} \, \delta_{ab} + \delta p_{ab} . \tag{29.42c} $$

In perturbation theory, the perturbations $\delta\rho$, $f_a$, and $\delta p_{ab}$ are treated as of linear order. The trace of the
spatial energy-momentum defines the isotropic pressure $p$,

$$ \tfrac{1}{3} \delta_{ab} T^{ab} \equiv p = \bar{p} + \delta p . \tag{29.43} $$

In conformal Newtonian gauge, the equations of conservation of energy and momentum are to linear order

$$ D_m T^{m0} = \frac{1 - \Psi}{a} \left[ \frac{\partial \rho}{\partial \eta} + \nabla_a f_a + 3 ( \rho + p ) \left( \frac{\dot{a}}{a} - \dot{\Phi} \right) \right] = 0 , \tag{29.44a} $$

$$ D_m T^{ma} = \frac{1}{a} \left[ \frac{\partial f_a}{\partial \eta} + 4 \frac{\dot{a}}{a} f_a + \nabla_b p_{ab} + ( \bar{\rho} + \bar{p} ) \nabla_a \Psi \right] = 0 . \tag{29.44b} $$

Notice that the energy-momentum conservation equations (29.44) involve only the scalar potentials $\Psi$ and
$\Phi$, not the vector or tensor potentials $W_a$ or $h_{ab}$. The energy equation (29.44a) has only a scalar component,
while the momentum equation (29.44b) has both scalar and vector components, Exercise 29.3. The energy
conservation equation (29.44a) has an unperturbed part,

$$ \overset{0}{D_m T^{m0}} = \frac{1}{a} \left[ \frac{\partial \bar{\rho}}{\partial \eta} + 3 ( \bar{\rho} + \bar{p} ) \frac{\dot{a}}{a} \right] = 0 . \tag{29.45} $$

Any fluid component that conserves energy-momentum satisfies equations similar to (29.44). For a fluid
component with equation of state $p / \rho = w$ = constant, the unperturbed energy conservation equation (29.45)
recovers the usual result that $\bar{\rho} \propto a^{-3(1+w)}$.
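The scaling of the energy density with $a$ can be confirmed by integrating the unperturbed energy equation (29.45) directly. The sketch below rewrites (29.45) with $\ln a$ as the independent variable, using $\partial / \partial \eta = (\dot{a}/a) \, \partial / \partial \ln a$ and $p = w \rho$, so that $d\rho / d \ln a = - 3 (1 + w) \rho$. The midpoint integrator, the normalization, and the function names are choices made here for illustration.

```python
import math

def evolve_density(w, a_start, a_end, steps=10000):
    """Integrate d(rho)/d(ln a) = -3 (1 + w) rho with a midpoint (RK2) scheme.

    The density is normalised to rho = 1 at a = a_start.
    """
    rho = 1.0
    dlna = (math.log(a_end) - math.log(a_start)) / steps
    rate = -3.0 * (1.0 + w)
    for _ in range(steps):
        mid = rho + 0.5 * dlna * rate * rho   # estimate at the half step
        rho += dlna * rate * mid              # full step using the midpoint slope
    return rho

def power_law(w, a_start, a_end):
    """Analytic expectation rho proportional to a^{-3(1+w)}, same normalisation."""
    return (a_end / a_start) ** (-3.0 * (1.0 + w))

if __name__ == "__main__":
    for w in (0.0, 1.0 / 3.0, -1.0):
        print(w, evolve_density(w, 1.0, 10.0), power_law(w, 1.0, 10.0))
```

The three cases correspond to matter ($w = 0$), radiation ($w = 1/3$), and a cosmological constant ($w = -1$), for which the density falls as $a^{-3}$, falls as $a^{-4}$, and stays constant respectively.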


Concept question 29.3. Scalar, vector, tensor components of energy-momentum conservation.
What are the scalar, vector, and tensor components of the energy-momentum conservation equations (29.44)?
Answer. The energy conservation equation (29.44a) contains only scalar components. The momentum
conservation equation (29.44b) contains scalar and vector components, but no tensor component. The scalar
component of the pressure is the sum of an isotropic part $p \, \delta_{ab}$, and a traceless quadrupole part which is
discussed in §32.6. The vector component of the pressure takes the form

$$ \underset{\rm vector}{p_{ab}} = \nabla_a p_{\perp,b} + \nabla_b p_{\perp,a} , \tag{29.46} $$

where $p_{\perp,a}$ is transverse, $\nabla_a p_{\perp,a} = 0$. The vector part of the momentum conservation equation (29.44b) is

$$ \underset{\rm vector}{D_m T^{ma}} = \frac{1}{a} \left( \frac{\partial f_{\perp,a}}{\partial \eta} + 4 \frac{\dot{a}}{a} f_{\perp,a} + \nabla^2 p_{\perp,a} \right) = 0 . \tag{29.47} $$

If the vector pressure is negligible, $p_{\perp,a} = 0$, then the vector momentum conservation equation (29.47) implies
that

$$ f_{\perp,a} \propto a^{-4} . \tag{29.48} $$

The tensor component of the pressure is traceless and transverse. Being traceless, it makes no contribution
to the isotropic pressure $p$, and being transverse, it satisfies $\nabla_b p_{ab} = 0$. Consequently the
energy-momentum conservation equations (29.44) contain no tensor component.


29.8.2 Conformal Newtonian gauge: scalar Einstein equations 


In conformal Newtonian gauge, the scalar perturbations of the Einstein equations are, from the
expressions (29.30) for the Einstein tensor, the energy density, energy flux, monopole pressure, and quadrupole
pressure equations,

$$ - 3 \frac{\dot{a}}{a} F - k^2 \Phi = 4 \pi G a^2 \, \overset{1}{T}{}^{00} , \tag{29.49a} $$

$$ i k F = 4 \pi G a^2 \, \hat{k}_a T^{0a} , \tag{29.49b} $$

$$ \dot{F} + 2 \frac{\dot{a}}{a} F + \left( \frac{\ddot{a}}{a} - 2 \frac{\dot{a}^2}{a^2} \right) \Psi - \frac{k^2}{3} ( \Psi - \Phi ) = \frac{4}{3} \pi G a^2 \, \delta_{ab} \overset{1}{T}{}^{ab} , \tag{29.49c} $$

$$ k^2 ( \Psi - \Phi ) = 8 \pi G a^2 \left( \tfrac{3}{2} \hat{k}_a \hat{k}_b - \tfrac{1}{2} \delta_{ab} \right) T^{ab} . \tag{29.49d} $$

The perturbation overscript 1 has been omitted from the right hand sides of equations (29.49b) and (29.49d)
since the unperturbed energy-momentum vanishes for these components. All 4 of the scalar Einstein
equations (29.49) are expressed in terms of gauge-invariant variables, and are therefore fully gauge-invariant.

If the energy-momentum tensors of the various matter components are arranged so as to conserve overall
energy-momentum, as they should, then 2 of the 4 equations (29.49a)-(29.49d) are redundant, since they
serve simply to enforce conservation of energy and scalar momentum. Usually the 1st equation, the energy
equation (29.49a), and the 4th equation, the quadrupole pressure equation (29.49d), are most convenient
to retain. But sometimes the 2nd equation, the scalar momentum equation (29.49b), is more convenient in
place of the energy equation (29.49a).

29.8.3 Conformal Newtonian gauge: vector Einstein equations


The vector (spin-1) Einstein equations in conformal Newtonian gauge are, from the expressions (29.30) for
the Einstein tensor,

$$ \nabla^2 W_a = - 16 \pi G a^2 \underset{\rm vector}{T^{0a}} , \tag{29.50a} $$

$$ \left( \frac{\partial}{\partial \eta} + 2 \frac{\dot{a}}{a} \right) ( \nabla_a W_b + \nabla_b W_a ) = 16 \pi G a^2 \underset{\rm vector}{T^{ab}} . \tag{29.50b} $$

If the overall matter energy-momentum is conserved, as it must be, then either equation (29.50a) or
equation (29.50b) can be discarded as redundant, since the two equations together serve to enforce conservation
of (the vector components of) overall energy-momentum.

In the absence of a vector source of pressure, $\underset{\rm vector}{T^{ab}} = \underset{\rm vector}{p_{ab}} = 0$, the Einstein equation (29.50b) ensures
that the vector perturbation redshifts as $a^{-2}$,

$$ W_a \propto a^{-2} \quad \text{if} \quad \underset{\rm vector}{T^{ab}} = 0 . \tag{29.51} $$

The same conclusion follows from the other vector Einstein equation (29.50a). If the vector pressure vanishes,
then the vector momentum conservation equation (29.47) ensures that the vector energy flux $\underset{\rm vector}{T^{0a}} = f_{\perp,a}$
redshifts as $f_{\perp,a} \propto a^{-4}$, which when plugged into the Einstein equation (29.50a) implies $W_a \propto a^{-2}$.

In practice, collisions in the early post-inflation Universe tend to isotropize particle distributions, driving
not only the pressure but also the bulk velocity to zero, as discussed in more detail in §35.11. If the bulk
velocity vanishes, so $f_a = 0$, then the Einstein equation (29.50a) forces the vector potential to vanish, $W_a = 0$.

The tendency of vector perturbations to redshift away has the consequence that vector perturbations are 
usually negligible in standard cosmological models. 
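The decay of a source-free vector perturbation can be illustrated by integrating the vector Einstein equation (29.50b) for a single Fourier mode, for which the spatial gradients are algebraic and the equation reduces to $( \partial / \partial \eta + 2 \dot{a}/a ) W_a = 0$. The sketch below assumes a radiation-dominated background, $a \propto \eta$, purely for illustration; the integrator and function names are hypothetical.

```python
def conformal_hubble(eta):
    """adot / a for the assumed radiation-dominated background a = eta."""
    return 1.0 / eta

def evolve_vector_potential(W0, eta0, eta1, steps=2000):
    """RK4 integration of dW/d(eta) = -2 (adot/a) W from eta0 to eta1."""
    rhs = lambda eta, W: -2.0 * conformal_hubble(eta) * W
    d_eta = (eta1 - eta0) / steps
    eta, W = eta0, W0
    for _ in range(steps):
        k1 = rhs(eta, W)
        k2 = rhs(eta + 0.5 * d_eta, W + 0.5 * d_eta * k1)
        k3 = rhs(eta + 0.5 * d_eta, W + 0.5 * d_eta * k2)
        k4 = rhs(eta + d_eta, W + d_eta * k3)
        W += d_eta * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        eta += d_eta
    return W

if __name__ == "__main__":
    # With a = eta, the product W * a^2 should stay equal to its initial value.
    for eta1 in (2.0, 5.0, 10.0):
        W1 = evolve_vector_potential(1.0, 1.0, eta1)
        print(eta1, W1, W1 * eta1 ** 2)
```

A tenfold growth of the scale factor reduces the vector potential by a factor of a hundred, which is the redshifting described by equation (29.51).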


29.8.4 Conformal Newtonian gauge: tensor Einstein equations 


The tensor (spin-2) Einstein equations in conformal Newtonian gauge are, from the expressions (29.30) for
the Einstein tensor,

$$ \left( \frac{\partial^2}{\partial \eta^2} + 2 \frac{\dot{a}}{a} \frac{\partial}{\partial \eta} - \nabla^2 \right) h_{ab} = - 8 \pi G a^2 \underset{\rm tensor}{T^{ab}} . \tag{29.52} $$

Whereas vector perturbations necessarily redshift as $W_a \propto a^{-2}$ in the absence of a source, tensor
perturbations $h_{ab}$ at superhorizon wavelengths $k\eta \ll 1$ have a solution where they are constant,

$$ h_{ab} = \text{constant} \quad \text{for} \quad k\eta \ll 1 \quad \text{if} \quad \underset{\rm tensor}{T^{ab}} = 0 . \tag{29.53} $$

Inflation generates tensor modes, which describe gravitational waves. In contrast to vector modes, long
wavelength gravitational waves generated during inflation can survive to the present time. Gravitational
waves leave an observable imprint in the B-mode polarization of the cosmic microwave background. A
detection of B-mode polarization was claimed by the BICEP2 collaboration (Ade et al., 2014), but the
signal may have been from aligned galactic dust rather than primordial (Ade et al., 2015). The cosmic gravitational
wave background could potentially be observed directly in the future.


Exercise 29.4. Evolution of tensor perturbations (gravitational waves) in FLRW spacetimes. 
Show that equation (29.52) can be rewritten in Fourier space
\[ \left( \frac{\partial^2}{\partial\eta^2} - \frac{\ddot a}{a} + k^2 \right) ( a h_{ab} ) = - 8\pi G a^3 \underset{\rm tensor}{T^{ab}} . \tag{29.54} \]
What is the solution of equation (29.54) if there is no tensor source, $\underset{\rm tensor}{T^{ab}} = 0$, and the
background energy-momentum is dominated by a species with equation of state $p/\rho = w = \mbox{constant}$?
Plot the solution in the radiation-dominated regime, subject to the condition that $h_{ab}$ is initially finite.

Solution. From equation (10.83) it follows that, for background energy-momentum dominated by a single
species with $p/\rho = w = \mbox{constant}$,
\[ \frac{\dot a}{a} = \frac{2}{(1+3w)\eta} , \quad \frac{\ddot a}{a} = \frac{2(1-3w)}{(1+3w)^2 \eta^2} . \tag{29.55} \]
The tensor evolution equation (29.54) in the absence of sources becomes
\[ \left( \frac{\partial^2}{\partial\eta^2} - \frac{2(1-3w)}{(1+3w)^2 \eta^2} + k^2 \right) ( a h_{ab} ) = 0 . \tag{29.56} \]
The solution of equation (29.56) is a linear combination of Bessel functions $J_{\pm n}$ (for $w < -1/3$, replace $\eta$
with its magnitude $|\eta|$),
\[ h_{ab} = (k\eta)^{-n} \left[ A_+ J_n(k\eta) + A_- J_{-n}(k\eta) \right] \tag{29.57} \]
of order
\[ n = \frac{3(1-w)}{2(1+3w)} . \tag{29.58} \]
Special cases are
\[ n = \left\{ \begin{array}{ll} \frac{1}{2} & w = \frac{1}{3} , \\ \frac{3}{2} & w = 0 , \\ -\frac{3}{2} & w = -1 , \end{array} \right. \tag{29.59} \]
in which case the solution reduces to spherical Bessel functions. The solution that is finite at $\eta \rightarrow 0$ is, for
$n > 0$, the $A_+$ component. Normalized to 1 at $\eta = 0$, the finite solution is
\[ h_{ab} = \Gamma(1+n) \left( \frac{k\eta}{2} \right)^{-n} J_n(k\eta) \rightarrow \left\{ \begin{array}{ll} 1 & k\eta \ll 1 , \\ \frac{\Gamma(1+n)}{\sqrt{\pi}} \left( \frac{k\eta}{2} \right)^{-(n+1/2)} \cos\left[ k\eta - (n + \tfrac{1}{2}) \pi/2 \right] & k\eta \gg 1 . \end{array} \right. \tag{29.60} \]
Since
\[ \eta^{n+1/2} \propto a , \tag{29.61} \]
the solution goes as $h_{ab} \sim a^{-1} \cos(k\eta + \mbox{constant})$ at large $k\eta$. Physically, the gravitational wave amplitude
$h_{ab}$ is constant well outside the horizon, $k\eta \ll 1$, while it redshifts as $1/a$ well inside the horizon, $k\eta \gg 1$.
Figure 29.1 illustrates the evolution of the tensor potential $h_{ab}$ in the radiation-dominated regime.

Figure 29.1 Evolution of the tensor potential $h_{ab}$ in the radiation-dominated regime, where $w = 1/3$, $n = 1/2$, and
$h_{ab} \propto \sin(k\eta)/(k\eta)$.
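
The radiation-dominated solution quoted in the caption of Figure 29.1 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the source-free tensor equation in the Fourier-space form $\ddot h + 2 (\dot a/a) \dot h + k^2 h = 0$ with $\dot a/a = 1/\eta$ in the radiation-dominated regime, and it uses centred finite differences to confirm that $\sin(k\eta)/(k\eta)$ satisfies it. The function names and the step size are illustrative choices.

```python
import math

def h_radiation(k_eta):
    """Tensor amplitude sin(k*eta)/(k*eta), normalized to 1 as k*eta -> 0."""
    if abs(k_eta) < 1e-8:
        return 1.0
    return math.sin(k_eta) / k_eta

def residual(k, eta, step=1e-4):
    """Finite-difference value of h'' + (2/eta) h' + k^2 h, which should vanish.

    Assumes radiation domination, where the conformal Hubble rate is 1/eta.
    """
    h0 = h_radiation(k * eta)
    hp = h_radiation(k * (eta + step))
    hm = h_radiation(k * (eta - step))
    first = (hp - hm) / (2.0 * step)            # centred first derivative
    second = (hp - 2.0 * h0 + hm) / step**2     # centred second derivative
    return second + (2.0 / eta) * first + k**2 * h0

if __name__ == "__main__":
    for eta in (0.1, 1.0, 10.0):
        print(eta, h_radiation(eta), residual(1.0, eta))
```

The printed residuals are limited only by the finite-difference error. The amplitude stays close to 1 for $k\eta \ll 1$ and is bounded by $1/(k\eta) \propto 1/a$ for $k\eta \gg 1$, in line with the behaviour described above.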


29.9 Conformal synchronous gauge 


One gauge that remains in common use in cosmology, but is not used here, is conformal synchronous gauge, 
discussed in the case of Minkowski background space in §27.9. The cosmological synchronous gauge choices 
are the same as for the Minkowski background, equations (27.65) and (27.66): 


\[ \psi = w = \tilde w = w_a = \tilde w_a = \tilde h = \tilde h_a = 0 . \tag{29.62} \]


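The gauge-invariant perturbations (29.16) in synchronous gauge are
\[ \underset{\rm scalar}{\Psi} = \left( \frac{\partial}{\partial\eta} + \frac{\dot a}{a} \right) \dot h , \tag{29.63a} \]
\[ \underset{\rm scalar}{\Phi} = \phi - \frac{\dot a}{a} \dot h , \tag{29.63b} \]
\[ \underset{\rm vector}{W_a} = - \dot h_a , \tag{29.63c} \]
\[ \underset{\rm tensor}{h_{ab}} . \tag{29.63d} \]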
Like synchronous gauge, §27.9, conformal synchronous gauge chooses a coordinate system and tetrad that
is attached to the locally inertial frames of freely falling observers. Thus synchronous gauge follows the
frame of cold collisionless matter (“dust”). To the extent that non-baryonic cold dark matter has always been
cold and collisionless (not quite true, but an excellent approximation), the synchronous frame is the frame of
non-baryonic cold dark matter.

Conformal synchronous gauge fails at non-linear scales where collisionless matter has turned around and 
collapsed into galaxies and galaxy clusters. This contrasts with conformal Newtonian gauge, which holds as 
long as gravitational perturbations remain weak, including in highly non-linear collapsed systems such as 
galaxies and solar systems. 


Concept question 29.5. What frame does the CMB define? Answer. The CMB frame is the frame 
where the CMB temperature is constant, and CMB photons have zero bulk velocity. That statement de- 
pends on the scale (in Fourier space, the wavenumber k) over which the CMB temperature or velocity is 
averaged. For adiabatic fluctuations at superhorizon scales, all particle species start with essentially the same 
overdensity and velocity. The initial frame that comoves with particles is, by construction, the synchronous 
frame. Once a scale comes inside the horizon, different components that are not kept coupled by collisions 
(non-baryonic dark matter, photons, neutrinos) evolve differently, as illustrated for example by Figure 33.1. 
At scales well inside the horizon, the bulk velocity of free-streaming relativistic particles in conformal New- 
tonian gauge tends to zero in oscillatory fashion, again as illustrated by Figure 33.1. Thus at subhorizon 
scales conformal Newtonian gauge provides a good approximation to the frame of CMB photons. 


Concept question 29.6. Are congruences of comoving observers in cosmology hypersurface-orthogonal?
Comoving observers are defined to be those at rest in the tetrad frame, $u^m = \{1, 0, 0, 0\}$.
The worldlines of comoving observers define a timelike congruence. Are congruences of comoving observers
hypersurface-orthogonal, §18.6? Answer. Common cosmological gauges, including conformal Newtonian or
conformal synchronous, impose the ADM gauge condition that the time axis $\boldsymbol{\gamma}_0$ is orthogonal to hypersurfaces
of constant time t, §29.7. This ADM condition (coupled with the general relativistic assumption of vanishing 
torsion) implies that vorticity vanishes, which is one of the two conditions for a timelike congruence to be 
hypersurface-orthogonal, §18.6. The other condition for a timelike congruence to be hypersurface-orthogonal 
is that the congruence be geodesic; this is true in the specific case of conformal synchronous gauge, but not 
for other gauges, such as conformal Newtonian gauge. 
Cosmological perturbations: a simplest set of assumptions


The purpose of this Chapter is to set forward the simplest approximate model of the development of pertur- 
bations to matter and radiation in our Universe. 


The model consists of two non-interacting perfect fluids, non-baryonic cold dark matter with a pressureless
equation of state $p/\rho = 0$, and radiation with a relativistic equation of state $p/\rho = 1/3$. The model neglects
baryons, since their energy density is sub-dominant, being $\Omega_b/\Omega_c \approx 1/5$ of the dark matter density. The
model lumps neutrinos with photons, neutrinos being relativistic with energy density about two thirds that
of photons, $\rho_\nu/\rho_\gamma = 3 \cdot \frac{7}{8} \left( \frac{4}{11} \right)^{4/3} = 0.68$. It would be wrong to lump baryonic perturbations with those of
non-baryonic dark matter, since prior to recombination electron-photon scattering keeps the baryonic fluid
tightly coupled to photons, preventing the baryons from clustering gravitationally like the non-baryonic cold
dark matter. In the simple approximation, recombination occurs abruptly at a redshift $1 + z_{\rm rec} \approx 1100$. After
recombination, baryons can cluster gravitationally, forming galaxies, stars, and eventually people.


Well after recombination, a third energy component, dark energy, becomes important. It too can be treated
as a perfect fluid, with equation of state $p/\rho = -1$.

The perfect fluid approximation keeps only the lowest momentum moments of the particle distributions,
the energy density and the bulk velocity, along with an isotropic pressure $p$ that is a given function of density $\rho$
in the rest frame of the fluid. The evolution of a perfect fluid is determined entirely by the energy-momentum
conservation equations that the fluid satisfies.

The model includes only scalar modes. The quadrupole pressure vanishes for perfect fluids, so the two
scalar potentials are equal, $\Psi = \Phi$, equation (29.49d). However, the two scalar potentials will often be kept
separate in this Chapter, to facilitate later reference. Tensor modes (gravitational waves) are neglected, since
their energy-momentum is sub-dominant. Tensor modes leave a distinctive imprint on the polarization of the
CMB, which is addressed in Chapter 36.
30.1 Perturbed FLRW line-element


The perturbed FLRW line-element in conformal Newtonian gauge, equation (29.39), including only scalar
perturbations, is
\[ ds^2 = a^2 \left[ - (1 + 2\Psi) d\eta^2 + \delta_{ab} (1 - 2\Phi) dx^a dx^b \right] , \tag{30.1} \]
where $a(\eta)$ is the cosmic scale factor, a function only of conformal time $\eta$.


30.2 Energy-momenta of perfect fluids 


In the simplest approximation, each component of the cosmological energy-momentum, including matter,
radiation, and dark energy, can be treated as a perfect fluid, that is, a fluid whose pressure is isotropic in
the rest frame of the fluid. The tetrad-frame energy-momentum tensor of a perfect fluid with proper density
$\rho$ and isotropic pressure $p$ in its own rest frame, moving with bulk 4-velocity $u^m = dx^m/d\tau$ relative to the
conformal Newtonian tetrad frame, is
\[ T^{mn} = (\rho + p) u^m u^n + p\, \eta^{mn} . \tag{30.2} \]


It is a good approximation to assume further that the equation of state of the fluid is such that its proper
pressure $p$ is some prescribed function of its proper density $\rho$ (such a fluid is called barotropic),
\[ p = p(\rho) . \tag{30.3} \]
Define $w$ to be the derivative
\[ w \equiv \frac{dp}{d\rho} , \tag{30.4} \]
which proves to be (at least for $w > 0$) the square of the sound speed of the fluid in units of the speed of
light. In the simple model considered in this Chapter, each of the fluids considered, matter, radiation, and
dark energy, has constant $w$, with $w = 0$, $\frac{1}{3}$, and $-1$ respectively. Chapter 32 considers the more realistic
situation of a photon-baryon fluid with non-constant $w$.


Each fluid moves with non-relativistic bulk velocity, including radiation, which is almost isotropic, and
therefore has a small bulk velocity even though individual particles of radiation move at the speed of light.
The bulk tetrad-frame 4-velocity $u^m$ of the fluid is thus, to linear order,
\[ u^m = \{ 1 , v_a \} , \tag{30.5} \]
where $v_a$ is its non-relativistic spatial bulk 3-velocity (the spatial tetrad metric is Euclidean, so $v^a = v_a$).
The bulk velocity $v_a$ is to be considered as of linear order, so its square vanishes to linear order.

The proper fluid density $\rho$ can be written as a sum of an unperturbed density $\bar\rho$ and a linear order
fluctuation $\delta\rho$,
\[ \rho = \bar\rho + \delta\rho . \tag{30.6} \]
It proves advantageous, because it simplifies the resulting perturbation equations (30.13), to characterize the
density fluctuation $\delta\rho$ in terms of a fluctuation $\delta$ defined by
\[ \delta\rho = (\bar\rho + \bar p) \delta , \tag{30.7} \]
where $\bar p = p(\bar\rho)$ is the unperturbed pressure. As you will discover in Exercise 30.1, the fluctuation $\delta$ can be
interpreted physically as the entropy fluctuation,
\[ \delta = \frac{\delta s}{s} . \tag{30.8} \]
For matter, where $\bar p = 0$, the entropy fluctuation coincides with the density fluctuation, $\delta = \delta\rho/\bar\rho$. For
dark energy, where $\bar p = -\bar\rho$, the density fluctuation is necessarily zero, $\delta\rho = 0$, reflecting the fact that
vacuum energy cannot cluster. To linear order in the bulk velocity $v_a$, the tetrad-frame energy-momentum
tensor (30.2) of the perfect fluid is then
\[ T^{00} = \rho = \bar\rho + \delta\rho , \tag{30.9a} \]
\[ T^{0a} = f_a = (\bar\rho + \bar p) v_a , \tag{30.9b} \]
\[ T^{ab} = p\, \delta_{ab} = (\bar p + \delta p) \delta_{ab} , \tag{30.9c} \]
where the pressure fluctuation $\delta p$ is, from equation (30.4),
\[ \delta p = w\, \delta\rho . \tag{30.10} \]
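
As a concrete illustration of the linearization, the following minimal sketch (an illustration, not part of the original text) builds the tetrad-frame tensor $T^{mn} = (\rho + p) u^m u^n + p\, \eta^{mn}$ for $u^m = \{1, v_a\}$ and shows that its components agree with $T^{00} = \rho$, $T^{0a} = (\rho + p) v_a$, and $T^{ab} = p\, \delta_{ab}$ up to terms of second order in the bulk velocity. The function name and the sample numbers are assumptions chosen for the example.

```python
def perfect_fluid_tensor(rho, p, v):
    """Return T^{mn} = (rho + p) u^m u^n + p eta^{mn} as a 4x4 nested list.

    u^m = (1, v_a): the Lorentz factor is set to 1, which is exact to linear
    order in the bulk velocity. eta^{mn} is the Minkowski metric, signature -+++.
    """
    eta = [[0.0] * 4 for _ in range(4)]
    eta[0][0] = -1.0
    for a in range(1, 4):
        eta[a][a] = 1.0
    u = [1.0, v[0], v[1], v[2]]
    return [[(rho + p) * u[m] * u[n] + p * eta[m][n] for n in range(4)]
            for m in range(4)]

# Sample values (illustrative): a radiation-like fluid with a small bulk velocity.
rho, p, v = 1.0, 1.0 / 3.0, (1e-3, 0.0, 0.0)
T = perfect_fluid_tensor(rho, p, v)

if __name__ == "__main__":
    print("T00 =", T[0][0])   # energy density rho
    print("T01 =", T[0][1])   # energy flux (rho + p) v_1
    print("T11 =", T[1][1])   # pressure p plus a second-order term (rho + p) v_1^2
```

The only departure from the linearized form is the term $(\rho + p) v_a v_b$ in $T^{ab}$, which is of second order in the velocity and is dropped in equations (30.9).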


If a species does not exchange energy or momentum with other species, then it satisfies the energy-momentum
conservation equations (29.44) in conformal Newtonian gauge. Subtracting appropriate amounts
of the unperturbed energy conservation equation (29.45) from the perturbed energy-momentum conservation
equations (29.44) yields equations for the entropy fluctuation $\delta$ and bulk velocity $v_a$ of the fluid (recall that
overdot denotes partial differentiation with respect to conformal time $\eta$, equation (29.5), so for example
$\dot\delta = \partial\delta/\partial\eta$),
\[ \dot\delta + \nabla_a v_a = 3 \dot\Phi , \tag{30.11a} \]
\[ \dot v_a + (1 - 3w) \frac{\dot a}{a} v_a + w \nabla_a \delta = - \nabla_a \Psi . \tag{30.11b} \]
Physically, equation (30.11a) represents conservation of entropy, while equation (30.11b) represents
conservation of momentum.

Now decompose the bulk 3-velocity $v_a$ into its scalar $v$ and vector $v_{\perp a}$ parts. Up to this point, the scalar
part of a vector has been taken to be the gradient of a potential. But here it is advantageous to absorb a
factor of $k$ into the definition of the scalar part $v$ of the velocity, so that instead of $v_a = - i k_a v + v_{\perp a}$ in
Fourier space, the velocity is given in Fourier space by
\[ v_a = - i \hat k_a v + v_{\perp a} . \tag{30.12} \]


The advantage of this choice is that $v$ is dimensionless, as are $\delta$ and $\Psi$ and $\Phi$. Note that the comoving
wavenumber $k$ (a constant for any given mode) has units of $\eta^{-1}$. The scalar parts of the perturbation
equations (30.11) are then
\[ \dot\delta - k v = 3 \dot\Phi , \tag{30.13a} \]
\[ \dot v + (1 - 3w) \frac{\dot a}{a} v + w k \delta = - k \Psi . \tag{30.13b} \]


The vector part of equations (30.11) is considered in Exercise 30.3. 
Combining the two equations (30.13) for the scalar fluctuation $\delta$ and scalar bulk velocity $v$ yields a second-order
differential equation for $\delta - 3\Phi$,
\[ \left[ \frac{d^2}{d\eta^2} + (1 - 3w) \frac{\dot a}{a} \frac{d}{d\eta} + w k^2 \right] ( \delta - 3\Phi ) = - k^2 ( \Psi + 3 w \Phi ) . \tag{30.14} \]
Equation (30.14) holds for any perfect fluid that conserves energy-momentum and that has equation of
state (30.4), with $w$ not necessarily constant. For positive $w$, equation (30.14) is a wave equation for a
damped, forced oscillator with sound speed $\sqrt{w}$. The resulting generic behaviour for the particular cases of
matter ($w = 0$) and radiation ($w = \frac{1}{3}$) is considered in §30.5 and §30.6 below.
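
The oscillator behaviour can be seen directly by integrating the pair of first-order equations numerically. The sketch below is an illustration under stated assumptions, not the author's code: it takes the scalar equations in the form $\dot\delta = k v$, $\dot v = -(1 - 3w)(\dot a/a) v - w k \delta$ with the potentials switched off ($\Psi = \Phi = 0$) and constant $w$, and advances them with a hand-written fourth-order Runge-Kutta step. For radiation, $w = 1/3$, the damping term vanishes and the result should follow $\cos(\sqrt{w}\, k\, \Delta\eta)$. All function and variable names are hypothetical.

```python
import math

def derivatives(eta, state, k, w, conformal_hubble):
    """Right-hand sides of the potential-free fluid equations for (delta, v)."""
    delta, v = state
    d_delta = k * v
    d_v = -(1.0 - 3.0 * w) * conformal_hubble(eta) * v - w * k * delta
    return (d_delta, d_v)

def integrate(state, eta_start, eta_end, steps, k, w, conformal_hubble):
    """Classical fourth-order Runge-Kutta integration in conformal time."""
    h = (eta_end - eta_start) / steps
    eta = eta_start
    for _ in range(steps):
        k1 = derivatives(eta, state, k, w, conformal_hubble)
        s2 = tuple(s + 0.5 * h * d for s, d in zip(state, k1))
        k2 = derivatives(eta + 0.5 * h, s2, k, w, conformal_hubble)
        s3 = tuple(s + 0.5 * h * d for s, d in zip(state, k2))
        k3 = derivatives(eta + 0.5 * h, s3, k, w, conformal_hubble)
        s4 = tuple(s + h * d for s, d in zip(state, k3))
        k4 = derivatives(eta + h, s4, k, w, conformal_hubble)
        state = tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        eta += h
    return state

# Radiation (w = 1/3) in a radiation-dominated background, where adot/a = 1/eta.
k, w = 2.0, 1.0 / 3.0
delta, v = integrate((1.0, 0.0), 1.0, 11.0, 2000, k, w, lambda eta: 1.0 / eta)

if __name__ == "__main__":
    print(delta, math.cos(math.sqrt(w) * k * 10.0))
```

Setting `w = 0.0` in the same routine reproduces the pressureless case, in which the velocity simply redshifts as $1/a$ and there is no restoring force.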

A more careful treatment, deferred to Chapter 33, accounts for the complete momentum distribution of
radiation by expanding the temperature perturbation $\Theta \equiv \delta T/T$ in multipole moments, equation (33.47).
The radiation fluctuation $\delta_r$ and scalar bulk velocity $v_r$ are related to the first two multipole moments of the
temperature perturbation, the monopole $\Theta_0$ and the dipole $\Theta_1$, by
\[ \delta_r = 3 \Theta_0 , \tag{30.15a} \]
\[ v_r = 3 \Theta_1 . \tag{30.15b} \]
The factor of 3 arises because the unperturbed radiation distribution is in thermodynamic equilibrium, for
which the entropy density is $s \propto T^3$, so $\delta_r = 3\, \delta T/T$.


Exercise 30.1. Entropy perturbation. The purpose of this exercise is to discover that the fluctuation
$\delta$ defined by equation (30.7) can be interpreted as the entropy fluctuation. According to the first law of
thermodynamics, the entropy density $s$ of a fluid of energy density $\rho$, pressure $p$, and temperature $T$ in a
volume $V$ satisfies
\[ d(\rho V) + p\, dV = T\, d(sV) . \tag{30.16} \]
If the fluid is ideal, so that $\rho$, $p$, $T$, and $s$ are independent of volume $V$, then integrating the first law (30.16)
implies that
\[ \rho V + p V = T s V . \tag{30.17} \]
This implies that the entropy density $s$ is related to the other variables by
\[ s = \frac{\rho + p}{T} . \tag{30.18} \]
Show that, for a perfect, barotropic fluid (one in which pressure is a prescribed function $p(\rho)$ of density $\rho$),
small variations of the density and entropy are related by
\[ \frac{\delta\rho}{\rho + p} = \frac{\delta s}{s} , \tag{30.19} \]
confirming equation (30.8). [Hint: Do not confuse what is being asked here with adiabatic expansion. The
result (30.19) is a property of the fluid, independent of whether the fluid is changing adiabatically. For
adiabatic expansion, the fluid satisfies the additional condition $sV = \mbox{constant}$.]
Solution. Use equation (30.18) to eliminate the temperature $T$ from the first law (30.16), obtaining
\[ \frac{d\rho}{\rho + p} = \frac{ds}{s} . \tag{30.20} \]
In the situation being considered, where pressure is a prescribed function $p(\rho)$ of density, equation (30.20)
implies equation (30.19).


Concept question 30.2. Entropy perturbation when number is conserved. The derivation of the
entropy perturbation (30.19) in Exercise 30.1 was based on the first law of thermodynamics (30.16) without
any term $\mu\, dN$ representing number conservation. Should not such a term be included? Answer. This
question was addressed in Exercise 10.14. Each chemical potential $\mu$ is associated with a conserved species.
Terms associated with number conservation can be dropped provided that the fluid contains all particles
belonging to a conserved species. For example, electrons and positrons can annihilate with each other, so
the numbers $N_e$ and $N_{\bar e}$ of electrons $e$ and positrons $\bar e$ in a comoving volume are not conserved, but their
difference $N_e - N_{\bar e}$ is conserved. Electrons and positrons in thermodynamic equilibrium satisfy $\mu_{\bar e} = -\mu_e$, so the
terms representing number conservation in the combined electron-positron fluid vanish,
\[ \mu_e\, dN_e + \mu_{\bar e}\, dN_{\bar e} = \mu_e\, d(N_e - N_{\bar e}) = 0 . \tag{30.21} \]
Thus the entropy perturbation equation (30.19) does not hold individually for electrons and positrons, but
it does hold for the combined electron-positron fluid.


Exercise 30.3. Vector fluctuation. What is the vector part of the perturbation equations (30.11)? Solve
it.
Solution. The vector part of equations (30.11) is
\[ \dot v_{\perp a} + (1 - 3w) \frac{\dot a}{a} v_{\perp a} = 0 . \tag{30.22} \]
If $w$ is constant, the solution is
\[ v_{\perp a} \propto a^{-(1-3w)} . \tag{30.23} \]
Together with $\bar\rho \propto a^{-3(1+w)}$, equation (30.23) implies
\[ f_{\perp a} = (\bar\rho + \bar p) v_{\perp a} \propto a^{-4} , \tag{30.24} \]
which agrees with the vector component of the momentum conservation equation (29.44b) for any combination
of perfect fluids (which have vanishing vector component $p_{ab}$ of pressure).


30.3 Entropy conservation at superhorizon scales 


At superhorizon scales, where $k\eta \ll 1$, the bulk velocity term in equation (30.13a) for the entropy fluctuation
$\delta$ is negligible, and the equation reduces to
\[ \dot\delta = 3 \dot\Phi . \tag{30.25} \]
It is conventional to define a quantity $\zeta$ by
\[ \zeta \equiv \tfrac{1}{3} \delta - \Phi , \tag{30.26} \]
which has the property that it is constant at large scales in any fluid component that does not exchange
energy with other components,
\[ \zeta = \mbox{constant} \quad \mbox{if } k\eta \ll 1 . \tag{30.27} \]
Since both $\delta$ and $\Phi$ are gauge-invariant (all quantities in Newtonian gauge being gauge-invariant), so also
is $\zeta$.

Physically, the constancy of $\zeta$ at superhorizon scales is associated with a conservation law that has the
appearance of a law of conservation of entropy. Recall that in a FLRW universe, the Einstein equations
enforce a conservation law (10.33) that looks like the first law of thermodynamics with conserved entropy.
The constancy of $\zeta$ is a generalization of this law to superhorizon perturbations of a FLRW universe. An
observer cannot distinguish a superhorizon perturbation from a strictly FLRW universe (such perturbations
can be measured only by later observers after the superhorizon perturbation has entered their horizon).
Specifically, an observer inside a horizon patch can perform a global transformation (29.18) of the cosmic
scale factor $a$ (and time coordinate $\eta$) so as to set the large-scale $\Phi$ (and $\Psi$) to zero in their patch. Then
equation (30.25) becomes $\dot\delta = 0$, expressing the first law of thermodynamics (10.33) in the FLRW background
of the patch.

The $-3\Phi$ part of the conserved fluctuation $\delta - 3\Phi$ is associated with the transformation between comoving
and proper volumes, and the fact that the proper spatial volume element is $a^3 (1 - 3\Phi)\, d^3x$ (which remains
true when not only scalar but also vector and tensor fluctuations are included).


Exercise 30.4. Relation between entropy and $\zeta$. Assume that the proper pressure $p(\rho)$ is a definite
function of proper density $\rho$. Define entropy $s$ per unit volume by (see Exercise 30.1)
\[ \ln s \equiv \int \frac{d\rho}{\rho + p} . \tag{30.28} \]
Confirm that, if the bulk peculiar velocity can be neglected so that the energy flux is zero, $f_a = 0$, as is true
at superhorizon scales, then the energy conservation equation (29.44) in Newtonian gauge reduces to
\[ d \ln s + 3\, d \ln [ a (1 - \Phi) ] = 0 , \tag{30.29} \]
whose unperturbed part is
\[ d \ln \bar s + 3\, d \ln a = 0 . \tag{30.30} \]
Conclude that energy conservation implies the conservation of $\zeta$ defined by
\[ \zeta \equiv \tfrac{1}{3} \ln ( s / \bar s ) - \Phi . \tag{30.31} \]


Concept question 30.5. If the Friedmann equations enforce conservation of entropy, where 
does the entropy of the Universe come from? Friedmann’s equations enforce conservation of entropy, 
equation (10.33). The constancy of $\zeta$ is a generalization of this law to evolution at superhorizon scales,
Exercise 30.4. But the entropy of the vacuum as a mode exits the horizon is tiny, and the entropy of the
matter-radiation fluid when a mode re-enters the horizon is large, yet no entropy has been created because $\zeta$
is constant. How can these viewpoints be reconciled? Answer. The first law (10.33) can be construed as an 
equation representing conservation of entropy only if the system is evolving through states of thermodynamic 
equilibrium. The expanding Universe is not a system in thermodynamic equilibrium, even when its geometry 
is precisely FLRW. For systems not in thermodynamic equilibrium, the first law of thermodynamics (10.33) 
enforced by the Friedmann equations simply represents conservation of energy in a general relativistic context. 
The proof in Exercise 30.4 that $\zeta$ represents a fluctuation in entropy depended on the proposition that the
proper pressure $p(\rho)$ is a definite function of proper density $\rho$. But in a system that is not in thermodynamic
equilibrium and that evolves irreversibly from one state to another, the pressure is not a definite function
of density. Reheating, the transition between vacuum and particle energy that marks the end of inflation, 
represents an irreversible (explosive!) increase in entropy. If the expansion of the Universe were reversed, 
the collapsing Universe would not revert from particle energy to vacuum energy, since that would require 
a reduction of entropy, in violation of the second law of thermodynamics. Reheating is analogous to the 
situation of a fluid that passes through a shock front. The shock converts kinetic into heat energy, increasing 
the entropy of the fluid, while conserving its energy. 


30.3.1 Primordial curvature fluctuation 


It was remarked above, §30.3, that an observer inside a horizon patch can perform a global gauge transformation
(29.18) so as to set the large-scale $\Phi$ to zero in their patch. Alternatively, the observer has the gauge
freedom to set the large-scale fluctuation $\delta$ in their patch to zero, in which case $\zeta = -\Phi$. For this reason, the
total conserved fluctuation $\zeta$ is commonly called the primordial curvature fluctuation.

The constancy of the primordial curvature fluctuation $\zeta$ at superhorizon scales makes it useful for characterizing
fluctuations during inflation. At the end of inflation, the “vacuum” energy-momentum of the inflaton
field converts to the energy-momentum of matter and radiation. The details of this event, called reheating,
are not well understood. However, since $\zeta$ is constant, its value when a fluctuation first exits the horizon
during inflation equals its value when the fluctuation reenters the horizon some time later. The constancy of
$\zeta$ makes the details of reheating largely inconsequential to the evolution of perturbations.


30.3.2 Adiabatic and isocurvature initial conditions 


The conserved fluctuation in any particular species $x$ that does not exchange energy with other species is
denoted $\zeta_x$ with a subscript $x$. The conserved fluctuation over all species is denoted $\zeta$ with no subscript. A
generic prediction of inflation is that the conserved fluctuation $\zeta_x$ is the same for all species $x$,
\[ \zeta_x = \zeta . \tag{30.32} \]
Fluctuations in which the fluctuation is the same for all species are said to be adiabatic.

There are also isocurvature fluctuations, in which the entropy fluctuations $\delta_x$ of different species oppose
each other so as to make zero contribution to the curvature potential $\Phi$. Among $N$ species, there are 1
adiabatic and $N - 1$ isocurvature modes subject to the condition that the initial fluctuations are finite.


30.4 Unperturbed background 


The evolution of the cosmic scale factor $a$ as a function of conformal time $\eta$ depends on the energy-momentum
content of the unperturbed background FLRW geometry. Much of this Chapter is concerned with an epoch
starting somewhat after electron-positron annihilation at a redshift $1 + z \sim 10^9$, and ending somewhat after
recombination at $1 + z_{\rm rec} \approx 1100$. During this time the Universe was dominated by matter ($w = 0$) and
radiation ($w = 1/3$), transitioning from radiation- to matter-dominated at a redshift of $1 + z_{\rm eq} \approx 3400$.

In the unperturbed background, the unperturbed dark matter density $\bar\rho_c$ and radiation density $\bar\rho_r$ evolve
with cosmic scale factor as
\[ \bar\rho_c \propto a^{-3} , \quad \bar\rho_r \propto a^{-4} . \tag{30.33} \]
The Hubble parameter $H$ is defined in the usual way to be
\[ H \equiv \frac{1}{a} \frac{da}{dt} = \frac{\dot a}{a^2} , \tag{30.34} \]
in which overdot represents differentiation with respect to conformal time, $\dot a = da/d\eta$. The Friedmann
equations for the background imply that the Hubble parameter for a universe dominated by dark matter
and radiation is
\[ H^2 = \frac{8\pi G}{3} ( \bar\rho_c + \bar\rho_r ) = \frac{H_{\rm eq}^2}{2} \left( \frac{a_{\rm eq}^3}{a^3} + \frac{a_{\rm eq}^4}{a^4} \right) , \tag{30.35} \]
where $a_{\rm eq}$ and $H_{\rm eq}$ are the cosmic scale factor and the Hubble parameter at the time of matter-radiation
equality, $\bar\rho_c = \bar\rho_r$.
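The comoving horizon distance $\eta$ is defined to be the comoving distance that light travels starting from
zero expansion. This is
\[ \eta = \int_0^a \frac{da}{a^2 H} = \frac{2\sqrt{2}}{a_{\rm eq} H_{\rm eq}} \left( \sqrt{1 + \frac{a}{a_{\rm eq}}} - 1 \right) = \frac{2\sqrt{2}}{a_{\rm eq} H_{\rm eq}} \, \frac{a/a_{\rm eq}}{1 + \sqrt{1 + a/a_{\rm eq}}} . \tag{30.36} \]
The horizon distance $\eta_{\rm eq}$ at matter-radiation equality $a = a_{\rm eq}$ is
\[ \eta_{\rm eq} = \frac{2\sqrt{2}}{(1 + \sqrt{2})\, a_{\rm eq} H_{\rm eq}} . \tag{30.37} \]
Equation (30.36) inverts to give the cosmic scale factor $a$ as a function of the horizon distance $\eta$,
\[ \frac{a}{a_{\rm eq}} = ( 2\sqrt{2} - 2 ) \frac{\eta}{\eta_{\rm eq}} + ( 3 - 2\sqrt{2} ) \frac{\eta^2}{\eta_{\rm eq}^2} . \tag{30.38} \]
In the radiation- and matter-dominated epochs respectively, the comoving horizon distance $\eta$ is
\[ \eta = \left\{ \begin{array}{ll} \frac{\sqrt{2}}{a_{\rm eq} H_{\rm eq}} \left( \frac{a}{a_{\rm eq}} \right) = \frac{1 + \sqrt{2}}{2}\, \eta_{\rm eq} \left( \frac{a}{a_{\rm eq}} \right) \propto a & \mbox{radiation-dominated} , \\ \frac{2\sqrt{2}}{a_{\rm eq} H_{\rm eq}} \left( \frac{a}{a_{\rm eq}} \right)^{1/2} = ( 1 + \sqrt{2} )\, \eta_{\rm eq} \left( \frac{a}{a_{\rm eq}} \right)^{1/2} \propto a^{1/2} & \mbox{matter-dominated} . \end{array} \right. \tag{30.39} \]
The ratio of the comoving horizon distance $\eta$ to the comoving Hubble distance $1/(aH)$ is
\[ \eta a H = \frac{2 \sqrt{1 + a/a_{\rm eq}}}{1 + \sqrt{1 + a/a_{\rm eq}}} , \tag{30.40} \]
which is evidently a number of order unity, varying between 1 in the radiation-dominated epoch $a \ll a_{\rm eq}$,
and 2 in the matter-dominated epoch $a \gg a_{\rm eq}$.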
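
The background relations are easy to tabulate. The following minimal sketch (illustrative, with hypothetical function names) codes the horizon distance $\eta(a) = [2\sqrt{2}/(a_{\rm eq} H_{\rm eq})]\,(\sqrt{1 + a/a_{\rm eq}} - 1)$, the quadratic inverse $a(\eta)$ written in terms of $\eta/\eta_{\rm eq}$, and the Hubble parameter $H^2 = (H_{\rm eq}^2/2)(a_{\rm eq}^3/a^3 + a_{\rm eq}^4/a^4)$, in units $a_{\rm eq} = H_{\rm eq} = 1$ by default. The horizon distance is evaluated in the algebraically equivalent form with the square root in the denominator, which avoids loss of precision at small $a$.

```python
import math

SQRT2 = math.sqrt(2.0)

def eta_of_a(a, a_eq=1.0, h_eq=1.0):
    """Comoving horizon distance for a matter plus radiation background."""
    x = a / a_eq
    return 2.0 * SQRT2 / (a_eq * h_eq) * x / (1.0 + math.sqrt(1.0 + x))

def a_of_eta(eta, a_eq=1.0, h_eq=1.0):
    """Scale factor as a function of horizon distance (inverse of eta_of_a)."""
    r = eta / eta_of_a(a_eq, a_eq, h_eq)          # eta / eta_eq
    return a_eq * (2.0 * (SQRT2 - 1.0) * r + (3.0 - 2.0 * SQRT2) * r * r)

def hubble(a, a_eq=1.0, h_eq=1.0):
    """Hubble parameter H(a), normalized so that H(a_eq) = h_eq."""
    y = a_eq / a
    return h_eq * math.sqrt(0.5 * (y**3 + y**4))

def eta_a_hubble(a, a_eq=1.0, h_eq=1.0):
    """Ratio of the comoving horizon distance to the comoving Hubble distance."""
    return eta_of_a(a, a_eq, h_eq) * a * hubble(a, a_eq, h_eq)

if __name__ == "__main__":
    for a in (1e-4, 1e-2, 1.0, 1e2, 1e4):
        print(a, eta_of_a(a), a_of_eta(eta_of_a(a)), eta_a_hubble(a))
```

The last column runs from 1 deep in the radiation-dominated epoch to 2 deep in the matter-dominated epoch, passing through $2\sqrt{2}/(1 + \sqrt{2}) \approx 1.17$ at equality.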


Concept question 30.6. What is meant by the horizon in cosmology? See §10.21. 


Exercise 30.7. Redshift of matter-radiation equality.
1. Argue that the redshift $z_{\rm eq}$ of matter-radiation equality is given by
\[ 1 + z_{\rm eq} = \frac{a_0}{a_{\rm eq}} = \; ? \; \Omega_m h^2 , \tag{30.41} \]
where $\Omega_m$ is the matter density today relative to critical. What is the factor, and what is its numerical
value? The factor depends on the energy-weighted effective number of relativistic species $g_\rho$,
equation (10.152b). Should this $g_\rho$ be that now, or that at matter-radiation equality?

2. Show that the ratio $H_{\rm eq}/H_0$ of the Hubble parameter at matter-radiation equality to that today is
\[ \frac{H_{\rm eq}}{H_0} = \sqrt{2 \Omega_m}\, (1 + z_{\rm eq})^{3/2} . \tag{30.42} \]
Solution. The redshift $z_{\rm eq}$ of matter-radiation equality is given by
\[ 1 + z_{\rm eq} = \frac{\Omega_m}{\Omega_r} = \frac{45 c^5 \hbar^3 \Omega_m H_0^2}{4 \pi^3 G g_\rho (k T_0)^4} = 8.093 \times 10^4 \, \frac{\Omega_m h^2}{g_\rho} = 3400 \left( \frac{g_\rho}{3.36} \right)^{-1} \left( \frac{\Omega_m h^2}{0.143} \right) , \tag{30.43} \]
where $T_0 = 2.725\,{\rm K}$ is the present-day CMB temperature, and $g_\rho = 2 + 6 \cdot \frac{7}{8} \left( \frac{4}{11} \right)^{4/3} = 3.36$ is the energy-weighted
effective number of relativistic species at matter-radiation equality, equation (10.152b). The value
$\Omega_m h^2 = 0.143 \pm 0.001$ is from Aghanim et al. (2018).
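
The numerical factor can be reproduced from physical constants. The sketch below is an illustration rather than part of the original solution: it evaluates $45 c^5 \hbar^3 H_{100}^2 / [4 \pi^3 G (k T_0)^4]$ with $H_{100} = 100\,{\rm km\,s^{-1}\,Mpc^{-1}}$, and $g_\rho = 2 + 6 \cdot \frac{7}{8} (4/11)^{4/3}$. The constants are SI values typed in by hand (CODATA 2018 for $\hbar$, $G$, $k$; the exact defined value of $c$), which is an assumption of this example; the function names are hypothetical.

```python
import math

C = 2.99792458e8              # speed of light, m/s (exact)
HBAR = 1.054571817e-34        # reduced Planck constant, J s
G_NEWTON = 6.67430e-11        # Newton's constant, m^3 kg^-1 s^-2
K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
MPC = 3.0856775814913673e22   # metres per megaparsec
H100 = 100.0e3 / MPC          # 100 km/s/Mpc expressed in 1/s

def g_rho():
    """Energy-weighted number of relativistic species: photons plus 3 neutrino types."""
    return 2.0 + 6.0 * (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0)

def zeq_factor(t0=2.725):
    """Coefficient F in 1 + z_eq = F * Omega_m h^2 / g_rho, for CMB temperature t0 in K."""
    numerator = 45.0 * C**5 * HBAR**3 * H100**2
    denominator = 4.0 * math.pi**3 * G_NEWTON * (K_BOLTZMANN * t0) ** 4
    return numerator / denominator

def one_plus_zeq(omega_m_h2=0.143, t0=2.725):
    """Redshift factor of matter-radiation equality."""
    return zeq_factor(t0) * omega_m_h2 / g_rho()

if __name__ == "__main__":
    print("g_rho      =", g_rho())
    print("factor     =", zeq_factor())
    print("1 + z_eq   =", one_plus_zeq())
```

Because the factor scales as $T_0^{-4}$, a change of 0.1% in the adopted CMB temperature shifts $1 + z_{\rm eq}$ by about 0.4%.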


30.5 Generic behaviour of non-baryonic cold dark matter 


Non-baryonic cold dark matter is pressureless, $w = 0$, and it conserves energy-momentum because it does
not scatter off radiation or baryons. Equation (30.14), which expresses energy-momentum conservation of a
fluid, reduces for $w = 0$ to
\[ \left( \frac{d^2}{d\eta^2} + \frac{\dot a}{a} \frac{d}{d\eta} \right) ( \delta_c - 3\Phi ) = - k^2 \Psi . \tag{30.44} \]
If $\Psi = \Phi$, then the source on the right hand side is $-k^2 \Phi$.

In the absence of a driving potential, $\Phi = 0$, the dark matter velocity would redshift as $v_c \propto 1/a$,
equation (30.13b), and the dark matter density would then evolve as $\dot\delta_c = k v_c \propto a^{-1}$, equation (30.13a).
In the radiation-dominated epoch, where $\eta \propto a$, this leads to a logarithmic growth in the overdensity $\delta_c$,
even though there is no driving potential, and the velocity is redshifting to a halt. In the matter-dominated
epoch, where $\eta \propto a^{1/2}$, the dark matter overdensity $\delta_c$ would freeze out at a constant value, in the absence
of a driving potential.

More generally, equation (30.44) is a linear differential equation for $\delta_c - 3\Phi$ driven by a potential $\Psi$. You
will find the solution to this equation for a prescribed potential $\Psi$ in Exercise 30.8.


Exercise 30.8. Generic behaviour of dark matter. Find the homogeneous solutions of equation (30.44)
for $\delta_c - 3\Phi$ with horizon distance $\eta$ related to cosmic scale factor $a$ by equation (30.36). Hence find the retarded
Green’s function of the equation. Write down the general solution of equation (30.44) as an integral over the
Green’s function. Solve for the case of constant potential $\Psi$.

Solution. The general solution of equation (30.44) is, in units $a_{\rm eq} = 1$,
\[ \delta_c(a) - 3\Phi(a) = A_0 + A_1 \ln x + \frac{2 k^2}{H_{\rm eq}^2} \int_0^a \Psi(a') \ln\left( \frac{x'}{x} \right) \frac{a'\, da'}{\sqrt{1 + a'}} , \tag{30.45} \]
where $A_0$ and $A_1$ are constants, $x' \equiv x(a')$, and
\[ x \equiv \frac{\sqrt{1+a} - 1}{\sqrt{1+a} + 1} = \frac{a}{(1 + \sqrt{1+a})^2} = \frac{\eta}{\eta + 4\sqrt{2}/H_{\rm eq}} , \tag{30.46} \]
which simplifies to $x \rightarrow a/4$ as $a \rightarrow 0$ and $x \rightarrow 1 - 2/\sqrt{a}$ as $a \rightarrow \infty$. In the radiation-dominated and
matter-dominated regimes, equation (30.45) reduces to
\[ \delta_c(a) - 3\Phi(a) = \left\{ \begin{array}{ll} A_0 + A_1 \ln(a/4) + \frac{2 k^2}{H_{\rm eq}^2} \int_0^a \Psi(a') \ln\left( \frac{a'}{a} \right) a'\, da' & (a \ll 1) , \\ B_0 - 2 B_1 a^{-1/2} - \frac{4 k^2}{H_{\rm eq}^2} \int_0^a \Psi(a') \left( 1 - \sqrt{\frac{a'}{a}} \right) da' & (a \gg 1) , \end{array} \right. \tag{30.47} \]
where the constants $B_0$ and $B_1$ in the $a \gg 1$ expression will usually differ from $A_0$ and $A_1$ thanks to
contributions to the integral at $a' \lesssim 1$ that are not given correctly by the $a' \gg 1$ approximation.
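
A numerical check of the driven part of the solution is straightforward. The sketch below is an illustration under stated assumptions: it takes the particular solution of the dark matter equation in the form $(2k^2/H_{\rm eq}^2) \int_0^a \Psi(a') \ln(x'/x)\, a'\, da'/\sqrt{1 + a'}$ with $x = a/(1 + \sqrt{1+a})^2$, in units $a_{\rm eq} = 1$, and evaluates it by the midpoint rule. For constant $\Psi$ deep in the radiation-dominated regime the integral should approach $-k^2 \Psi a^2/(2 H_{\rm eq}^2)$, which is the same as $-k^2 \Psi \eta^2/4$. Function names and the number of quadrature points are hypothetical choices.

```python
import math

def x_of_a(a):
    """The variable x = a / (1 + sqrt(1 + a))^2, in units a_eq = 1."""
    return a / (1.0 + math.sqrt(1.0 + a)) ** 2

def driven_part(a, psi, k=1.0, h_eq=1.0, n=20000):
    """Midpoint-rule estimate of the Green's function integral for delta_c - 3 Phi.

    psi is a callable returning the potential Psi at scale factor a'.
    The integrand a' ln(x'/x) is finite as a' -> 0, so no special treatment
    of the lower limit is needed.
    """
    x = x_of_a(a)
    da = a / n
    total = 0.0
    for i in range(n):
        ap = (i + 0.5) * da
        total += psi(ap) * math.log(x_of_a(ap) / x) * ap / math.sqrt(1.0 + ap)
    return 2.0 * k**2 / h_eq**2 * total * da

if __name__ == "__main__":
    a = 1e-3
    numeric = driven_part(a, lambda ap: 1.0)
    print(numeric, -0.5 * a * a)   # radiation-dominated limit for constant Psi = 1
```

The negative sign shows that a constant positive potential drives $\delta_c - 3\Phi$ downward, with magnitude growing as $(k\eta)^2$ while the mode is well outside the horizon.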


30.6 Generic behaviour of radiation 


Before recombination, photons are tightly coupled to baryons through non-relativistic electron-photon (Thomson)
scattering. The photon-baryon fluid thus behaves as a single energy-momentum conserving fluid. In the
simple limit of negligible baryon density, the photon-baryon fluid can be treated as a relativistic fluid with
$w = 1/3$. Equation (30.14) then reduces to
\[ \left( \frac{d^2}{d\eta^2} + \frac{k^2}{3} \right) ( \Theta_0 - \Phi ) = - \frac{k^2}{3} ( \Psi + \Phi ) . \tag{30.48} \]
If $\Psi = \Phi$, then the source on the right hand side is just $-\frac{2}{3} k^2 \Phi$.
In the absence of a driving potential, $\Psi + \Phi = 0$, the radiation oscillates as $\Theta_0 \propto e^{\pm i \omega \eta}$ with frequency
$\omega = k/\sqrt{3}$. In other words, the solutions are sound waves, moving at the sound speed
\[ c_s = \frac{\omega}{k} = \sqrt{\frac{1}{3}} . \tag{30.49} \]
Define the sound horizon distance $\eta_s$ by
\[ \eta_s \equiv c_s \eta = \frac{\eta}{\sqrt{3}} . \tag{30.50} \]
In terms of the sound horizon distance $\eta_s$, the differential equation (30.48) becomes
\[ \left( \frac{d^2}{d\eta_s^2} + k^2 \right) ( \Theta_0 - \Phi ) = - k^2 ( \Psi + \Phi ) . \tag{30.51} \]
Equation (30.51) is a linear differential equation for $\Theta_0 - \Phi$ driven by a potential $\Psi + \Phi$. You will find the
solution to this equation for a prescribed potential $\Psi + \Phi$ in Exercise 30.9.
Exercise 30.9. Generic behaviour of radiation. Find the homogeneous solutions of equation (30.51).
Hence find the retarded Green’s function of the equation. Write down the general solution of equation (30.51)
as an integral over the Green’s function. Convince yourself that $\Theta_0 - \Phi$ oscillates about $-(\Psi + \Phi)$.
Solution. The general solution of equation (30.51) is, with $y \equiv k \eta_s$,
\[ \Theta_0(y) - \Phi(y) = A_0 \cos y + A_1 \sin y - \int_0^y \left[ \Psi(y') + \Phi(y') \right] \sin(y - y')\, dy' , \tag{30.52} \]
where $A_0$ and $A_1$ are constants.
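
The statement that $\Theta_0 - \Phi$ oscillates about $-(\Psi + \Phi)$ can be made concrete for a constant potential. The following minimal sketch (illustrative; the function name and quadrature resolution are assumptions) evaluates the Green's function solution $A_0 \cos y + A_1 \sin y - \int_0^y [\Psi(y') + \Phi(y')] \sin(y - y')\, dy'$ by the midpoint rule. For a constant source $S = \Psi + \Phi$ the integral equals $S (1 - \cos y)$, so the solution is $(A_0 + S) \cos y + A_1 \sin y - S$, an oscillation centred on $-S$.

```python
import math

def theta0_minus_phi(y, source, a0=0.0, a1=0.0, n=4000):
    """Green's function solution of u'' + u = -source(y), with y = k * eta_s.

    source is a callable returning Psi + Phi at the given y; a0 and a1 are
    the amplitudes of the homogeneous cos and sin solutions.
    """
    dy = y / n
    integral = 0.0
    for i in range(n):
        yp = (i + 0.5) * dy
        integral += source(yp) * math.sin(y - yp)
    return a0 * math.cos(y) + a1 * math.sin(y) - integral * dy

if __name__ == "__main__":
    s = 0.4                                   # constant Psi + Phi (illustrative value)
    for y in (0.0, 1.0, 2.0, 4.0, 7.3):
        numeric = theta0_minus_phi(y, lambda yp: s, a0=1.0) if y > 0.0 else 1.0
        exact = (1.0 + s) * math.cos(y) - s   # oscillates about -s
        print(y, numeric, exact)
```

Replacing the constant source with a decaying one, as happens for potentials inside the horizon during radiation domination, shifts the centre of the oscillation back toward zero.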


Concept question 30.10. Can neutrinos be treated as a fluid? Since neutrinos stream collisionlessly, 
how can it be legitimate to treat neutrinos as a fluid? Answer. The complete momentum distribution of 
neutrinos is characterized by a full set of multipole moments, which can be solved using the hierarchy (33.91) 
of Boltzmann equations. A fluid approximation amounts to keeping the first three momentum moments, the 
energy, bulk velocity, and pressure, in the multipole expansion of the momentum distribution. The Einstein 
equations depend only on these moments. If an adequate approximation to the pressure can be made, then 
the Boltzmann hierarchy can be truncated. The perfect fluid approximation amounts to approximating 
the pressure as isotropic, and given as a prescribed function of energy. The perfect fluid approximation 
is adequate for photons, which are isotropized by collisions, but is poor for neutrinos. As a result of free 
streaming, neutrinos develop a quadrupole (anisotropic pressure), as well as higher multipoles. You will 
discover in Exercise 32.7 that, in contrast to photons which behave as a fluid with sound speed $\sqrt{1/3}$ times
the speed of light, neutrinos more closely approximate a fluid with sound speed equal to the speed of light. 
Thus the simple approximation of the present Chapter is not really adequate for neutrinos. 


30.7 Equations for the simplest set of assumptions 


The equations for two perfect fluids consisting of matter (w = 0) and radiation (w = 1/3) in a perturbed 
FLRW universe comprise 5 equations as follows. The first two equations express conservation of energy and 
momentum for non-baryonic cold dark matter (subscript c): 


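\[ \dot\delta_c - k v_c = 3 \dot\Phi , \tag{30.53a} \]
\[ \dot v_c + \frac{\dot a}{a} v_c = - k \Psi . \tag{30.53b} \]

The next two equations express conservation of energy and momentum for radiation (subscript r), which
includes both photons and neutrinos:

\[ \dot\Theta_0 - k \Theta_1 = \dot\Phi , \tag{30.54a} \]
\[ \dot\Theta_1 + \frac{k}{3} \Theta_0 = - \frac{k}{3} \Psi . \tag{30.54b} \]

The final equation is the Einstein energy equation (29.49a):

\[ - 3 \frac{\dot a}{a} F - k^2 \Phi = 4\pi G a^2 \left( \bar\rho_c \delta_c + 4 \bar\rho_r \Theta_0 \right) , \tag{30.55} \]

where F, equation (29.38), is

\[ F \equiv \frac{\dot a}{a} \Psi + \dot\Phi . \tag{30.56} \]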


In place of one of the equations (30.53)–(30.55) it is sometimes convenient to use the Einstein momentum
equation (29.49b)

\[ - k F = 4\pi G a^2 \left( \bar\rho_c v_c + 4 \bar\rho_r \Theta_1 \right) , \tag{30.57} \]

which, because the matter and radiation equations (30.53) and (30.54) already satisfy covariant energy-momentum
conservation, is not an independent equation. In the simple approximation of perfect fluids considered
here, the radiation quadrupole vanishes, and then the Einstein quadrupole pressure equation (29.49d)
implies that the scalar potentials Ψ and Φ are equal,

\[ \Psi = \Phi . \tag{30.58} \]


Exercise 30.11. Program the equations for the simplest set of cosmological assumptions. Write
computer code that integrates numerically the evolution equations (30.53)–(30.55). In Exercise 32.2 you will
generalize this code to include more components and more processes, so you should write the code in a
well-structured fashion that allows you to update it easily. It is theoretically and numerically advantageous
to treat δ_c − 3Φ and Θ_0 − Φ as dependent variables, rather than δ_c and Θ_0. I found it convenient to use
ln a as the integration variable, and to work in units a_eq = H_eq = 1. Assume adiabatic initial conditions,
ζ_c = ζ_r (see §30.10), and without loss of generality normalize to unit initial amplitudes, ζ_c = ζ_r = 1. Do the
computation for a selection of wavenumbers k. Plot Θ_0 − Φ and −2Φ together to bring out the fact that the
former oscillates about the latter, as expected from Exercise 30.9. A numerical issue you may encounter is
that your integration routine may get stuck trying to integrate the oscillating radiation monopole and dipole
once the mode is well inside the horizon, kη ≫ 1. One strategy is to stop following the photon moments
after a certain time. Another convenient strategy is to introduce an artificial damping term, by changing the
radiation dipole equation (30.54b) to


\[ \dot\Theta_1 + \frac{k}{3} \left( \Theta_0 + \Psi \right) = - 2 k \kappa \Theta_1 , \tag{30.59} \]


where κ is a dimensionless damping coefficient that becomes large when the fluctuation is well inside the
horizon, kη ≫ 1,


k= cekn , (30.60) 


with e some suitably small number (I chose e = 107). To see why the damping term works as claimed, 
combine the radiation monopole and dipole equations into a second order differential equation, and read
§32.5. The introduction of damping anticipates, but is not an adequate substitute for, the physical processes
of damping addressed in Chapter 32.

Figure 30.1 (Top) dark matter overdensity δ_c − 3Φ, (bottom left) radiation monopole Θ_0 − Φ, and (bottom right)
radiation monopole Θ_0 − Φ with artificial damping, as a function of cosmic scale factor a in units a_0 = 1, in the
simple approximation, for several wavenumbers k. The cosmological model is a flat ΛCDM model with concordance
parameters Ω_Λ = 0.69 and Ω_m = 0.31, and adiabatic initial conditions (see §32.3). The radiation monopole Θ_0 − Φ
(blue) is plotted along with minus twice the gravitational potential, −2Φ (black), to bring out the fact that the former
oscillates about the latter, as expected from equation (30.48). Curves are labelled with the comoving wavenumber
k/(a_eq H_eq) in units of the Hubble distance at matter-radiation equality. For the larger wavenumbers, k/(a_eq H_eq) = 10
and 10², the radiation monopole without damping is truncated (bottom left, dotted lines) to avoid confusing the plot.
The radiation monopole shown here in the simple approximation may be compared to results in the hydrodynamic
approximation, Figure 32.3, and using a Boltzmann computation, Figure 33.3.

Figure 30.2 (Left) Overdensities δ − 3Φ, and (right) bulk velocities v in the simple approximation with artificial
damping, as a function of cosmic scale factor a/a_eq, at wavenumber k/(a_eq H_eq) = 10, for non-baryonic dark matter
(c) and radiation (γ). The radiation overdensity and bulk velocity are related to their monopole and dipole moments by
δ_γ − 3Φ = 3(Θ_0 − Φ) and v_γ = 3Θ_1, equations (30.15). The results may be compared to those from the hydrodynamic
approximation, Figure 32.1, and a Boltzmann computation, Figure 33.1.

Solution. Figure 30.1 shows the dark matter overdensity, radiation monopole, and potential for a flat
ΛCDM model with Ω_Λ = 0.69 and Ω_m = 0.31, consistent with Planck parameters (Aghanim et al., 2018),
and adiabatic initial conditions. The radiation monopole is shown both without (bottom left panel) and with
(bottom right panel) artificial damping. Figure 30.2 illustrates the overdensity and bulk velocity of each of
matter and radiation for the same model at an illustrative wavenumber k/(a_eq H_eq) = 10.
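
A minimal sketch of the kind of code Exercise 30.11 asks for is given below; it is not the code used to make the figures. It integrates equations (30.53)–(30.55) with Ψ = Φ, using ln a as the integration variable and units a_eq = H_eq = 1, with δ_c − 3Φ and Θ_0 − Φ as dependent variables. Several things are assumptions of the sketch: the background contains only matter and radiation (no dark energy, so it applies at a ≪ a_0); the prefactors 4πGa²ρ̄_c = 3/(4a) and 4πGa²ρ̄_r = 3/(4a²) are taken from the Friedmann equation in these units; F is taken to be (ȧ/a)Ψ + Φ̇; a fixed-step fourth-order Runge-Kutta scheme stands in for an adaptive solver; and no artificial damping is included, so the step count must be raised for modes with kη ≫ 1. All function names are hypothetical.

```python
import math

def conformal_hubble(a):
    """(da/deta)/a = a H in units a_eq = H_eq = 1 (matter + radiation only)."""
    return math.sqrt(0.5 * (1.0 + a)) / a

def derivs(lna, state, k):
    """d(state)/d(ln a) for state = (X, v_c, Y, Theta_1, Phi),
    where X = delta_c - 3 Phi and Y = Theta_0 - Phi, with Psi = Phi (30.58)."""
    a = math.exp(lna)
    X, v_c, Y, theta1, phi = state
    aH = conformal_hubble(a)
    delta_c = X + 3.0 * phi
    theta0 = Y + phi
    # Einstein energy equation (30.55) solved for F; the prefactors
    # 4 pi G a^2 rho_c = 3/(4 a) and 4 pi G a^2 rho_r = 3/(4 a^2)
    # are assumed from the Friedmann equation in these units.
    F = -(k * k * phi + 0.75 * delta_c / a + 3.0 * theta0 / a**2) / (3.0 * aH)
    d_eta = (
        k * v_c,                      # (30.53a) rewritten for X
        -aH * v_c - k * phi,          # (30.53b)
        k * theta1,                   # (30.54a) rewritten for Y
        -(k / 3.0) * (theta0 + phi),  # (30.54b)
        F - aH * phi,                 # from F = (a'/a) Psi + Phi'
    )
    return [d / aH for d in d_eta]    # d/dln a = (1/(a H)) d/deta

def evolve(k, a_start=1e-4, a_end=10.0, steps=4000, zeta_c=1.0, zeta_r=1.0):
    """Fixed-step RK4 in ln a, starting from superhorizon initial conditions."""
    state = [3.0 * zeta_c, 0.0, zeta_r, 0.0, -2.0 * zeta_r / 3.0]
    lna = math.log(a_start)
    h = (math.log(a_end) - lna) / steps
    for _ in range(steps):
        k1 = derivs(lna, state, k)
        k2 = derivs(lna + h / 2, [s + h / 2 * d for s, d in zip(state, k1)], k)
        k3 = derivs(lna + h / 2, [s + h / 2 * d for s, d in zip(state, k2)], k)
        k4 = derivs(lna + h, [s + h * d for s, d in zip(state, k3)], k)
        state = [s + h / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4)]
        lna += h
    return state

X, v_c, Y, theta1, phi = evolve(k=1e-3)
print(f"a=10, k=1e-3: delta_c-3Phi={X:.5f}  Theta_0-Phi={Y:.5f}  Phi={phi:.5f}")
```

For a mode that stays far outside the horizon, such as k = 10⁻³ here, δ_c − 3Φ and Θ_0 − Φ should remain close to their initial values, which provides a quick sanity check before moving to larger k.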


30.8 On the numerical computation of cosmological power spectra 


See Seljak and Zaldarriaga (1996) for a discussion of the numerical computation of the CMB power spectrum. 
Modern codes that compute cosmological power spectra from linear perturbation theory, such as CAMB 
(google it), are impressively fast. With default settings, CAMB takes a few cpu seconds to compute a 
complete CMB power spectrum. CAMB is written in parallelized fortran 90. 

To accomplish its task, CAMB:

1. reads cosmological parameters from an input file edited by the user;

2. calls RecFast (or other code) to compute recombination (Chapter 31);

3. uses a Boltzmann code (Chapter 33) to calculate the evolution of non-baryonic cold dark matter, baryonic
matter, photons, and neutrinos at each of ∼ 200 wavenumbers k;

4. pre-calculates tables of spherical Bessel functions j_ℓ;

5. computes CMB transfer functions T_ℓ(η_0, k), equation (34.20), by integrating source functions over Bessel
functions at each of ∼ 2000 wavenumbers k and ∼ 100 harmonics ℓ (Chapter 34);

6. computes the CMB power spectrum C_ℓ(η_0) today by integrating the squared transfer functions over an
almost scale-free primordial curvature spectrum, equation (34.35).

That is a lot of computation. The two most time-consuming steps are step 3, the Boltzmann computation, 
and step 5, the computation of CMB transfer functions. For step 3, CAMB uses the open-source ordinary 
differential equation solver dverk (Hull, Enright & Jackson 1976). Step 5 involves integration over highly 
oscillatory integrands. One could contemplate using some clever mathematical approach to integrate the 
highly oscillatory integrands, but CAMB simply uses a brute-force sum, interpolating pre-computed source 
functions in k-space, and splining over pre-computed spherical Bessel functions. 

Most of the calculations of cosmological perturbations and power spectra reported in this book used Mathematica,
a program that I use and value a lot. Sadly, high speed numerical calculations are not Mathematica’s
forte. One elementary issue is that Mathematica’s inbuilt spherical Bessel functions j_ℓ are inexplicably slow
for large ℓ, which is unacceptable given that many thousands of Bessel functions must be evaluated (on
my 2015 laptop, a single evaluation of j_ℓ takes approximately (ℓ/20,000)² cpu seconds). Mathematica’s
biggest challenge is integrating the highly oscillatory functions in step 5. Mathematica’s numerical integration
routine NDSolve (or worse, NIntegrate, which treats each integrand in a list separately) evaluates its
integrands far too often to be efficient. If you choose to program in Mathematica, good luck; but be warned
that Mathematica assumes control over many details that basic languages like C and Fortran leave up to you.
Working with Mathematica is like trying to persuade a recalcitrant child to perform what seems to be a
simple task; there is no guarantee who will win the contest of wills.


30.9 Analytic solutions in various regimes 


Much of the remainder of this Chapter is concerned with obtaining approximate analytic solutions that 
describe the evolution of perturbations of the matter and radiation in various regimes. The aim is to gain 
some intuitive understanding of the solutions to the system of equations (30.53)—(30.58). 

Figure 30.3 illustrates key features in the evolution of perturbations. Evolution is punctuated by the transition
from radiation- to matter-dominated at 1 + z_eq ≈ 3400, and by the transition from opaque to transparent
at recombination, at 1 + z_rec ≈ 1100. Meanwhile the comoving horizon distance η increases monotonically.
Small wavelength perturbations enter the horizon early, during the radiation-dominated regime, while long
wavelength perturbations enter the horizon late, during the matter-dominated regime.

One regime not covered by the analytic approximations is perturbations that enter the horizon near the
epoch of matter-radiation equality. The regime is important because the first few peaks, the most prominent
peaks, in the CMB entered the horizon around or shortly after matter-radiation equality. Covering this
regime satisfactorily requires solving numerically the full set of equations (30.53)–(30.55).

The regimes covered below are:

1. Superhorizon scales, §30.10.

2. Radiation-dominated:
   a. adiabatic initial conditions, §30.11;
   b. isocurvature initial conditions, §30.12.

3. Subhorizon scales, §30.13.

4. Fluctuations that enter the horizon in the matter-dominated epoch, §30.14.

5. Matter-dominated regime, §30.15.

6. Baryons post-recombination, §30.16.

7. Matter with dark energy, §30.17.

8. Matter with dark energy and curvature, §30.18.

Figure 30.3 Various regimes in the evolution of fluctuations. The line increasing diagonally from bottom left to top
right is the comoving horizon distance η. Above this line are superhorizon fluctuations, whose comoving wavelengths
exceed the horizon distance, while below the line are subhorizon fluctuations, whose comoving wavelengths are less
than the horizon distance. The dashed vertical line at cosmic scale factor a_eq ≈ a_0/3400 marks the moment of matter-radiation
equality. Before matter-radiation equality (to the left), the background mass-energy was dominated by
radiation, while after matter-radiation equality (to the right), the background mass-energy was dominated by matter.
Once a fluctuation enters the horizon, the non-baryonic matter fluctuation tends to grow, whereas the radiation
fluctuation tends to decay, so there is an epoch prior to matter-radiation equality where gravitational perturbations
are dominated by matter rather than radiation fluctuations, even though radiation dominates the background energy
density. The dashed vertical line at a_rec ≈ a_0/1100 marks recombination, where the temperature cooled to the point
that baryons changed from being mostly ionized to mostly neutral, and the Universe changed from being opaque to
transparent. The observed CMB comes from the time of recombination.

Figure 30.4 Superhorizon scales.


At sufficiently early times, any mode is outside the horizon, kη < 1. In the superhorizon limit kη ≪ 1, the
evolution equations (30.53)–(30.55) reduce to

\[ \dot\delta_c = 3 \dot\Phi , \tag{30.61a} \]
\[ \dot\Theta_0 = \dot\Phi , \tag{30.61b} \]
\[ - 3 \frac{\dot a}{a} F = 4\pi G a^2 \left( \bar\rho_c \delta_c + 4 \bar\rho_r \Theta_0 \right) , \tag{30.61c} \]

with F defined by equation (30.56). In effect, the dark matter velocity v_c and radiation dipole Θ_1 can be
treated as negligibly small at superhorizon scales,

\[ v_c = \Theta_1 = 0 . \tag{30.62} \]


The first two of equations (30.61) imply that the dark matter overdensity δ_c and radiation monopole Θ_0 are
related to the potential Φ by

\[ \tfrac{1}{3} \delta_c - \Phi = \zeta_c , \tag{30.63a} \]
\[ \Theta_0 - \Phi = \zeta_r , \tag{30.63b} \]

where ζ_c and ζ_r are constants set by initial conditions. Plugging the solutions (30.63) into the Einstein energy
equation (30.61c), and replacing derivatives with respect to horizon distance η with derivatives with respect
to cosmic scale factor a,

\[ \frac{\partial}{\partial\eta} = \frac{\partial a}{\partial\eta} \frac{\partial}{\partial a} = a^2 H \frac{\partial}{\partial a} , \tag{30.64} \]

with the Hubble parameter H from equation (30.35) gives the first order differential equation, in units
a_eq = 1,

\[ 2 a (1 + a) \Phi' + (6 + 5a) \Phi + 4 \zeta_r + 3 \zeta_c a = 0 , \tag{30.65} \]

where prime ′ denotes differentiation with respect to cosmic scale factor, d/da. The solution to equation (30.65)
that is finite at a = 0 is

\[ \Phi = - \tfrac{2}{3} \zeta_r + \left( \tfrac{2}{3} \zeta_r - \tfrac{3}{5} \zeta_c \right) f(a) , \tag{30.66} \]

where f(a) is the function

\[ f(a) \equiv 1 - \frac{2}{a} + \frac{8}{a^2} + \frac{16}{a^3} - \frac{16 \sqrt{1+a}}{a^3} = \frac{a \left( 6 + a + 4 \sqrt{1+a} \right)}{\left( 1 + \sqrt{1+a} \right)^4} , \tag{30.67} \]

in which the rightmost expression is written in a form that is numerically well-behaved for all a. The function
f varies from 0 at a = 0 to 1 at a → ∞. The initial and final values of the potential Φ(a) are

\[ \Phi(0) = - \tfrac{2}{3} \zeta_r , \quad \Phi({\rm late}) = - \tfrac{3}{5} \zeta_c . \tag{30.68} \]

The potential Φ(late) is designated “late” because it holds in the matter-dominated regime well after recombination,
but fails when dark energy (or possibly curvature) becomes important.
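
The remark that the rightmost form of f(a) is numerically well-behaved can be illustrated with a short sketch. The function names below are hypothetical. The code evaluates both forms of equation (30.67), checks by finite differences that the potential (30.66) satisfies the differential equation (30.65), and evaluates the ratio Φ(late)/Φ(0) for adiabatic conditions ζ_c = ζ_r.

```python
import math

def f_naive(a):
    """Leftmost form of (30.67); suffers cancellation among terms ~ 16/a^3 at small a."""
    return 1.0 - 2.0 / a + 8.0 / a**2 + 16.0 / a**3 - 16.0 * math.sqrt(1.0 + a) / a**3

def f_stable(a):
    """Rightmost form of (30.67); no cancellation for any a >= 0."""
    s = math.sqrt(1.0 + a)
    return a * (6.0 + a + 4.0 * s) / (1.0 + s) ** 4

def phi_superhorizon(a, zeta_r=1.0, zeta_c=1.0):
    """Superhorizon potential (30.66)."""
    return -2.0 / 3.0 * zeta_r + (2.0 / 3.0 * zeta_r - 3.0 / 5.0 * zeta_c) * f_stable(a)

def ode_residual(a, zeta_r=1.0, zeta_c=1.0, h=1e-6):
    """Left-hand side of (30.65), with Phi' from a central difference."""
    dphi = (phi_superhorizon(a * (1 + h), zeta_r, zeta_c)
            - phi_superhorizon(a * (1 - h), zeta_r, zeta_c)) / (2.0 * a * h)
    phi = phi_superhorizon(a, zeta_r, zeta_c)
    return 2.0 * a * (1.0 + a) * dphi + (6.0 + 5.0 * a) * phi + 4.0 * zeta_r + 3.0 * zeta_c * a

for a in (1e-6, 1e-2, 1.0, 1e2):
    print(f"a={a:8.1e}  naive={f_naive(a):+.6e}  stable={f_stable(a):+.6e}")
print("residual of (30.65) at a=1:", ode_residual(1.0, zeta_r=0.3, zeta_c=1.0))
print("adiabatic Phi(late)/Phi(0):", phi_superhorizon(1e8) / phi_superhorizon(0.0))
```

At small a the two forms disagree because the naive form subtracts nearly equal large numbers, whereas the stable form reduces smoothly to f ≈ 5a/8.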

There are adiabatic and isocurvature initial conditions. Inflation generically produces adiabatic fluctuations,
in which matter and radiation fluctuate together,

\[ \zeta_c = \zeta_r \quad \text{adiabatic} , \tag{30.69} \]

so that

\[ \delta_c(0) = 3 \Theta_0(0) = - \tfrac{3}{2} \Phi(0) = \zeta_r \quad \text{adiabatic} . \tag{30.70} \]

Notice that a positive energy fluctuation corresponds to a negative potential, consistent with Newtonian
intuition. Isocurvature initial conditions are defined by the vanishing of the initial potential, Φ(0) = 0,
requiring

\[ \zeta_r = 0 \quad \text{isocurvature} . \tag{30.71} \]

Figure 30.5 Evolution of the scalar potential Φ at superhorizon scales, equations (30.72), from radiation-dominated
to matter-dominated. The scale for the potential is normalized to its value Φ(late) at late times a ≫ a_eq.


The adiabatic and isocurvature solutions for the superhorizon potential Φ, equation (30.66), are

\[ \Phi_{\rm ad} = - \tfrac{2}{3} \zeta_r \left( 1 - \tfrac{1}{10} f \right) , \tag{30.72a} \]
\[ \Phi_{\rm iso} = - \tfrac{3}{5} \zeta_c f , \tag{30.72b} \]

with f given by equation (30.67). Figure 30.5 shows the evolution of the potential Φ from equations (30.72),
normalized to the value Φ(late) at late times a ≫ a_eq. For adiabatic fluctuations, the potential changes by
a factor of 9/10 from initial to final value.


30.11 Radiation-dominated, adiabatic initial conditions 


For adiabatic initial conditions, fluctuations that enter the horizon before matter-radiation equality, kη_eq ≫ 1,
are dominated by radiation. In the regime where radiation dominates both the unperturbed energy and its
fluctuations, the relevant equations are, from equations (30.54), (30.55), and (30.57),

\[ \dot\Theta_0 - k \Theta_1 = \dot\Phi , \tag{30.73a} \]
\[ - 3 \frac{\dot a}{a} F - k^2 \Phi = 16\pi G a^2 \bar\rho_r \Theta_0 , \tag{30.73b} \]
\[ - k F = 16\pi G a^2 \bar\rho_r \Theta_1 , \tag{30.73c} \]


in which, because it simplifies the mathematics, the Einstein momentum equation is used as a substitute
for the radiation dipole equation. In the radiation-dominated epoch, the horizon distance is proportional
to the cosmic scale factor, η ∝ a, equation (30.39). Inserting Θ_0 and Θ_1 from the Einstein energy and
momentum equations (30.73b) and (30.73c) into the radiation monopole equation (30.73a) gives a second
order differential equation for the potential Φ,

\[ \ddot\Phi + \frac{4}{\eta} \dot\Phi + \frac{k^2}{3} \Phi = 0 . \tag{30.74} \]

Figure 30.6 Radiation-dominated regime.

Equation (30.74) describes damped sound waves moving at sound speed 1/√3 times the speed of light. The
sound horizon, the comoving distance that sound can travel, is η_s = η/√3, the horizon distance η multiplied
by the sound speed. The growing and decaying solutions to equation (30.74) are


\[ \Phi_{\rm grow} = \frac{3 j_1(y)}{y} = \frac{3 (\sin y - y \cos y)}{y^3} , \tag{30.75a} \]
\[ \Phi_{\rm decay} = - \frac{j_{-2}(y)}{y} = \frac{\cos y + y \sin y}{y^3} , \tag{30.75b} \]

where the dimensionless parameter y is the wavenumber k multiplied by the sound horizon distance η_s,

\[ y \equiv k \eta_s = \frac{k \eta}{\sqrt{3}} = \sqrt{\frac{2}{3}} \, \frac{k}{a_{\rm eq} H_{\rm eq}} \, \frac{a}{a_{\rm eq}} , \tag{30.76} \]

and j_ℓ(y) ≡ √(π/(2y)) J_{ℓ+1/2}(y) are spherical Bessel functions. The physically relevant solution that satisfies
adiabatic initial conditions, remaining finite as y → 0, is the growing solution

\[ \Phi = \Phi(0) \, \Phi_{\rm grow} . \tag{30.77} \]

The growing solution (30.75a) shows that, after a mode enters the sound horizon, the scalar potential Φ
oscillates with an envelope that decays as y^{−2}. Physically, relativistically propagating sound waves tend to
suppress the gravitational potential Φ. The suppression of the potential is responsible for the turnover in the
observed power spectrum of matter fluctuations today from large to small scales evident in Figure 30.15.

Figure 30.7 The potential Φ and radiation monopole Θ_0 for modes that enter the sound horizon kη_s = 1 during
the radiation-dominated regime well before matter-radiation equality, for (top) adiabatic initial conditions,
equations (30.77) and (30.78), and (bottom) isocurvature initial conditions, equations (30.84) and (30.85). The quantities
shown are (blue) Θ_0 − Φ and (black) −2Φ, to illustrate that the former oscillates about the latter as expected from
equation (30.48). The difference, (Θ_0 − Φ) − (−2Φ) = Θ_0 + Φ, which is the temperature Θ_0 redshifted by the potential
Φ, is (for Ψ = Φ) the monopole contribution to the temperature fluctuation of the CMB, equation (34.17). The units
of Φ and Θ_0 are such that ζ_r = 1 for adiabatic fluctuations, and ζ_c = 1 for isocurvature fluctuations.

The radiation monopole Θ_0 can be inferred either from the Einstein equation (30.73b) with the solutions (30.75)
for the potential Φ, or from the Green’s function solution (30.52) in the radiation-dominated
regime. Either way, the difference Θ_0 − Φ between the radiation monopole and the potential corresponding
to the growing mode potential (30.77) is

\[ \Theta_0 - \Phi = \zeta_r \left( \frac{2 \sin y}{y} - \cos y \right) . \tag{30.78} \]

Figure 30.8 Evolution of the dark matter overdensity δ_c − 3Φ for a mode that enters the horizon during the radiation-dominated
regime, for adiabatic initial conditions. Like the radiation fluctuation Θ_0 illustrated in the top panel of
Figure 30.7, the matter fluctuation δ_c is constant outside the sound horizon, kη_s < 1, and gets a boost as the
fluctuation enters the sound horizon. But whereas the radiation fluctuation subsequently oscillates, the dark matter
fluctuation grows monotonically, with logarithmic growth well inside the sound horizon, kη_s ≫ 1. The units are such
that ζ_r = 1.

The top panel of Figure 30.7 shows the growing mode potential Φ, equation (30.77), and the radiation
monopole Θ_0, equation (30.78). The Figure plots these two quantities in the form −2Φ and Θ_0 − Φ in order
to bring out the fact that Θ_0 − Φ oscillates about −2Φ, in accordance with equation (30.48). After a mode
is well inside the sound horizon, y ≫ 1, the radiation monopole oscillates with constant amplitude,

\[ \Theta_0 = - \zeta_r \cos y \quad \text{for } y \gg 1 . \tag{30.79} \]
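
The radiation-dominated solutions can be checked numerically with a few lines of code. The sketch below, with hypothetical function names, verifies by finite differences that the growing mode (30.75a) satisfies equation (30.74) rewritten in terms of y = kη/√3, namely Φ″ + (4/y)Φ′ + Φ = 0, and compares the monopole (30.78) with the value implied by the Einstein energy equation (30.73b). The latter comparison assumes the radiation-dominated Friedmann relation 16πGa²ρ̄_r = 6/η², which turns (30.73b) into Θ_0 = −(Φ + y dΦ/dy + y²Φ)/2.

```python
import math

def phi_grow(y):
    """Growing mode (30.75a), with a series fallback near y = 0."""
    if abs(y) < 1e-2:
        return 1.0 - y * y / 10.0 + y**4 / 280.0
    return 3.0 * (math.sin(y) - y * math.cos(y)) / y**3

def residual_30_74(y, h=1e-3):
    """Phi'' + (4/y) Phi' + Phi for the growing mode; should be near zero."""
    d1 = (phi_grow(y + h) - phi_grow(y - h)) / (2.0 * h)
    d2 = (phi_grow(y + h) - 2.0 * phi_grow(y) + phi_grow(y - h)) / h**2
    return d2 + 4.0 * d1 / y + phi_grow(y)

def monopole_from_einstein(y, zeta_r=1.0, h=1e-3):
    """Theta_0 - Phi from (30.73b), assuming 16 pi G a^2 rho_r = 6/eta^2."""
    phi0 = -2.0 / 3.0 * zeta_r                    # adiabatic Phi(0), eq. (30.68)
    phi = phi0 * phi_grow(y)
    dphi = phi0 * (phi_grow(y + h) - phi_grow(y - h)) / (2.0 * h)
    theta0 = -0.5 * (phi + y * dphi + y * y * phi)
    return theta0 - phi

def monopole_closed_form(y, zeta_r=1.0):
    """Equation (30.78)."""
    return zeta_r * (2.0 * math.sin(y) / y - math.cos(y))

for y in (0.5, 3.0, 12.0):
    print(f"y={y:5.1f}  residual={residual_30_74(y):+.2e}  "
          f"Einstein={monopole_from_einstein(y):+.6f}  (30.78)={monopole_closed_form(y):+.6f}")
```

At large y the 2 sin y / y term in (30.78) dies away, leaving Θ_0 − Φ ≈ −ζ_r cos y, and since Φ itself decays as y^{−2}, the constant-amplitude oscillation (30.79).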


Fluctuations in the dark matter are driven by the gravitational potential of the radiation. The radiation-dominated
Green’s function solution (30.47) for the dark matter fluctuation δ_c driven by the growing mode
potential (30.77) and satisfying adiabatic initial conditions (30.70) is

\[ \delta_c - 3 \Phi = 6 \zeta_r \left( \frac{\sin y}{y} - \frac{1}{2} + \operatorname{Cin} y \right) , \tag{30.80} \]

where Cin y ≡ ∫_0^y (1 − cos x) dx/x is the cosine integral. Figure 30.8 shows the density fluctuation (30.80).
Once the mode is well inside the sound horizon, y ≫ 1, the dark matter density δ_c, equation (30.80), evolves
as, from the asymptotic behaviour Cin y ∼ ln y + γ with γ = 0.5772... Euler’s constant,

\[ \delta_c - 3 \Phi = 6 \zeta_r \left( \ln y + \gamma - \frac{1}{2} \right) \quad \text{for } y \gg 1 , \tag{30.81} \]

which grows logarithmically. This logarithmic growth translates into a logarithmic increase in the amplitude
of matter fluctuations at small scales, and is a characteristic signature of non-baryonic cold dark matter.
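
To see how quickly the exact solution (30.80) approaches its logarithmic asymptote (30.81), the following sketch evaluates the cosine integral Cin y by Simpson quadrature. The quadrature is a stand-in chosen for self-containment; a library routine for the cosine integral would serve equally well. Function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def cin(y, n=20000):
    """Cin(y) = int_0^y (1 - cos x)/x dx by composite Simpson (n even)."""
    def integrand(x):
        # Series near x = 0 avoids 0/0; (1 - cos x)/x = x/2 - x^3/24 + ...
        return 0.5 * x - x**3 / 24.0 if x < 1e-4 else (1.0 - math.cos(x)) / x
    h = y / n
    total = integrand(0.0) + integrand(y)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def delta_c_minus_3phi(y, zeta_r=1.0):
    """Equation (30.80)."""
    return 6.0 * zeta_r * (math.sin(y) / y - 0.5 + cin(y))

def log_asymptote(y, zeta_r=1.0):
    """Equation (30.81), valid for y >> 1."""
    return 6.0 * zeta_r * (math.log(y) + EULER_GAMMA - 0.5)

for y in (1e-3, 1.0, 10.0, 50.0):
    print(f"y={y:7.3f}  (30.80)={delta_c_minus_3phi(y):+.5f}  (30.81)={log_asymptote(y):+.5f}")
```

At small y the exact expression tends to 3ζ_r, the superhorizon value of δ_c − 3Φ for adiabatic conditions, while the asymptote is meaningful only once y is well above unity.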


30.12 Radiation-dominated, isocurvature initial conditions 807 


Exercise 30.12. Radiation-dominated fluctuations. 

1. Confirm equation (30.74). You might like to start by seeking a solution using the monopole and dipole
radiation equations (30.54) along with the Einstein energy equation (30.55) including only radiation,
namely equation (30.73b). Then try the solution advocated in the text, namely use the Einstein momentum
equation (30.73c) in place of the radiation dipole equation. This is an example of a situation
where, even though two sets of equations are equivalent, it is easier to find solutions from one set than
the other.

2. Confirm that the homogeneous solutions of equation (30.74) are as given in the text, equations (30.75). 

3. The initial condition for the temperature monopole is determined by equation (30.63b), Θ_0(0) − Φ(0) =
ζ_r, where ζ_r is some constant, the initial radiation entropy fluctuation set up during inflation. Find the
initial conditions for the scalar potentials Ψ and Φ from the Einstein energy and quadrupole pressure
equations at η → 0 (in the present simple model, the Einstein quadrupole pressure equation simply sets
Ψ = Φ).

4. Confirm that the Green’s function solution (30.52) for Θ_0 − Φ satisfying the requisite boundary conditions
is equation (30.78). Plot the solution for Θ_0 − Φ, along with −2Φ. Confirm that Θ_0 − Φ oscillates around
−2Φ.

5. Comment on the behaviour. How do the gravitational potential and temperature monopole evolve once
a mode is inside the horizon? Can you come up with a physical explanation of what is going on?

For isocurvature initial conditions, the matter fluctuation contributes from the outset, |ρ̄_c δ_c| > |4ρ̄_r Θ_0| even
while radiation dominates the background density, ρ̄_c < ρ̄_r.

To develop an approximation adequate for isocurvature fluctuations entering the horizon well before
matter-radiation equality, kη_eq ≫ 1, regard the Einstein energy equation (30.55) as giving the radiation
monopole Θ_0, and the Einstein momentum equation (30.57) as giving the radiation dipole Θ_1. Insert these
into the radiation monopole equation (30.54a), and eliminate the δ̇_c terms using the dark matter density
equation (30.53a). The result is, in units a_eq = 1,

\[ 2 a (1 + a) \Phi'' + (8 + 9a) \Phi' + 2 \left( 1 + \frac{2 k^2 a}{3} \right) \Phi + \delta_c = 0 , \tag{30.82} \]

where prime ′ denotes differentiation with respect to cosmic scale factor a. Equation (30.82) is valid in all
regimes, for any combination of matter and radiation.

For isocurvature initial conditions, the radiation monopole and potential vanish initially, Θ_0(0) = Φ(0) = 0,
whereas the dark matter overdensity is finite, δ_c(0) = 3ζ_c ≠ 0. For small scales that enter the horizon well
before matter-radiation equality, kη_eq ≫ 1, the potential Φ is small compared to δ_c, while δ_c has some
approximately constant non-zero value up to and through the time when the mode enters the sound horizon,
kη_s = √(2/3) k a ≈ 1. In the radiation-dominated epoch, a ≪ 1, but k large and ka ∼ 1 so k²a ≫ 1,
equation (30.82) simplifies to

\[ 2 a \Phi'' + 8 \Phi' + \frac{4 k^2}{3} a \Phi + \delta_c = 0 . \tag{30.83} \]

For constant δ_c = δ_c(0) = 3ζ_c, the solution of equation (30.83) vanishing at a = 0 is, with y given by
equation (30.76),

\[ \Phi = - \frac{3 \sqrt{3} \, \zeta_c}{\sqrt{2} \, k} \, \frac{1 - \cos y - y \sin y + \frac{1}{2} y^2}{y^3} . \tag{30.84} \]

With units restored, k is k/(a_eq H_eq). The Green’s function solution (30.52) for the difference Θ_0 − Φ between
the radiation monopole and potential driven by the potential (30.84) is

\[ \Theta_0 - \Phi = \frac{3 \sqrt{3} \, \zeta_c}{\sqrt{2} \, k} \, \frac{1 - \cos y - \frac{1}{2} y \sin y}{y} . \tag{30.85} \]

Equations (30.84) and (30.85) are the solution for small scale modes with isocurvature initial conditions that
enter the horizon well before matter-radiation equality. After a mode is well inside the sound horizon, y ≫ 1,
the radiation monopole (30.85) oscillates with constant amplitude,

\[ \Theta_0 = - \frac{3 \sqrt{3} \, \zeta_c}{2 \sqrt{2} \, k} \sin y \quad \text{for } y \gg 1 . \tag{30.86} \]

The lower panel of Figure 30.7 shows the potential Φ, equation (30.84), and the radiation monopole Θ_0
from equation (30.85), again plotted as Θ_0 − Φ and −2Φ to bring out the fact that Θ_0 − Φ oscillates about
−2Φ. Whereas for adiabatic initial conditions the radiation monopole oscillated as cos y well inside the sound
horizon, equation (30.79), for isocurvature initial conditions it oscillates as sin y well inside the sound horizon,
equation (30.86).

The solution (30.84) for the potential Φ was derived from equation (30.83) on the assumption of constant
δ_c. The accuracy of the approximation may be checked by calculating the radiation-dominated Green’s
function solution (30.47) for δ_c driven by this potential, which is

\[ \delta_c - 3 \Phi = 3 \zeta_c \left( 1 + a \, \frac{3 - 3 \cos y - 3 y \operatorname{Si} y + \frac{3}{2} y^2}{y^2} \right) , \tag{30.87} \]

where Si y ≡ ∫_0^y sin x dx/x is the sine integral. Equation (30.87) shows that δ_c − 3Φ is approximately equal
to δ_c(0) = 3ζ_c in the radiation-dominated regime a ≪ 1 for all y. The dark matter overdensity δ_c itself is
not constant, because Φ varies. However, Φ from equation (30.84) is of order a δ_c(0) for any y, and the small
order a correction to δ_c leads to corrections of next order a² to Φ, Θ_0 − Φ, and δ_c − 3Φ, and can be neglected.
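
A quick numerical check of the isocurvature solution is sketched below, with hypothetical function names and units a_eq = H_eq = 1. It substitutes the potential (30.84) into the left-hand side of equation (30.83) with constant δ_c = 3ζ_c, using finite differences in a, and confirms that the radiation monopole implied by (30.85) approaches the constant-amplitude oscillation (30.86) at large y. The finite-difference step, chosen proportional to 1/k so that it resolves the oscillation, is an arbitrary choice of the sketch.

```python
import math

def amplitude(k, zeta_c=1.0):
    """Common prefactor 3 sqrt(3) zeta_c / (sqrt(2) k) of (30.84)-(30.86)."""
    return 3.0 * math.sqrt(3.0) * zeta_c / (math.sqrt(2.0) * k)

def phi_iso(a, k, zeta_c=1.0):
    """Potential (30.84), with y = sqrt(2/3) k a."""
    y = math.sqrt(2.0 / 3.0) * k * a
    return -amplitude(k, zeta_c) * (1.0 - math.cos(y) - y * math.sin(y) + 0.5 * y * y) / y**3

def theta0_minus_phi_iso(a, k, zeta_c=1.0):
    """Monopole minus potential, equation (30.85)."""
    y = math.sqrt(2.0 / 3.0) * k * a
    return amplitude(k, zeta_c) * (1.0 - math.cos(y) - 0.5 * y * math.sin(y)) / y

def residual_30_83(a, k, zeta_c=1.0):
    """Left-hand side of (30.83) with delta_c = 3 zeta_c; should be near zero."""
    da = 1e-3 / k
    p0, pp, pm = (phi_iso(x, k, zeta_c) for x in (a, a + da, a - da))
    d1 = (pp - pm) / (2.0 * da)
    d2 = (pp - 2.0 * p0 + pm) / da**2
    return 2.0 * a * d2 + 8.0 * d1 + (4.0 * k * k / 3.0) * a * p0 + 3.0 * zeta_c

k = 100.0
for a in (0.005, 0.02, 0.1, 0.2):
    print(f"a={a:5.3f}  residual of (30.83) = {residual_30_83(a, k):+.2e}")
```

The residual is small compared with the source term 3ζ_c, as expected if (30.84) solves (30.83).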


30.13 Subhorizon scales 


After a mode enters the horizon, the radiation fluctuation Θ_0 oscillates, but the non-baryonic cold dark
matter fluctuation δ_c grows monotonically. In due course, the dark matter density fluctuation ρ̄_c δ_c dominates
the radiation density fluctuation ρ̄_r Θ_0, and this necessarily occurs before matter-radiation equality; that is,
|ρ̄_c δ_c| > |4ρ̄_r Θ_0| even though ρ̄_c < ρ̄_r. This is true for both adiabatic and isocurvature initial conditions; as
noted in §30.12, for isocurvature initial conditions, the dark matter density fluctuation dominates from the
outset. Even before the dark matter density fluctuation dominates, the cumulative contribution of the dark
matter to the potential Φ begins to be more important than that of the radiation, because the potential
sourced by the radiation oscillates, with an effect that tends to cancel when averaged over an oscillation.

Figure 30.9 Subhorizon scales.

Regard the Einstein energy equation (30.55) as giving the dark matter overdensity δ_c, and the Einstein
momentum equation (30.57) as giving the dark matter velocity v_c. Insert these into the dark matter density
equation (30.53a) and eliminate the Θ̇_0 terms using the radiation monopole equation (30.54a). The result is,
in units a_eq = 1,

\[ 2 a^2 (1 + a) \Phi'' + a (6 + 7a) \Phi' - 2 \Phi - 4 \Theta_0 = 0 , \tag{30.88} \]

where prime ′ denotes differentiation with respect to cosmic scale factor a. Equation (30.88) is valid in all
regimes, for any combination of matter and radiation.

Once the mode is well inside the horizon, kη ≫ 1, the radiation monopole Θ_0 oscillates about an average
value of −Φ (since Θ_0 − Φ oscillates about −2Φ, as noted in §30.6):

\[ \langle \Theta_0 \rangle = - \Phi . \tag{30.89} \]

Inserting this cycle-averaged value of Θ_0 into equation (30.88) gives the Meszaros differential equation

\[ 2 (1 + a) a^2 \Phi'' + (6 + 7a) a \Phi' + 2 \Phi = 0 . \tag{30.90} \]


The solutions of Meszaros’ differential equation (30.90) are a linear combination of growing and decaying
solutions

\[ \Phi_{\rm grow} = - \frac{3}{4 k^2 a} \left( 1 + \frac{3a}{2} \right) , \tag{30.91a} \]
\[ \Phi_{\rm decay} = - \frac{3}{4 k^2 a} \left[ \left( 1 + \frac{3a}{2} \right) \ln\!\left( \frac{\sqrt{1+a} + 1}{\sqrt{1+a} - 1} \right) - 3 \sqrt{1+a} \right] . \tag{30.91b} \]

A constant factor of −3/(4k²) has been included in the potential, arbitrarily, to simplify the overall factor
in the resulting solution for the dark matter overdensity δ_c, equations (30.93). The solutions for δ_c driven by
the growing and decaying potentials (30.91) are, from the Green’s function solution (30.45), in units a_eq = 1,

\[ \delta_c - 3 \Phi = - \frac{4 k^2 a}{3} \Phi , \tag{30.92} \]

which holds for both growing and decaying modes. The solutions (30.92) omit possible additional contributions
from the homogeneous solutions in equation (30.45), but the regime of interest is modes well inside the
horizon, ka ≫ 1, and the omitted contributions become dominated by the solutions (30.92) as the cosmic
scale factor a increases. Explicitly, the growing and decaying modes for δ_c − 3Φ are

\[ (\delta_c - 3 \Phi)_{\rm grow} = 1 + \frac{3}{2} a , \tag{30.93a} \]
\[ (\delta_c - 3 \Phi)_{\rm decay} = \left( 1 + \frac{3}{2} a \right) \ln\!\left( \frac{\sqrt{1+a} + 1}{\sqrt{1+a} - 1} \right) - 3 \sqrt{1+a} . \tag{30.93b} \]


The desired solution for the dark matter overdensity δ_c is a linear combination of growing and decaying
modes,

\[ \delta_c - 3 \Phi = C_{\rm grow} (\delta_c - 3 \Phi)_{\rm grow} + C_{\rm decay} (\delta_c - 3 \Phi)_{\rm decay} . \tag{30.94} \]

The coefficients C_grow and C_decay follow from matching to the earlier solutions for δ_c − 3Φ obtained in the
radiation-dominated regime. For modes that enter the horizon well before matter-radiation equality, a ≪ 1,
the growing and decaying modes (30.93) simplify to

\[ (\delta_c - 3 \Phi)_{\rm grow} = 1 , \quad (\delta_c - 3 \Phi)_{\rm decay} = - \ln(a/4) - 3 \quad \text{for } a \ll 1 . \tag{30.95} \]

It was found in §30.11 that the potential Φ in the radiation-dominated regime oscillated with an envelope
that decayed as ∼ a^{−2}, equation (30.75a), driving a dark matter overdensity that grew as a combination of
linear and logarithmic parts, equation (30.81). The result (30.95) demonstrates that a potential that is a
sum of parts proportional to 1/a and ln a/a, albeit reduced in amplitude by a factor of 1/k², leads to the
same behaviour of the dark matter overdensity.

For adiabatic initial conditions, the solution for the dark matter overdensity δ_c is the one that matches
smoothly on to the logarithmically growing solution given by equation (30.81). Matching to the adiabatic
solution (30.81) for δ_c − 3Φ well inside the horizon determines the constants

\[ C_{\rm grow} = 6 \zeta_r \left[ \gamma - \frac{7}{2} + \ln\!\left( \frac{4 \sqrt{2} \, k}{\sqrt{3}} \right) \right] , \quad C_{\rm decay} = - 6 \zeta_r \quad \text{adiabatic} . \tag{30.96} \]

Figure 30.10 After an initial boost on entering the sound horizon, the dark matter overdensity δ_c grows logarithmically
with cosmic scale factor a during the radiation-dominated regime, but then linearly with a after matter-radiation
equality a = a_eq. The curves are labelled with the comoving wavenumber k/(a_eq H_eq) in units of the Hubble distance
at matter-radiation equality. The evolution is approximated by the radiation-dominated solution (30.80) at small a,
and by the Meszaros solution (30.94) at larger a, with crosses marking the transition between the two approximations,
at the geometric mean of the horizon distance at horizon crossing and matter-radiation equality, η ≈ √(η_hor η_eq).

For isocurvature initial conditions, δ_c − 3Φ is sensibly constant in the radiation-dominated regime a ≪ 1,
equation (30.87), and only the growing mode is present,

\[ C_{\rm grow} = 3 \zeta_c , \quad C_{\rm decay} = 0 \quad \text{isocurvature} . \tag{30.97} \]

At late times well into the matter-dominated epoch, a ≫ 1, the growing mode of the Meszaros solution
dominates,

\[ (\delta_c - 3 \Phi)_{\rm grow} = \tfrac{3}{2} a , \quad (\delta_c - 3 \Phi)_{\rm decay} = \tfrac{4}{15} a^{-3/2} \quad \text{for } a \gg 1 , \tag{30.98} \]

so that the dark matter overdensity δ_c at late times is

\[ \delta_c - 3 \Phi = \tfrac{3}{2} C_{\rm grow} \, a \quad \text{for } a \gg 1 . \tag{30.99} \]

The potential Φ, equation (30.91a), at late times is constant,

\[ \Phi = - \frac{9}{8 k^2} C_{\rm grow} \quad \text{for } a \gg 1 . \tag{30.100} \]
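
The matching that fixes the adiabatic coefficients can be made concrete with a small sketch. With hypothetical function names and units a_eq = H_eq = 1, the code evaluates C_grow = 6ζ_r[γ − 7/2 + ln(4√2 k/√3)] and C_decay = −6ζ_r, then compares the Meszaros combination (30.94) with the radiation-era logarithmic solution (30.81) at a point that is simultaneously well inside the sound horizon (y ≫ 1) and well before equality (a ≪ 1). The particular values k = 10⁷ and a = 10⁻⁵ are arbitrary illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def decay(a):
    """Decaying Meszaros mode (30.93b)."""
    s = math.sqrt(1.0 + a)
    return (1.0 + 1.5 * a) * math.log((s + 1.0) / (s - 1.0)) - 3.0 * s

def adiabatic_coefficients(k, zeta_r=1.0):
    """C_grow and C_decay for adiabatic initial conditions, equation (30.96)."""
    c_grow = 6.0 * zeta_r * (EULER_GAMMA - 3.5
                             + math.log(4.0 * math.sqrt(2.0) * k / math.sqrt(3.0)))
    return c_grow, -6.0 * zeta_r

def meszaros(a, k, zeta_r=1.0):
    """delta_c - 3 Phi from the linear combination (30.94)."""
    c_grow, c_decay = adiabatic_coefficients(k, zeta_r)
    return c_grow * (1.0 + 1.5 * a) + c_decay * decay(a)

def radiation_era_log(a, k, zeta_r=1.0):
    """delta_c - 3 Phi from (30.81), with y = sqrt(2/3) k a."""
    y = math.sqrt(2.0 / 3.0) * k * a
    return 6.0 * zeta_r * (math.log(y) + EULER_GAMMA - 0.5)

k, a = 1e7, 1e-5
print("Meszaros (30.94):      ", meszaros(a, k))
print("radiation era (30.81): ", radiation_era_log(a, k))
c_grow, _ = adiabatic_coefficients(k)
print("late-time potential (30.100):", -9.0 * c_grow / (8.0 * k * k))
```

The two expressions agree in the overlap region up to corrections of order a, which is the sense in which the solutions match smoothly.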
The constancy of the potential $\Phi$, and the linear growth of the dark matter overdensity $\delta_c$, is characteristic of
the matter-dominated regime.

Figure 30.10 shows the dark matter overdensity $\delta_c$ calculated for adiabatic conditions from the radiation-dominated
solution (30.80) at small a, and the Meszaros solution (30.94) at larger a. The overdensity $\delta_c$ is
constant before horizon crossing, receives a boost of growth during horizon crossing, grows logarithmically
with cosmic scale factor a before matter-radiation equality, then grows linearly with a after matter-radiation
equality.

For modes that enter the horizon well before matter-radiation equality, the radiation monopole $\Theta_0$ at late
times $a \gg 1$ is, with $y = k\eta/\sqrt{3}$,

O97 = —®—C, cosy adiabatic , (30.101a) 
-aG 
2V2 k 


Oo =- siny isocurvature . (30.101b) 
Figure 30.11 Fluctuations that enter the horizon in the matter-dominated regime.


For fluctuations that enter the horizon well after matter-radiation equality, $k\eta_{\rm eq} \ll 1$, the potential $\Phi$ before
entering the horizon is given by the superhorizon solution (30.66), while after entering the horizon the
evolution of the potential is dominated by the dark matter density fluctuation. A satisfactory solution for
the potential $\Phi$ valid both before and after entering the horizon is obtained by setting the radiation monopole
equal to its superhorizon solution, $\Theta_0 = \Phi + \zeta_r$, equation (30.63b), and inserting this value into the differential
equation (30.88). This solution remains an adequate approximation inside the horizon because after horizon
crossing the radiation fluctuation $\Theta_0$ makes a subdominant contribution to the Einstein energy equation, so
its behaviour ceases to influence the evolution of the potential. Mathematically, once $a \gg a_{\rm eq}$, the derivative
terms $\Phi''$ and $\Phi'$ dominate the $\Phi$ and $\Theta_0$ terms in equation (30.88).

Inserting the superhorizon solution $\Theta_0 = \Phi + \zeta_r$ into the differential equation (30.88) recovers the superhorizon
solution (30.66) for $\Phi$, which therefore remains a satisfactory approximation not only outside but
also inside the horizon. But inside the horizon, the dark matter overdensity $\delta_c$ and radiation monopole $\Theta_0$
driven by this potential are no longer their superhorizon solutions (30.63). Rather, the solution for the dark
matter overdensity $\delta_c$ driven by the superhorizon potential, subject to the initial condition $\tfrac{1}{3}\delta_c - \Phi = \zeta_c$, is,
from the Green's function solution (30.45),


(30.102) 


ôe — 38 = 3C. + k? f-o + 6) + (8¢, — 8¢.) jem (==) — a|} : 


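Well after matter-radiation equality, $a \gg 1$, the dark matter overdensity (30.102) is (note that for large scale
modes $k^2 a$ can be small even when $a \gg 1$)

$$ \delta_c - 3\Phi = 3\zeta_c \left( 1 + \tfrac{4}{15} k^2 a \right) = 3\zeta_c \left( 1 + \tfrac{1}{30} (k\eta)^2 \right) \quad \text{for } a \gg 1 . \tag{30.103} $$


Since $\Phi_{\rm super}({\rm late}) = -\tfrac{3}{5}\zeta_c$ for both adiabatic and isocurvature modes, equation (30.68), the overdensity $\delta_c$
from equation (30.103) is


$$ \delta_c = \tfrac{6}{5}\zeta_c \left( 1 + \tfrac{2}{3} k^2 a \right) = \tfrac{6}{5}\zeta_c \left( 1 + \tfrac{1}{12} (k\eta)^2 \right) \quad \text{for } a \gg 1 . \tag{30.104} $$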
The solution for $\Theta_0 - \Phi$ driven by the superhorizon potential is, from the Green's function solution (30.52),
Oo — & = $0r(4 cosy) 4 (3G: Bo 4+ (4— y2) cosy + 3yk sin y + v|- s 


— ykf(yk+y) + 2g(yet+y) + (yk cosy — 2sin y) f (yk) — (2 cosy + yx sin v)olyx)| } , (30.105) 


where $y = k\eta_s$ is the wavenumber times the sound horizon distance, $y_k = 4\sqrt{2/3}\,k$ is a constant proportional
to the wavenumber k, and f(y) and g(y) are the auxiliary sine and cosine integrals, related to the sine and cosine
integrals ${\rm Si}\,y \equiv \int_0^y \sin x \, dx/x$ and ${\rm Ci}\,y \equiv -\int_y^\infty \cos x \, dx/x$ by

$$ f(y) \equiv (\pi/2 - {\rm Si}\,y) \cos y + {\rm Ci}\,y \sin y , \tag{30.106a} $$
$$ g(y) \equiv (\pi/2 - {\rm Si}\,y) \sin y - {\rm Ci}\,y \cos y . \tag{30.106b} $$
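The auxiliary functions (30.106) are easy to evaluate numerically. The following is a minimal sketch, not the code used for the figures: it uses only the Python standard library, evaluates Si and Ci by Simpson's rule (the helper names `_simpson`, `sine_integral`, `cosine_integral`, `aux_f`, `aux_g` are hypothetical), and relies on the standard identity Ci y = γ + ln y + ∫₀^y (cos x − 1) dx/x. A library routine for Si and Ci would normally be preferable where one is available.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def _simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def sine_integral(y):
    """Si(y) = integral from 0 to y of sin(x)/x dx."""
    return _simpson(lambda x: math.sin(x) / x if x else 1.0, 0.0, y)


def cosine_integral(y):
    """Ci(y) = gamma + ln(y) + integral from 0 to y of (cos(x) - 1)/x dx, for y > 0."""
    tail = _simpson(lambda x: (math.cos(x) - 1.0) / x if x else 0.0, 0.0, y)
    return EULER_GAMMA + math.log(y) + tail


def aux_f(y):
    """Auxiliary function f(y), equation (30.106a)."""
    return (math.pi / 2 - sine_integral(y)) * math.cos(y) + cosine_integral(y) * math.sin(y)


def aux_g(y):
    """Auxiliary function g(y), equation (30.106b)."""
    return (math.pi / 2 - sine_integral(y)) * math.sin(y) - cosine_integral(y) * math.cos(y)
```

At large argument the functions fall off as f(y) ≈ 1/y and g(y) ≈ 1/y², and they satisfy f′(y) = −g(y), which gives two convenient checks on any implementation.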
The mode enters the horizon, y = 1, at a cosmic scale factor of $a/a_{\rm eq} = 4(1 + y_k)/y_k^2$. Figure 30.13 shows
$\Theta_0 - \Phi$ and $-2\Phi$ from equations (30.105) and (30.66), for a mode with $k = \sqrt{3/8} = 0.61$, corresponding to
$y_k = 2$. This mode enters the horizon at $a/a_{\rm eq} = 3$, at approximately the epoch of recombination.
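The numbers quoted for this mode can be reproduced in a few lines. The sketch below works in units $a_{\rm eq} = H_{\rm eq} = 1$ and assumes the conformal time in a Universe of matter plus radiation is $\eta = 2\sqrt{2}\,(\sqrt{1+a} - 1)$, which is the form consistent with the horizon-crossing formula above; the function names are hypothetical.

```python
import math


def y_k(k):
    """Constant y_k = 4 sqrt(2/3) k, with k in units a_eq = H_eq = 1."""
    return 4.0 * math.sqrt(2.0 / 3.0) * k


def sound_horizon_parameter(k, a):
    """y = k eta / sqrt(3), assuming eta = 2 sqrt(2) (sqrt(1 + a) - 1)."""
    return 0.5 * y_k(k) * (math.sqrt(1.0 + a) - 1.0)


def horizon_crossing_scale_factor(k):
    """Scale factor a / a_eq at which the mode crosses the sound horizon, y = 1."""
    yk = y_k(k)
    return 4.0 * (1.0 + yk) / yk**2


if __name__ == "__main__":
    k = math.sqrt(3.0 / 8.0)
    print(y_k(k), horizon_crossing_scale_factor(k))
```

Evaluating `sound_horizon_parameter` at the returned scale factor should give y = 1, which checks that the two expressions are mutually consistent.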


Concept question 30.13. Does the radiation monopole oscillate after recombination? Before 
recombination, photons and baryons are tightly coupled by electron scattering, and behave as a single fluid. 
After recombination, photons stream freely. Does the radiation monopole $\Theta_0 - \Phi$ keep oscillating after
recombination, as in Figure 30.13, or does it stop oscillating, or does it do something else? Answer. The
radiation monopole keeps oscillating, but differently. Two key differences in the free-streaming regime are,
firstly, that the effective sound speed increases to the speed of light, and secondly, that the oscillations
damp adiabatically. See Exercise 32.7 for an approximate treatment of a relativistic fluid (neutrinos) in
the free-streaming regime. A full treatment of radiation in the free-streaming regime requires the radiative
transfer equation, §34.1.


30.15 Matter-dominated regime 
Figure 30.12 Matter-dominated regime.


After matter-radiation equality, but before curvature or dark energy become important, non-relativistic 
matter dominates the mass-energy density of the Universe. 
In the matter-dominated epoch, the relevant equations are, from equations (30.53), (30.55), and (30.57), 


$$ \dot\delta_c - k v_c = 3 \dot\Phi , \tag{30.107a} $$
$$ - 3 \frac{\dot a}{a} F - k^2 \Phi = 4\pi G a^2 \bar\rho_c \delta_c , \tag{30.107b} $$
$$ - k F = 4\pi G a^2 \bar\rho_c v_c , \tag{30.107c} $$


in which, because it simplifies the mathematics, the Einstein momentum equation is used as a substitute
for the matter velocity equation. In the matter-dominated epoch, the horizon is proportional to the square
root of the cosmic scale factor, $\eta \propto a^{1/2}$, equation (30.39). Inserting $\delta_c$ and $v_c$ from the Einstein energy and
momentum equations (30.107b) and (30.107c) into the matter density equation (30.107a) yields a second
order differential equation for the potential $\Phi$


$$ \ddot\Phi + \frac{6}{\eta} \dot\Phi = 0 . \tag{30.108} $$
Figure 30.13 Similar to Figure 30.7, but for a mode that enters the sound horizon $k\eta_s = 1$ during the matter-dominated
regime, for adiabatic initial conditions. The mode shown has $k = \sqrt{3/8} = 0.61$, which enters the horizon at a = 3,
approximately the time of recombination, marked by a star.


The general solution of equation (30.108) is a linear combination

$$ \Phi = C_{\rm grow} \Phi_{\rm grow} + C_{\rm decay} \Phi_{\rm decay} \tag{30.109} $$

of growing and decaying solutions

$$ \Phi_{\rm grow} = 1 , \quad \Phi_{\rm decay} = y^{-5} , \tag{30.110} $$


where the dimensionless parameter y is, as previously, the wavenumber k multiplied by the sound horizon
distance $\eta/\sqrt{3}$. In the matter-dominated regime y is, in units $a_{\rm eq} = H_{\rm eq} = 1$,


$$ y \equiv \frac{k\eta}{\sqrt{3}} = 2 \sqrt{\frac{2}{3}}\, k a^{1/2} . \tag{30.111} $$


The constants $C_{\rm grow}$ and $C_{\rm decay}$ in the solution (30.109) depend on conditions established before the matter-dominated
epoch. The corresponding growing and decaying modes for the dark matter overdensity $\delta_c$ are,
from the Einstein energy equation (30.107b),


$$ (\delta_c - 3\Phi)_{\rm grow} = - \left( 5 + \tfrac{1}{2} y^2 \right) \Phi_{\rm grow} = - \left( 5 + \tfrac{4}{3} k^2 a \right) \Phi_{\rm grow} , \tag{30.112a} $$
$$ (\delta_c - 3\Phi)_{\rm decay} = - \tfrac{1}{2} y^2 \Phi_{\rm decay} = - \tfrac{4}{3} k^2 a \, \Phi_{\rm decay} . \tag{30.112b} $$

The behaviour of the growing and decaying modes (30.112) agrees with both the subhorizon Meszaros
solution (30.98) and the superhorizon solution (30.103) well after matter-radiation equality $a \gg 1$, as they
should. Any admixture of the decaying solution tends quickly to decay away, leaving the growing solution.
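The two modes can be checked with a power-law trial solution. The short working below is a sketch that assumes the potential equation (30.108) has the form $\ddot\Phi + (6/\eta)\dot\Phi = 0$, with overdots denoting derivatives with respect to conformal time $\eta$, which is the form consistent with the solutions (30.110).

```latex
\Phi \propto \eta^n : \quad
n(n-1) + 6n = n(n+5) = 0
\;\Rightarrow\; n = 0 \ \text{or} \ n = -5 ,
\qquad
\Phi_{\rm grow} = 1 , \quad
\Phi_{\rm decay} \propto \eta^{-5} \propto y^{-5} \propto a^{-5/2} .
```

With $y^2 \propto a$, the decaying overdensity in (30.112b) then scales as $y^2 \Phi_{\rm decay} \propto a^{-3/2}$, while the growing overdensity scales as $a$ at late times, the same powers as the Meszaros modes well after matter-radiation equality.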
30.16 Baryons post-recombination


Recombination frees baryons and photons from each other’s grasp. Starting at recombination, the freed 
baryons behaved as pressureless matter, like non-baryonic dark matter. In Exercise 30.14 you will figure out 
the behaviour of baryons and non-baryonic cold matter in the approximation that the Universe is matter- 
dominated. 


Exercise 30.14. Growth of baryon fluctuations after recombination. 

1. Growing and decaying modes. Assume that the Universe was matter-dominated at and after recombination.
What are the growing and decaying solutions for the matter fluctuations $\delta_m$?

2. Green's function for matter fluctuations. Find the Green's function for any matter component
subject to the initial conditions that the overdensity and its derivative with respect to cosmic scale
factor a are $\delta_m({\rm rec})$ and $\delta_m'({\rm rec})$ at recombination.

3. Initial conditions for dark matter and baryon fluctuations at recombination. What are appropriate
initial conditions at recombination for fluctuations in each of the two matter components,
non-baryonic dark matter and baryons? Consider separately small-scale modes that entered the horizon
well before matter-radiation equality, and large-scale modes that entered the horizon well after
matter-radiation equality.

4. Growth of dark matter and baryon fluctuations. The matter density fluctuation is a sum of 
non-baryonic dark matter and baryonic contributions, 


$$ \delta_m = f_c \delta_c + f_b \delta_b , \tag{30.113} $$

where the constants $f_c$ and $f_b$ are the dark matter and baryon fractions

$$ f_c \equiv \frac{\Omega_c}{\Omega_m} , \quad f_b \equiv \frac{\Omega_b}{\Omega_m} . \tag{30.114} $$


Use the Green's function with the chosen initial conditions to derive solutions for the dark matter and
baryon overdensities $\delta_c$ and $\delta_b$ after recombination. Sketch the solutions for the matter, dark matter,
and baryon overdensities through recombination.

5. Comment. A common statement is “Following recombination, baryons fall into the dark matter poten- 
tial wells.” Comment, in the light of your solutions. 


30.17 Matter with dark energy 


Some time after recombination, dark energy becomes important. Observational evidence suggests that the
dominant energy-momentum component of the Universe today is dark energy, with an equation of state
consistent with that of a cosmological constant, $p_\Lambda = -\rho_\Lambda$. In what follows, dark energy is taken to have
constant density, and therefore to be synonymous with a cosmological constant. Since dark energy has a
constant energy density whereas matter density declines as $a^{-3}$, dark energy becomes important only well
after recombination.
Dark energy does not cluster gravitationally, so the Einstein equations for the perturbed energy-momentum
depend only on the matter fluctuation. However, dark energy does affect the evolution of the cosmic scale
factor a. In fact, if matter is taken to be the only source of perturbation, then covariant energy-momentum
conservation, as enforced by the Einstein equations, implies that the only addition that can be made to the
unperturbed background is dark energy, with constant energy density. To see this, consider the equations (30.53)
governing the matter overdensity $\delta_m$ and scalar velocity $v_m$ (now subscripted m, since post-recombination
matter includes baryons as well as non-baryonic cold dark matter), together with the Einstein energy and
momentum equations (30.55) and (30.57) sourced only by matter,


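$$ \dot\delta_m - k v_m = 3 \dot\Phi , \tag{30.115a} $$
$$ \dot v_m + \frac{\dot a}{a} v_m = - k \Phi , \tag{30.115b} $$
$$ - 3 \frac{\dot a}{a} F - k^2 \Phi = 4\pi G a^2 \bar\rho_m \delta_m , \tag{30.115c} $$
$$ - k F = 4\pi G a^2 \bar\rho_m v_m . \tag{30.115d} $$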


The factor $4\pi G a^2 \bar\rho_m$ on the right hand side of the two Einstein equations can be written


$$ 4\pi G a^2 \bar\rho_m = \frac{3 a_0^3 H_0^2 \Omega_m}{2 a} , \tag{30.116} $$

where $a_0$ and $H_0$ are the present-day cosmic scale factor and Hubble parameter, and $\Omega_m$ is the present-day
matter density (a constant). Allow the Hubble parameter $H(a) \equiv \dot a/a^2$ to be an arbitrary function
of cosmic scale factor a. Inserting $\delta_m$ and velocity $v_m$ from the Einstein energy and momentum equations
(30.115c) and (30.115d) into the matter equations (30.115a) and (30.115b), and taking the overdensity
equation (30.115a) minus $3\dot a/a$ times the velocity equation (30.115b), yields the condition


$$ a^4 \frac{dH^2}{da} + 3 a_0^3 H_0^2 \Omega_m = 0 , \tag{30.117} $$

whose solution is

$$ \frac{H^2}{H_0^2} = \frac{\Omega_m a_0^3}{a^3} + \Omega_\Lambda \tag{30.118} $$


for some constant $\Omega_\Lambda$. This shows that, as claimed, if only matter perturbations are present, then the unperturbed
background can contain, besides matter, only dark energy with constant density $\rho_\Lambda = H_0^2 \Omega_\Lambda / (\tfrac{8}{3}\pi G)$.
The result is a consequence of the fact that the Einstein equations enforce covariant conservation of
energy-momentum.

With the Hubble parameter given by equation (30.118), the matter and Einstein equations (30.115) yield
a second order differential equation for the potential $\Phi$, in units $a_0 = 1$:


$$ 2a (\Omega_m + a^3 \Omega_\Lambda) \Phi'' + (7 \Omega_m + 10 a^3 \Omega_\Lambda) \Phi' + 6 a^2 \Omega_\Lambda \Phi = 0 . \tag{30.119} $$
The growing and decaying solutions to equation (30.119) are, in units $a_0 = 1$,


$$ \Phi_{\rm grow} = \frac{5 \Omega_m H_0^2}{2} \frac{H(a)}{a} \int_0^a \frac{da'}{a'^3 H(a')^3} , \tag{30.120a} $$
$$ \Phi_{\rm decay} = \frac{H}{a} . \tag{30.120b} $$


The factor $\tfrac{5}{2} \Omega_m H_0^2$ in the growing solution is chosen so that $\Phi_{\rm grow} \to 1$ as $a \to 0$. The growing solution $\Phi_{\rm grow}$
can be expressed as an elliptic integral. The corresponding growing and decaying solutions for the matter
overdensity $\delta_m$ are, again in units $a_0 = 1$,

$$ (\delta_m - 3\Phi)_{\rm grow} = - \frac{2 k^2 a}{3 \Omega_m H_0^2} \Phi_{\rm grow} - 5 , \quad (\delta_m - 3\Phi)_{\rm decay} = - \frac{2 k^2 a}{3 \Omega_m H_0^2} \Phi_{\rm decay} . \tag{30.121} $$


For modes well inside the horizon, $k\eta \sim k a^{1/2}/H_0 \gg 1$, the relation (30.121) agrees with that (30.127) below.


30.18 Matter with dark energy and curvature 


Curvature may also play a role after recombination. Since 2000, when the angular scale of the first peak in 
the CMB was resolved by the Boomerang balloon-based experiment (Bernardis et al., 2000), observational 
evidence has been stubbornly consistent with the Universe having zero curvature. But it is possible that 
there may be some small curvature. If the curvature is significantly non-zero today (larger than treatable in 
perturbation theory), then by definition the curvature scale is less than the horizon size today. Scales larger 
than the curvature scale should strictly be treated using an unperturbed FLRW metric with curvature. 
However, a flat background FLRW metric remains a good approximation for modes whose scales are small 
compared to the curvature. 


Concept question 30.15. Curvature scale. What is meant by the curvature scale? Is the curvature scale 
constant in comoving coordinates? 


For modes much smaller than the horizon distance today, the time derivative of the potential can be
neglected compared to its spatial gradient, $|\dot\Phi| \ll |k\Phi|$. If only matter, curvature, and dark energy are
present, then only matter fluctuations contribute to the energy-momentum. At scales much less than the
curvature scale, equations (30.115) then go over to the Newtonian limit,


$$ \dot\delta_m - k v_m = 0 , \tag{30.122a} $$
$$ \dot v_m + \frac{\dot a}{a} v_m = - k \Phi , \tag{30.122b} $$
$$ - k^2 \Phi = 4\pi G a^2 \bar\rho_m \delta_m . \tag{30.122c} $$


Figure 30.14 Contour plot of the growth factor g(a) in a universe containing matter, curvature, and a cosmological
constant. If the Universe is flat, $\Omega_k = 0$, then the Universe evolves from matter-dominated ($\Omega_m = 1$, $\Omega_\Lambda = 0$) to
$\Lambda$-dominated ($\Omega_m = 0$, $\Omega_\Lambda = 1$) along the (blue) dashed line.


The factor $4\pi G a^2 \bar\rho_m$ in the Einstein equation can be written as equation (30.116). The matter and Einstein
equations (30.122) yield a second order equation for the matter overdensity $\delta_m$, in units $a_0 = 1$:


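$$ \ddot\delta_m + \frac{\dot a}{a} \dot\delta_m - \frac{3 \Omega_m H_0^2}{2} \frac{\delta_m}{a} = 0 . \tag{30.123} $$


Equation (30.123) can be recast as a differential equation with respect to cosmic scale factor a:


$$ \delta_m'' + \left( \frac{H'}{H} + \frac{3}{a} \right) \delta_m' - \frac{3 \Omega_m H_0^2}{2} \frac{\delta_m}{a^5 H^2} = 0 , \tag{30.124} $$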
where $H \equiv \dot a/a^2$ is the Hubble parameter, and prime $'$ denotes differentiation with respect to a. In the case
of matter plus curvature plus dark energy, the Hubble parameter H satisfies, again in units $a_0 = 1$,

$$ \frac{H^2}{H_0^2} = \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda , \tag{30.125} $$

where $\Omega_m$, $\Omega_k$, and $\Omega_\Lambda$ are the (constant) present-day values of the matter, curvature, and dark energy
densities. The growing and decaying solutions to equation (30.124) are


$$ \delta_{m,{\rm grow}} \equiv a\, g(a) \equiv \frac{5 \Omega_m H_0^2}{2} H(a) \int_0^a \frac{da'}{a'^3 H(a')^3} , \quad \delta_{m,{\rm decay}} = \frac{H}{H_0} . \tag{30.126} $$
The potential $\Phi$ is related to the matter overdensity $\delta_m$ by, again in units $a_0 = 1$, equation (30.122c),

$$ \Phi = - \frac{3 \Omega_m H_0^2}{2 k^2} \frac{\delta_m}{a} . \tag{30.127} $$


The observationally relevant solution is the growing mode. The growing mode is conventionally given a
special notation, the growth factor g(a), because of its importance to relating the amplitude of clustering at
various times, from recombination up to the present. For the growing mode,


$$ \delta_m \propto a\, g(a) , \quad \Phi \propto g(a) . \tag{30.128} $$


The normalization factor $\tfrac{5}{2} \Omega_m H_0^2$ in equation (30.126) is chosen so that in the matter-dominated phase after
recombination but before dark energy or curvature become important, the growth factor g(a) is unity,


$$ g(a) = 1 \quad (a_{\rm rec} \ll a \ll 1) . \tag{30.129} $$


Thus as long as the Universe remains matter-dominated, the potential $\Phi$ remains constant. Curvature or
dark energy causes the potential $\Phi$ to decrease. Figure 30.14 illustrates the growth factor g(a) as a function
of $\Omega_m$ and $\Omega_\Lambda$.

It should be emphasized that the growing and decaying solutions (30.126) are valid only for the case of 
matter plus curvature plus constant density dark energy, where the Hubble parameter takes the form (30.125). 
If another kind of mass-energy is considered, such as dark energy with non-constant density, then equations 
governing perturbations of the other kind must be adjoined, and the Einstein equations modified accordingly. 

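The growth factor g(a) may be expressed analytically as an elliptic function. A good analytic approximation
is (Carroll, Press, and Turner, 1992)


$$ g \approx \frac{5 \Omega_m}{2 \left[ \Omega_m^{4/7} - \Omega_\Lambda + \left( 1 + \tfrac{1}{2} \Omega_m \right) \left( 1 + \tfrac{1}{70} \Omega_\Lambda \right) \right]} , \tag{30.130} $$


where $\Omega_m$ and $\Omega_\Lambda$ are densities at the epoch being considered (such as the present, $a = a_0$).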
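The growth factor is straightforward to compute by direct quadrature of equation (30.126). The following is a minimal sketch using only the Python standard library, in units $a_0 = H_0 = 1$; the function names are hypothetical, and the curvature density is assumed to be fixed by closure, $\Omega_k = 1 - \Omega_m - \Omega_\Lambda$. It also evaluates the approximation (30.130) at the present epoch for comparison.

```python
import math


def hubble_ratio(a, omega_m, omega_lambda):
    """H(a)/H_0 from equation (30.125), with a_0 = 1 and Omega_k fixed by closure."""
    omega_k = 1.0 - omega_m - omega_lambda
    return math.sqrt(omega_m / a**3 + omega_k / a**2 + omega_lambda)


def growth_factor(a, omega_m, omega_lambda, n=2000):
    """g(a) = delta_grow / a from equation (30.126), by Simpson integration.

    The substitution a' = x**2 turns da' / (a'^3 H^3) into
    2 x^4 dx / (Omega_m + Omega_k x^2 + Omega_Lambda x^6)^(3/2),
    which is smooth at x = 0.
    """
    omega_k = 1.0 - omega_m - omega_lambda

    def integrand(x):
        return 2.0 * x**4 / (omega_m + omega_k * x**2 + omega_lambda * x**6) ** 1.5

    upper = math.sqrt(a)
    h = upper / n  # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0
    return 2.5 * omega_m * hubble_ratio(a, omega_m, omega_lambda) / a * integral


def growth_factor_cpt(omega_m, omega_lambda):
    """Carroll, Press & Turner (1992) approximation, equation (30.130)."""
    return 2.5 * omega_m / (
        omega_m ** (4.0 / 7.0)
        - omega_lambda
        + (1.0 + 0.5 * omega_m) * (1.0 + omega_lambda / 70.0)
    )


if __name__ == "__main__":
    print(growth_factor(1.0, 0.3, 0.7), growth_factor_cpt(0.3, 0.7))
```

For a flat matter-only Universe the quadrature should return g = 1 at every a, which is a useful sanity check before trusting results with dark energy or curvature.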


30.19 Primordial power spectrum 


Initial conditions from inflation are conveniently characterized in terms of the gauge-invariant fluctuation
$\zeta$ defined by equation (30.26), which has the property that it remains constant during evolution at superhorizon
scales. The fluctuation $\zeta$ is commonly called the primordial curvature fluctuation. According to the
inflationary paradigm, fluctuations in $\zeta$ are generated by quantum fluctuations in the inflaton field that drives
inflation. The amplitude $\zeta$ of a mode freezes as the mode exits the horizon during inflation, and remains
constant until the mode subsequently re-enters the horizon after inflation has ended.

Generically, inflation predicts that primordial curvature fluctuations $\zeta$ generated by vacuum fluctuations
during inflation have a spectrum that is (1) Gaussian, and (2) scale-free. Inflation also predicts generically
that the fluctuations are adiabatic, meaning that the curvature fluctuation is the same for all species, $\zeta_x = \zeta$
for all species x.
Gaussian distributions, §30.22.6, are ubiquitous in statistics as a consequence of the Central Limit Theorem
(CLT), §30.22.5. The CLT states that the distribution of a random variable that is a sum of independent
random increments is asymptotically Gaussian in the limit of a large number of increments. A Gaussian
distribution is characterized entirely by its mean and variance, all higher irreducible moments vanishing.

A scale-free spectrum of fluctuations is one in which the spatial variance $\xi_\zeta$ of the dimensionless fluctuation
$\zeta$ is the same on all scales,


$$ \langle \zeta(\boldsymbol{x}') \zeta(\boldsymbol{x}) \rangle \equiv \xi_\zeta(|\boldsymbol{x}' - \boldsymbol{x}|) = \text{constant} , \tag{30.131} $$


independent of spatial separation $|\boldsymbol{x}' - \boldsymbol{x}|$. A scale-free primordial spectrum of fluctuations was originally
proposed as a natural initial condition by Harrison (1970) and Zeldovich (1972) before the idea of inflation
was conceived. Inflation predicts a scale-free spectrum because the vacuum energy that drives inflation is
constant in time, and quantum fluctuations in the vacuum remain statistically the same as time goes by.
Thus the characteristic amplitude of fluctuations $\zeta$ flying over the horizon remains the same as time goes by.
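The power spectrum $P_\zeta(k)$ of fluctuations in $\zeta$ is defined by


$$ \langle \zeta(\boldsymbol{k}') \zeta(\boldsymbol{k}) \rangle \equiv (2\pi)^3 \delta_D(\boldsymbol{k}' + \boldsymbol{k}) P_\zeta(k) . \tag{30.132} $$


The "momentum-conserving" Dirac delta-function $(2\pi)^3 \delta_D(\boldsymbol{k}' + \boldsymbol{k})$ in equation (30.132) is a consequence of
the assumed statistical spatial translation symmetry of fluctuations in the spatially homogeneous FLRW
background. The power spectrum $P_\zeta(k)$ is related to the correlation function $\xi_\zeta(x)$ by (with the standard
convention in cosmology for the choice of signs and factors of $2\pi$)


$$ P_\zeta(k) = \int e^{i \boldsymbol{k} \cdot \boldsymbol{x}} \xi_\zeta(x) \, d^3x , \quad \xi_\zeta(x) = \int e^{-i \boldsymbol{k} \cdot \boldsymbol{x}} P_\zeta(k) \frac{d^3k}{(2\pi)^3} . \tag{30.133} $$


Whereas the correlation function $\xi_\zeta(x)$ is dimensionless, the power spectrum $P_\zeta(k)$ has units of comoving
length cubed. The scale-free character means that the dimensionless power spectrum $\Delta_\zeta^2(k)$ defined by


$$ \Delta_\zeta^2(k) \equiv \frac{4\pi k^3}{(2\pi)^3} P_\zeta(k) \tag{30.134} $$


is constant.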

Actually, the power spectrum generated by inflation is not precisely scale-free, because inflation comes to 
an end, which breaks scale-invariance. The departure from scale-invariance is conventionally characterized 
by a scalar spectral index, the tilt n, such that 


$$ \Delta_\zeta^2(k) \propto k^{n-1} . \tag{30.135} $$

Thus a scale-invariant power spectrum has

$$ n = 1 \quad \text{(scale-invariant)} . \tag{30.136} $$


Different inflationary models predict different tilts, mostly close to but slightly less than 1. 
A common practice is to report the value of the dimensionless primordial power spectrum $\Delta_\zeta^2(k)$ at some
pivot scale $k_p$,


$$ \Delta_\zeta^2(k) = \Delta_\zeta^2(k_p) \left( \frac{k}{k_p} \right)^{n-1} . \tag{30.137} $$


The Planck collaboration (Aghanim et al., 2018) report


$$ \Delta_\zeta^2(k_p = 0.05\ {\rm Mpc}^{-1}) = (2.14 \pm 0.05) \times 10^{-9} , \quad n = 0.965 \pm 0.004 . \tag{30.138} $$


The pivot scale $k_p$ was chosen in this case so that the error in the amplitude $\Delta_\zeta^2(k_p)$ was uncorrelated with
the error in the tilt n.
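The primordial spectrum defined by equations (30.134) and (30.137) translates directly into code. The sketch below is illustrative only; it takes the central values quoted in equation (30.138) as defaults, measures k in Mpc⁻¹, and uses hypothetical function names.

```python
import math

# Central values quoted in equation (30.138).
AMPLITUDE_AT_PIVOT = 2.14e-9  # Delta_zeta^2(k_p)
PIVOT_K = 0.05                # k_p in Mpc^-1
TILT = 0.965                  # scalar spectral index n


def dimensionless_power(k, amplitude=AMPLITUDE_AT_PIVOT, k_pivot=PIVOT_K, tilt=TILT):
    """Delta_zeta^2(k), equation (30.137), with k in Mpc^-1."""
    return amplitude * (k / k_pivot) ** (tilt - 1.0)


def primordial_power(k, **kwargs):
    """P_zeta(k) in Mpc^3, obtained by inverting equation (30.134)."""
    return (2.0 * math.pi) ** 3 / (4.0 * math.pi * k**3) * dimensionless_power(k, **kwargs)


if __name__ == "__main__":
    for k in (0.001, 0.05, 1.0):
        print(k, dimensionless_power(k), primordial_power(k))
```

Setting `tilt=1.0` gives a constant dimensionless power, the scale-invariant case (30.136).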


30.20 Matter power spectrum 


The matter power spectrum $P_m(\eta, k)$ at time $\eta$ is defined by

$$ \langle \delta_m(\eta, \boldsymbol{k}') \delta_m(\eta, \boldsymbol{k}) \rangle \equiv (2\pi)^3 \delta_D(\boldsymbol{k}' + \boldsymbol{k}) P_m(\eta, k) , \tag{30.139} $$


the Dirac delta-function being as before a consequence of the assumption of statistical spatial homogeneity.
The assumption of statistical isotropy implies that the power spectrum $P_m(\eta, k)$ is a function only of the
magnitude k of the wavevector $\boldsymbol{k}$. The matter power spectrum $P_m(\eta, k)$ is related to the primordial power
spectrum $P_\zeta(k)$ by


$$ P_m(\eta, k) = T_m(\eta, k)^2 P_\zeta(k) = T_m(\eta, k)^2 \frac{(2\pi)^3}{4\pi k^3} \Delta_\zeta^2(k) , \tag{30.140} $$

where $T_m(\eta, k)$ is the matter transfer function defined by

$$ T_m(\eta, k) \equiv \frac{\delta_m(\eta, k)}{\zeta(k)} . \tag{30.141} $$


The transfer function $T_m(\eta, k)$ for any given cosmological model may be calculated by the methods expounded
in the bulk of this Chapter, Exercise 30.16.

The predictions of cosmological models of the matter power spectrum may be compared to measurements of 
the power spectrum of objects, such as galaxies, that may trace the matter distribution. Galaxy surveys probe 
the matter distribution well after recombination, and at scales much less than the horizon distance today. 
Under those circumstances, the matter transfer function $T_m(\eta, k)$ factors into a product of three factors: (1)
a factor relating the matter overdensity $\delta_m$ to the potential $\Phi$, which in the Newtonian regime at subhorizon
scales well after recombination is given in units $a_0 = 1$ by equation (30.127); (2) a growth factor g(a),
equation (30.126), relating the potential $\Phi(\eta)$ at recent times $\eta$ to the post-recombination matter-dominated
potential $\Phi({\rm late})$; (3) a transfer function $T_{\Phi({\rm late})}(k)$ relating the matter-dominated potential $\Phi({\rm late})$ to the
primordial fluctuation $\zeta$:


$$ T_m(\eta, k) \equiv \frac{\delta_m(\eta, k)}{\zeta(k)} = \frac{\delta_m(\eta, k)}{\Phi(\eta, k)} \, \frac{\Phi(\eta, k)}{\Phi({\rm late}, k)} \, \frac{\Phi({\rm late}, k)}{\zeta(k)} = \left( - \frac{2 a k^2}{3 \Omega_m H_0^2} \right) \times g(a) \times T_{\Phi({\rm late})}(k) , \tag{30.142} $$

where

$$ T_{\Phi({\rm late})}(k) \equiv \frac{\Phi({\rm late}, k)}{\zeta(k)} . \tag{30.143} $$


The potential transfer function $T_{\Phi({\rm late})}(k)$ is independent of time $\eta$ because the potential $\Phi({\rm late})$ is constant
in the matter-dominated regime before dark energy (or curvature) becomes important. The factorization
(30.142) of the matter transfer function $T_m(\eta, k)$ separates the dependence on time $\eta$ (or equivalently
cosmic scale factor a) and wavenumber k. The first factor $\delta_m/\Phi$ is proportional to $a k^2$; the second is a
function g(a) only of cosmic scale factor a; and the third is a function $T_{\Phi({\rm late})}(k)$ only of wavenumber k.

The factorization (30.142) of the matter transfer function $T_m(\eta, k)$ implies that the matter power spectrum
$P_m(\eta, k)$, equation (30.140), is related to the primordial power spectrum $P_\zeta(k)$ by


$$ P_m(\eta, k) = \left( \frac{2 a k^2 g(a)}{3 \Omega_m H_0^2} \right)^2 T_{\Phi({\rm late})}(k)^2 P_\zeta(k) = \left( \frac{2 a k^2 g(a)}{3 \Omega_m H_0^2} \right)^2 T_{\Phi({\rm late})}(k)^2 \frac{(2\pi)^3}{4\pi k^3} \Delta_\zeta^2(k) . \tag{30.144} $$


For a power-law primordial spectrum (30.135), the matter power spectrum at the largest scales, where the
potential transfer function $T_{\Phi({\rm late})}(k)$ is a constant independent of k, goes as


$$ P_m(\eta, k) \propto k^n . \tag{30.145} $$


The proportionality (30.145) explains the origin of the scalar index n.
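The assembly of the matter power spectrum from its three factors, and the large-scale slope (30.145), can be illustrated numerically. The sketch below assumes units $a_0 = 1$ with k and $H_0$ in the same (inverse length) units; the function names are hypothetical, and the potential transfer function and primordial spectrum are passed in as callables so that any model can be substituted.

```python
import math


def matter_power(k, a, growth, omega_m, hubble0, transfer_phi, delta2_zeta):
    """P_m(eta, k) from equation (30.144), in units a_0 = 1.

    transfer_phi and delta2_zeta are callables of k returning
    T_Phi(late)(k) and Delta_zeta^2(k) respectively.
    """
    newtonian = 2.0 * a * k**2 * growth / (3.0 * omega_m * hubble0**2)
    p_zeta = (2.0 * math.pi) ** 3 / (4.0 * math.pi * k**3) * delta2_zeta(k)
    return newtonian**2 * transfer_phi(k) ** 2 * p_zeta


def log_slope(func, k, step=1e-4):
    """Numerical logarithmic derivative d ln(func) / d ln(k)."""
    upper = math.log(func(k * (1.0 + step)))
    lower = math.log(func(k * (1.0 - step)))
    return (upper - lower) / (math.log(1.0 + step) - math.log(1.0 - step))


if __name__ == "__main__":
    tilt = 0.965

    def power(k):
        # Constant large-scale transfer function, here taken to be -3/5.
        return matter_power(
            k, a=1.0, growth=0.78, omega_m=0.3, hubble0=1.0,
            transfer_phi=lambda _k: -0.6,
            delta2_zeta=lambda kk: 2.14e-9 * (kk / 0.05) ** (tilt - 1.0),
        )

    print(log_slope(power, 0.01))  # close to the tilt n
```

With a constant transfer function the factors $k^4$, $k^{-3}$, and $k^{n-1}$ combine to $k^n$, so the printed logarithmic slope should match the tilt.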


Exercise 30.16. Power spectrum of matter fluctuations: simple approximation. Use the code you
wrote in Exercise 30.11 to compute the matter transfer function $T_m(\eta, k)$, equation (30.141). Deduce the
matter power spectrum $P_m(\eta_0, k)$, equation (30.140), at the present time, $\eta = \eta_0$. Use the normalization
and tilt of primordial power measured from Planck, equation (30.138). Compute power spectra for a concordance
$\Lambda$CDM model, $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, and a flat matter-dominated Universe, $\Omega_m = 1$. Compare
your matter power spectrum to data from Gil-Marin et al. (2020), downloadable from
https://svn.sdss.org/public/data/eboss/DR16cosmo/tags/v1_0_1/dataveccov/lrg_elg_qso/LRG_Pk/. The best data sets are
the "post-reconstruction" sets. The "reconstruction" involves undoing at least some of the effects of nonlinear
evolution by moving galaxies around. Note the units of the data: wavenumber k in $h\,{\rm Mpc}^{-1}$ and power P(k)
in $(h^{-1}\,{\rm Mpc})^3$, with $h \equiv H_0/(100\ {\rm km\,s^{-1}\,Mpc^{-1}})$.

As in Exercise 30.11, you may find that your integration routine gets stuck trying to integrate the oscillating
radiation monopole and dipole once the mode is well inside the horizon, $k\eta \gg 1$. The strategy suggested
in Exercise 30.11 was to modify the radiation dipole equation (30.54b) by introducing an artificial damping
term, equation (30.59), that damps radiation once it is well inside the horizon. Since the radiation fluctuation
ceases to influence the gravitational potential or the matter fluctuation once the radiation has oscillated many
times, the artificial damping has little effect on the model power spectrum.

A second problem you will encounter is that of power at superhorizon scales. Astronomers on Earth cannot
measure power at scales larger than our horizon because they cannot distinguish a superhorizon fluctuation
from a change in the mean density of the background FLRW geometry. To eliminate the unmeasurable
superhorizon power, calculate power from the overdensity $\delta_m - \delta_0$ with a large-scale constant $\delta_0$ subtracted.

A third problem is that galaxies do not necessarily trace the distribution of matter. A simple model is to
suppose a linear relation between galaxy overdensity $\delta_g$ and matter overdensity $\delta_m$ (in Fourier space),


$$ \delta_g = b\, \delta_m , \tag{30.146} $$


where b is the bias parameter. Linear bias was introduced by Kaiser (1984), who showed that regions of a 
Gaussian field (§30.22.3) above a high threshold density are linearly biassed. 

Solution. See Figure 30.15. One of the trickier issues is getting the units right. The SDSS-IV data are given
in units where the length scale is such that the comoving Hubble distance at the present time is


$$ \frac{c}{a_0 H_0} = \frac{299{,}792.458\ {\rm km\,s^{-1}}}{100\,h\ {\rm km\,s^{-1}\,Mpc^{-1}}} = 2{,}997.92458\ h^{-1}\,{\rm Mpc} . \tag{30.147} $$


My code worked in units where $c = a_{\rm eq} = H_{\rm eq} = 1$. With $\Omega_x$ representing values at the present time, the
Hubble parameter now $H_0$ and at matter-radiation equality $H_{\rm eq}$ are related by


$$ \frac{H_{\rm eq}}{H_0} = \sqrt{ \Omega_r (a_{\rm eq}/a_0)^{-4} + \Omega_m (a_{\rm eq}/a_0)^{-3} + \Omega_k (a_{\rm eq}/a_0)^{-2} + \Omega_\Lambda } . \tag{30.148} $$

I chose $a_0/a_{\rm eq} = 3400$, and present-day densities of $\Omega_m = 0.29$, $\Omega_r = \Omega_m/3400$, $\Omega_k = 0$, $\Omega_\Lambda = 1 - \Omega_r - \Omega_m - \Omega_k$.
The code gave $H_0/H_{\rm eq} = 6.4 \times 10^{-6}$, and so


$$ \frac{c}{a_0 H_0} = \frac{1}{3400 \times (6.4 \times 10^{-6})} = 46.0 \text{ program units} . \tag{30.149} $$


The conversion factor between $h^{-1}\,{\rm Mpc}$ and program length units was therefore


$$ 1 \text{ program length unit} = \frac{2{,}997.92458\ h^{-1}\,{\rm Mpc}}{46.0} = 65.1\ h^{-1}\,{\rm Mpc} . \tag{30.150} $$


For the wavenumber, this meant that the conversion between $h\,{\rm Mpc}^{-1}$ and program units was


$$ k_{h\,{\rm Mpc}^{-1}} = \frac{k}{65.1\ h^{-1}\,{\rm Mpc}} . \tag{30.151} $$

The model power spectrum in Figure 30.15 has been multiplied, arbitrarily, by a squared bias factor of
$b^2 = 1.05^2$, to give a better fit to the observed power spectrum. The residual difference between observed and
model power shows wiggles. These are baryon acoustic oscillations (BAO), the presence of which is predicted
when baryons are included, Figure 32.4.
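The unit bookkeeping of equations (30.147) to (30.151) is easy to get wrong, so it can help to isolate it in a few small functions. The following is a sketch with hypothetical function names, not the code used for the figure; the Hubble distance in program units (46.0 in the solution above) is passed in as a parameter rather than recomputed.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def hubble_distance_h_mpc():
    """Comoving Hubble distance c / (a_0 H_0) in h^-1 Mpc, equation (30.147)."""
    return C_KM_S / 100.0


def program_unit_in_h_mpc(hubble_distance_program_units):
    """Length of one program unit in h^-1 Mpc, equation (30.150)."""
    return hubble_distance_h_mpc() / hubble_distance_program_units


def k_program_to_h_per_mpc(k_program, hubble_distance_program_units):
    """Convert a wavenumber from program units to h Mpc^-1, equation (30.151)."""
    return k_program / program_unit_in_h_mpc(hubble_distance_program_units)


if __name__ == "__main__":
    # Hubble distance in program units, equation (30.149).
    d_hubble = 1.0 / (3400 * 6.4e-6)
    print(d_hubble, program_unit_in_h_mpc(46.0), k_program_to_h_per_mpc(1.0, 46.0))
```

Power spectra carry units of length cubed, so a model spectrum in program units would be multiplied by the cube of `program_unit_in_h_mpc(...)` before comparison with data in $(h^{-1}\,{\rm Mpc})^3$.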
Figure 30.15 Model matter power spectra computed in the simple approximation, compared to observations from the
North (N) and South (S) Galactic Caps of the Sloan Digital Sky Survey IV (Gil-Marin et al., 2020). The data
comprise 377,458 luminous red galaxies covering approximately 18% of the sky over redshifts z = 0.6-1. Two models
are shown, a flat $\Lambda$CDM model with concordance parameters $\Omega_\Lambda = 0.69$ and $\Omega_m = 0.31$, and a flat matter-only
CDM model, $\Omega_m = 1$. The dashed lines on the models show power calculated from $|\delta_m^2|$, which includes unmeasurable
superhorizon power; the solid lines are calculated from $|(\delta_m - \delta_0)^2|$, which excludes the unmeasurable superhorizon
power by subtracting a constant $\delta_0$ from the overdensity. The $\Lambda$CDM model is normalized to the amplitude (30.138)
measured by Planck (Aghanim et al., 2018), multiplied by a bias squared factor of $b^2 = 1.05^2$. The $\Lambda$CDM power
spectrum calculated here in the simple approximation may be compared to the corresponding power spectra in the
hydrodynamic approximation, Figure 32.4, and from a Boltzmann computation, Figure 33.5.


30.21 Nonlinear evolution of the matter power spectrum 


This Chapter has assumed throughout that linear perturbation theory holds, which requires that fluctuations
be small, $\delta \ll 1$. This assumption fails for matter fluctuations at small scales, which in due course collapse into
galaxies, with matter densities much greater than the mean, $\delta_m \gg 1$. Evolution in this regime is nonlinear,
and must usually be followed with large computer simulations.

Since gravity remains weak, $\Phi \ll 1$, and matter moves non-relativistically even in the nonlinear regime,
gravity remains well described by the Newtonian limit, equation (30.122c). The equations of conservation
of mass and momentum still hold for the matter, but these equations are no longer linear. To the extent
that the matter streams collisionlessly, as is the case for nonbaryonic dark matter, its nonlinear evolution is
straightforward, if computationally intensive, to follow. However, the collisional dynamics of baryons leads
to interesting and complicated phenomena, including stars, planets, and black holes, and people to worry
about them.


30.22 Statistics of random fields 


30.22.1 Random field 


A basic proposition of modern cosmology, so far well-supported by observational evidence, is that fluctuations
in the Universe originated from some random process that operated in the same fashion from place to place.
In the inflationary paradigm, fluctuations originated as quantum fluctuations in the inflaton field that drove
inflation. According to this proposition, the fluctuating density $\rho(\boldsymbol{x})$ of any measurable quantity (such as
matter density, or radiation temperature) in our Universe constitutes a random field. The density $\rho(\boldsymbol{x})$ at
a randomly chosen position $\boldsymbol{x}$ constitutes a random variable with some probability distribution $P(\rho)$ of
finding the density to lie in an interval $d\rho$. By definition, the probability distribution $P(\rho)$ is positive, and
normalized to unit total probability,


$$ \int P(\rho) \, d\rho = 1 . \tag{30.152} $$


In a random field, the densities $\rho(\boldsymbol{x}_1)$ and $\rho(\boldsymbol{x}_2)$ at two different points are in general not independent,
so the 1-point probability (30.152) is not sufficient to determine completely the statistical properties of the
field. For example, since gravity causes matter to cluster, the densities at two nearby points are correlated,
not independent. For brevity, denote the density at spatial position $\boldsymbol{x}_i$ by $\rho_i$,


$$ \rho_i \equiv \rho(\boldsymbol{x}_i) . \tag{30.153} $$


The properties of the random field $\rho(\boldsymbol{x})$ are determined by an infinite set of N-point probability distributions
$P(\rho_1, ..., \rho_N)$ of finding the densities $\rho_i$ at N positions $\boldsymbol{x}_i$ to lie in an interval $d\rho_1 ... d\rho_N$. By definition,
the joint N-point probability distribution is positive, and normalized to unit total probability,


$$ \int P(\rho_1, ..., \rho_N) \, d\rho_1 ... d\rho_N = 1 . \tag{30.154} $$


By homogeneity, the N-point probability is a function only of the relative spatial positions $\boldsymbol{x}_i$, not of their
absolute positions.

The limitations of observational accessibility and accuracy mean that the true N-point probability distributions
$P(\rho_1, ..., \rho_N)$ are not known exactly. It is then necessary to make hypotheses about the form of the
probability, and to test those hypotheses against the available sampling of data.
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30.22.2 Random fields in Fourier space 


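Any linear combination $\rho'_i$ of random fields $\rho_j$ is a random field,

$$ \rho'_i = \sum_j a_{ij} \rho_j , \qquad (30.155) $$

where $a_{ij}$ are constants, and the sum over $j$ could represent an integral over a continuum. In particular, the
Fourier transform $\rho(\boldsymbol{k})$ of a random field $\rho(\boldsymbol{x})$,

$$ \rho(\boldsymbol{k}) = \int \rho(\boldsymbol{x}) \, e^{i \boldsymbol{k} \cdot \boldsymbol{x}} \, d^3x , \qquad \rho(\boldsymbol{x}) = \int \rho(\boldsymbol{k}) \, e^{- i \boldsymbol{k} \cdot \boldsymbol{x}} \, \frac{d^3k}{(2\pi)^3} , \qquad (30.156) $$

is a random field.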

The Fourier modes $\rho(\boldsymbol{k})$ of a random field are of special importance when the field is statistically homogeneous,
because Fourier modes are eigenmodes of the translation operator $\boldsymbol{\nabla}$, and the statistical properties
of a statistically homogeneous random field commute with the translation operator.
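
The eigenmode property can be checked numerically. The following is a minimal sketch, assuming a periodic one-dimensional grid and the sign convention $\rho(k) = \sum_x e^{ikx} \rho(x)$; the sample values and the helper names `dft` and `translate` are illustrative and not part of the text. Translating the field by `shift` cells multiplies each discrete Fourier mode by a pure phase, so translation does not mix modes.

```python
import cmath
import math

def dft(field):
    """Discrete analogue of rho(k) = sum_x exp(i k x) rho(x), with k = 2 pi m / n."""
    n = len(field)
    return [sum(field[x] * cmath.exp(2j * math.pi * m * x / n) for x in range(n))
            for m in range(n)]

def translate(field, shift):
    """Periodic translation: new_field[x] = field[x - shift]."""
    n = len(field)
    return [field[(x - shift) % n] for x in range(n)]

field = [0.3, 1.7, -0.4, 2.2, 0.9, -1.1, 0.5, 1.0]  # arbitrary sample values
shift = 3
n = len(field)

modes = dft(field)
shifted_modes = dft(translate(field, shift))

# Each mode picks up the phase exp(i k shift), its eigenvalue under translation.
max_error = max(
    abs(shifted_modes[m] - cmath.exp(2j * math.pi * m * shift / n) * modes[m])
    for m in range(n)
)
print(max_error)  # should be at the level of floating-point rounding
```

Because only the phase of each mode changes, quantities such as $|\rho(k)|^2$ are unaffected by the translation, which is why Fourier modes are the natural basis for a statistically homogeneous field.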


30.22.3 Gaussian random fields 


A generic prediction of inflation is that the primordial distribution of fluctuations was Gaussian, as a result
of their origin as quantum fluctuations. Whenever the values $\rho(\boldsymbol{x})$ at each point $\boldsymbol{x}$ of a random field are
generated as a sum of a large number of independent random increments, then the resulting field will be
Gaussian, as a consequence of the Central Limit Theorem. The CLT is proved for the simple case of a single
random variable $\rho$ in §30.22.5.

A Gaussian random field $\rho(\boldsymbol{x})$ is defined by the vanishing of all irreducible moments other than the first
two, the mean $\bar\rho$, and the variance $C_{ij}$,


$$ C_{ij} \equiv \langle \Delta\rho_i \, \Delta\rho_j \rangle , \qquad (30.157) $$


where $\Delta\rho_i \equiv \rho_i - \bar\rho$ is the deviation of $\rho_i$ from the mean. The mean $\bar\rho$ is a single number. The assumption
of statistical homogeneity and isotropy implies that the variance is a function $C_{ij} = C(x_{ij})$ only of the
separation $x_{ij} \equiv |\boldsymbol{x}_i - \boldsymbol{x}_j|$ of the points. The covariance $C_{ij}$ defined by equation (30.157) has dimensions of
$\rho^2$. Commonly, a dimensionless version $\xi_{ij}$ of the covariance is defined by dividing by $\bar\rho^2$.

The N-point probability distribution of a Gaussian random field is, generalizing the 1-point probability (30.176)
derived below,


$$ P(\rho_1, \dots, \rho_N) \, d\rho_1 \dots d\rho_N = \frac{1}{\sqrt{(2\pi)^N |C_{ij}|}} \exp\left( - \tfrac{1}{2} C^{-1}_{ij} \, \Delta\rho_i \, \Delta\rho_j \right) d\rho_1 \dots d\rho_N , \qquad (30.158) $$

where $|C_{ij}|$ is the determinant of the covariance matrix, $C^{-1}_{ij}$ is its inverse, and repeated indices are summed.
Any linear combination $\sum_j a_{ij} \rho_j$ of Gaussian random fields $\rho_j$ is also a Gaussian random field. In particular,
the Fourier transform $\rho(\boldsymbol{k})$ of a Gaussian random field $\rho(\boldsymbol{x})$ is a Gaussian random field.


30.22.4 Moment-generating functions


The proof of the Central Limit Theorem, §30.22.5, goes via moment-generating functions. For simplicity,
moment-generating functions are defined in this section for a single random variable $\rho$, but the results
generalize straightforwardly to a random field $\rho(\boldsymbol{x})$. The validity of the steps below requires that various
integrals over the probability distribution $P(\rho)$ converge; any required convergence properties are tacitly
assumed.

The random variable $\rho$ has a positive probability distribution $P(\rho)$ normalized to unit total probability.
The moment-generating function of the probability distribution $P(\rho)$ is defined to be


$$ M(\mu) \equiv \int e^{\mu\rho} P(\rho) \, d\rho . \qquad (30.159) $$


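Expanding the exponential in the integrand as a power series in $\mu$ implies that the moment-generating
function is


$$ M(\mu) = 1 + \langle\rho\rangle \mu + \langle\rho^2\rangle \frac{\mu^2}{2!} + \langle\rho^3\rangle \frac{\mu^3}{3!} + \cdots , \qquad (30.160) $$


where $\langle\rho^n\rangle$ is the n'th moment of the probability distribution,


$$ \langle\rho^n\rangle \equiv \int \rho^n P(\rho) \, d\rho . \qquad (30.161) $$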


Equation (30.160) accounts for the name moment-generating function. 

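Suppose that the measurement of $\rho$ is repeated N times, and suppose that each measurement is independent
of the others, meaning that the probability of measuring successive values $\rho_{(1)}, \dots, \rho_{(N)}$ is the product of
probabilities (the subscripts are parenthesized to distinguish the i'th observation $\rho_{(i)}$ from the i'th position
$\rho_i$),


$$ P(\rho_{(1)}, \dots, \rho_{(N)}) = P(\rho_{(1)}) \cdots P(\rho_{(N)}) . \qquad (30.162) $$


The moment-generating function $M_N(\mu)$ of the sum $\sum_{i=1}^N \rho_{(i)}$ of N independent measurements $\rho_{(i)}$ is then
the N'th power of the moment-generating function $M(\mu)$,


$$ \begin{aligned}
M_N(\mu) &= \int e^{\mu \sum_i \rho_{(i)}} P(\rho_{(1)}, \dots, \rho_{(N)}) \, d\rho_{(1)} \dots d\rho_{(N)} \\
&= \int e^{\mu\rho_{(1)}} P(\rho_{(1)}) \, d\rho_{(1)} \cdots \int e^{\mu\rho_{(N)}} P(\rho_{(N)}) \, d\rho_{(N)} \\
&= M(\mu)^N . \qquad (30.163)
\end{aligned} $$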


Thus the moment-generating function of a sum of independent measurements is multiplicative. The
irreducible-moment-generating function $Z(\mu)$ is defined to be the logarithm of the moment-generating function,


$$ Z(\mu) \equiv \ln[M(\mu)] . \qquad (30.164) $$


In statistical mechanics, the irreducible-moment-generating function $Z(\mu)$ is called the partition function.
Since the moment-generating function is multiplicative, the irreducible-moment-generating function $Z_N(\mu)$
of a sum $\sum_{i=1}^N \rho_{(i)}$ of N independent measurements $\rho_{(i)}$ is additive,


$$ Z_N(\mu) = N Z(\mu) . \qquad (30.165) $$
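
As a concrete check of the multiplicative and additive properties, here is a short sketch for a discrete random variable, where the integral defining the moment-generating function becomes a sum. The three-valued distribution and the function names are hypothetical, chosen only for illustration; the moment-generating function of the sum of N independent draws is computed by brute-force enumeration and compared with the N'th power of the single-draw function.

```python
import math
from itertools import product

# Hypothetical discrete distribution: values of rho and their probabilities.
values = [0.0, 1.0, 3.0]
probs = [0.5, 0.3, 0.2]

def mgf(mu):
    """M(mu) = sum over outcomes of exp(mu * rho) P(rho), the discrete form of (30.159)."""
    return sum(p * math.exp(mu * v) for v, p in zip(values, probs))

def mgf_of_sum(mu, n):
    """MGF of the sum of n independent draws, by enumerating every joint outcome."""
    total = 0.0
    for combo in product(range(len(values)), repeat=n):
        weight = math.prod(probs[i] for i in combo)  # joint probability is a product
        total += weight * math.exp(mu * sum(values[i] for i in combo))
    return total

mu, n = 0.4, 4
m_n = mgf_of_sum(mu, n)
print(m_n, mgf(mu) ** n)                      # multiplicative: M_N = M^N
print(math.log(m_n), n * math.log(mgf(mu)))   # additive: Z_N = N Z
```

The enumeration costs $3^N$ terms, which is exactly the work that the factorization into a product of single-variable sums avoids.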


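The coefficients of the series expansion of $Z(\mu)$ in $\mu$ define the irreducible moments $\kappa_n$,


$$ Z(\mu) = \mu \kappa_1 + \frac{\mu^2}{2!} \kappa_2 + \frac{\mu^3}{3!} \kappa_3 + \cdots . \qquad (30.166) $$

Unlike the moments $\langle\rho^n\rangle$, the irreducible moments $\kappa_n$ have the important property of being additive over
sums $\sum_i \rho_{(i)}$ of independent variables. The defining relation (30.164) between the irreducible $Z(\mu)$ and
standard $M(\mu)$ moment-generating functions yields the relation between the irreducible moments $\kappa_n$ and
moments $\langle\rho^n\rangle$. The relations for the first few moments are, with $\Delta\rho \equiv \rho - \bar\rho$,


$$ \kappa_1 = \bar\rho , \qquad (30.167a) $$
$$ \kappa_2 = \langle \Delta\rho^2 \rangle , \qquad (30.167b) $$
$$ \kappa_3 = \langle \Delta\rho^3 \rangle , \qquad (30.167c) $$
$$ \kappa_4 = \langle \Delta\rho^4 \rangle - 3 \langle \Delta\rho^2 \rangle^2 . \qquad (30.167d) $$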


The low order irreducible moments have names: the first, second, third, and fourth irreducible moments are
called respectively the mean, variance, skewness, and kurtosis. Some works define skewness and kurtosis as
the dimensionless combinations $\kappa_3 / \kappa_2^{3/2}$ and $\kappa_4 / \kappa_2^2$.
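
The relations between the first four irreducible moments and the ordinary moments can be exercised on a distribution whose moments are known in closed form. The sketch below assumes, purely for illustration, the unit exponential distribution $P(\rho) = e^{-\rho}$ for $\rho \ge 0$, whose raw moments are $\langle\rho^n\rangle = n!$; the function name `irreducible_moments` is hypothetical.

```python
from math import comb, factorial

def irreducible_moments(raw):
    """First four irreducible moments from raw moments raw[n] = <rho^n>, with raw[0] = 1.

    The mean is kappa_1; kappa_2 and kappa_3 are the second and third central
    moments; kappa_4 is the fourth central moment minus 3 times the variance squared.
    """
    mean = raw[1]
    # Central moments <(rho - mean)^n> from the binomial expansion.
    central = [
        sum(comb(n, k) * raw[k] * (-mean) ** (n - k) for k in range(n + 1))
        for n in range(5)
    ]
    k1 = mean
    k2 = central[2]
    k3 = central[3]
    k4 = central[4] - 3 * central[2] ** 2
    return k1, k2, k3, k4

# Unit exponential distribution: <rho^n> = n!
raw_moments = [factorial(n) for n in range(5)]
kappa = irreducible_moments(raw_moments)
print(kappa)  # (1, 1, 2, 6)
```

For this distribution the irreducible moments are $\kappa_n = (n-1)!$, so none of the higher ones vanish: a single exponential variable is far from Gaussian.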

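More generally, the irreducible-moment-generating function $Z(\mu_i)$ of a random field $\rho(\boldsymbol{x})$ is


$$ Z(\mu_i) = \mu_i \kappa_i + \frac{1}{2!} \mu_i \mu_j \kappa_{ij} + \frac{1}{3!} \mu_i \mu_j \mu_k \kappa_{ijk} + \cdots , \qquad (30.168) $$


where $\kappa_{1 \dots n} \equiv \kappa(\boldsymbol{x}_1, \dots, \boldsymbol{x}_n)$ is the n-point irreducible moment, also called the n-point correlation function.
The first few correlation functions $\kappa_{1 \dots n}$ are related to the moments $\langle \Delta\rho_1 \dots \Delta\rho_n \rangle$ of the distribution by


$$ \kappa_1 = \bar\rho , \qquad (30.169a) $$
$$ \kappa_{12} = \langle \Delta\rho_1 \Delta\rho_2 \rangle , \qquad (30.169b) $$
$$ \kappa_{123} = \langle \Delta\rho_1 \Delta\rho_2 \Delta\rho_3 \rangle , \qquad (30.169c) $$
$$ \kappa_{1234} = \langle \Delta\rho_1 \Delta\rho_2 \Delta\rho_3 \Delta\rho_4 \rangle - \langle \Delta\rho_1 \Delta\rho_2 \rangle \langle \Delta\rho_3 \Delta\rho_4 \rangle - \langle \Delta\rho_1 \Delta\rho_3 \rangle \langle \Delta\rho_2 \Delta\rho_4 \rangle - \langle \Delta\rho_1 \Delta\rho_4 \rangle \langle \Delta\rho_2 \Delta\rho_3 \rangle . \qquad (30.169d) $$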


30.22.5 Central Limit Theorem 


The Central Limit Theorem (CLT) states that the distribution of averages of N independent measurements
of a random variable is Gaussian in the limit of large N. The CLT generalizes to a random field, but for
simplicity this section confines itself to the case of a single random variable $\rho$.

As shown in §30.22.4, irreducible moments are additive over sums of independent random variables. Thus
the irreducible moment $\kappa_n$ of a sum $\sum_{i=1}^N \rho_{(i)}$ of N independent variables $\rho_{(i)}$ goes as


$$ \kappa_n \propto N . \qquad (30.170) $$


The n'th irreducible moment $\kappa_n$ has units of $\rho^n$. The shape of the probability distribution $P(\rho)$ can be
characterized by dimensionless combinations of the irreducible moments. For example, the standard deviation
$\sigma$, defined to be the square root of the variance, $\sigma \equiv \sqrt{\kappa_2} = \sqrt{\langle \Delta\rho^2 \rangle}$, has the dimension of $\rho$. The standard
deviation increases with the number N of independent measurements as $\sqrt{N}$, but the dimensionless ratio
$\sigma/\bar\rho$ of the standard deviation to the mean decreases as $1/\sqrt{N}$,


$$ \sigma = \sqrt{\kappa_2} \propto \sqrt{N} , \qquad \frac{\sigma}{\bar\rho} = \frac{\sqrt{\kappa_2}}{\kappa_1} \propto \frac{1}{\sqrt{N}} . \qquad (30.171) $$


This recovers the familiar result that the difference between the average $N^{-1} \sum_i \rho_{(i)}$ of a set of independent
measurements and the true mean $\bar\rho$ decreases as $1/\sqrt{N}$ as the number N of measurements increases.

The shape of the probability distribution beyond its first and second irreducible moments can be characterized
by the dimensionless ratio $\kappa_n^{1/n} / \kappa_2^{1/2}$ of the n'th to 2nd irreducible moments. This ratio becomes
small as the number N of independent measurements increases,


$$ \frac{\kappa_n^{1/n}}{\kappa_2^{1/2}} \propto N^{1/n - 1/2} \rightarrow 0 \quad \mbox{as } N \rightarrow \infty \quad \mbox{for } n \geq 3 . \qquad (30.172) $$


The asymptotic behaviour (30.172) is the CLT: it says that higher order irreducible moments become negli- 
gible in the limit of large N. 
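
The scaling with N can be made concrete with a sketch that assumes, for illustration only, that each measurement is drawn from the unit exponential distribution, whose irreducible moments are $\kappa_n = (n-1)!$. Since irreducible moments are additive, the sum of N draws has $\kappa_n = N (n-1)!$, and the dimensionless shape ratio can be evaluated exactly. The function name `shape_ratio` is hypothetical.

```python
from math import factorial

def shape_ratio(n, num_measurements):
    """kappa_n^(1/n) / kappa_2^(1/2) for a sum of N unit-exponential variables.

    A unit exponential has kappa_n = (n-1)!; additivity over independent
    variables gives kappa_n = N (n-1)! for the sum.
    """
    kappa_n = num_measurements * factorial(n - 1)
    kappa_2 = num_measurements * factorial(1)
    return kappa_n ** (1.0 / n) / kappa_2 ** 0.5

for big_n in (1, 100, 10_000, 1_000_000):
    print(big_n, [round(shape_ratio(n, big_n), 4) for n in (3, 4, 5)])
```

Each ratio falls as $N^{1/n - 1/2}$, slowest for $n = 3$ (as $N^{-1/6}$), so the skewness is the last departure from Gaussianity to disappear.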


30.22.6 Gaussian distribution 


A Gaussian distribution is defined by the property that its only non-vanishing irreducible moments are the
first two, the mean $\kappa_1$ and variance $\kappa_2$. The third and higher irreducible moments of a Gaussian distribution
vanish,


$$ \kappa_n = 0 \quad (n \geq 3) \quad \mbox{Gaussian distribution} . \qquad (30.173) $$


The irreducible-moment-generating function $Z(\mu)$ of a Gaussian distribution is, from equation (30.166),


$$ Z(\mu) = \mu \bar\rho + \frac{\mu^2}{2} \langle \Delta\rho^2 \rangle . \qquad (30.174) $$


Accordingly, the moment-generating function $M(\mu)$ of a Gaussian is


$$ M(\mu) = \exp\left( \mu \bar\rho + \frac{\mu^2}{2} \langle \Delta\rho^2 \rangle \right) . \qquad (30.175) $$

The probability distribution $P(\rho)$ that, when integrated in accordance with the definition (30.159), yields
the Gaussian moment-generating function (30.175), is


$$ P(\rho) \, d\rho = \frac{1}{\sqrt{2\pi \langle \Delta\rho^2 \rangle}} \exp\left[ - \frac{(\rho - \bar\rho)^2}{2 \langle \Delta\rho^2 \rangle} \right] d\rho . \qquad (30.176) $$
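
The correspondence between the Gaussian probability distribution and its moment-generating function can be verified numerically. The following sketch integrates $e^{\mu\rho} P(\rho)$ by the trapezoidal rule and compares the result with the closed form $\exp(\mu\bar\rho + \mu^2 \langle\Delta\rho^2\rangle/2)$; the parameter values, the integration range of 12 standard deviations, and the function names are illustrative assumptions.

```python
import math

def gaussian_pdf(rho, mean, variance):
    """1-point Gaussian probability density with the given mean and variance."""
    return math.exp(-(rho - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def mgf_numeric(mu, mean, variance, half_width=12.0, steps=20_000):
    """Trapezoidal estimate of M(mu) = integral of exp(mu rho) P(rho) d rho."""
    sigma = math.sqrt(variance)
    lo = mean - half_width * sigma
    hi = mean + half_width * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        rho = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # end points count half
        total += weight * math.exp(mu * rho) * gaussian_pdf(rho, mean, variance)
    return total * h

mean, variance, mu = 1.5, 0.8, 0.7   # illustrative values
numeric = mgf_numeric(mu, mean, variance)
closed_form = math.exp(mu * mean + 0.5 * mu ** 2 * variance)
print(numeric, closed_form)
```

The agreement relies on the integrand being negligible at the ends of the range, which holds as long as $\mu$ times the standard deviation is small compared with `half_width`.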


The 1-point Gaussian probability distribution (30.176) generalizes to the N-point Gaussian probability
distribution (30.158).


Non-equilibrium processes in the FLRW background


The subject of cosmological perturbations will be resumed in Chapter 32. The present Chapter is
concerned principally with an essential ingredient in the calculation of the power spectrum of CMB fluctuations,
namely recombination in the unperturbed FLRW background. Recombination presents an opportunity
to introduce the collisional Boltzmann equation, §31.5, which allows one to follow the evolution of number
densities of species out of thermodynamic equilibrium, and which will be invoked again in Chapter 33 to follow
the evolution of the photon distribution of the CMB.

In the early Universe, density and temperature were high enough that collisional processes were fast 
enough to drive particles into mutual thermodynamic equilibrium. But as the Universe expanded, density and 
temperature decreased to the point that some processes fell out of equilibrium and froze out. Recombination, 
and its inverse photoionization, constitute one example of such a process. At times well before the epoch 
of recombination, the two-body process of recombination and its inverse process photoionization drove the 
ionization state of the gas into thermodynamic equilibrium. But as recombination approached, recombination 
rates could no longer keep up, slightly delaying the epoch of recombination, and leaving a residual level of 
ionization. The residual ionization later catalyzed the formation of molecular hydrogen, leading to the first 
generation of stars. 

Besides recombination, there are some other processes of freeze-out in the expanding Universe that are 
associated with well-understood physics. (1) The weak interactions froze out after electron-positron anni- 
hilation, so that protons and neutrons could no longer interconvert, causing the neutron-to-proton ratio to 
freeze out. The frozen neutron-to-proton ratio subsequently determined the primordial abundance of helium 
to hydrogen. (2) Nuclear reactions froze out, causing primordial nucleosynthesis to cease at the light elements 
H, D ($\equiv {}^2$H), $^3$He, $^4$He, and Li, rather than proceeding all the way to the most tightly bound nucleus, iron.
This is well and good, since if nucleosynthesis had proceeded to completion, there would be no stars, and no 
people. 

Yet other processes of freeze-out probably occurred, but their physics is poorly understood, so only guesses 
and estimates can be made. (1) Our Universe shows an excess of matter (protons, neutrons, electrons) over 
antimatter (antiprotons, antineutrons, positrons). For this asymmetry to occur, there must have been some 
T-violating process that preferred the creation of matter over antimatter, and that process must have frozen 
out. (2) A leading candidate for the non-baryonic cold dark matter is a weakly-interacting massive particle
(WIMP). In order that the mass density of WIMPs be as observed today, their number density must be
much less than that of relativistic particles (photons). If WIMPs were initially in thermodynamic equilibrium 
at some relativistic temperature, then the WIMPs must have annihilated with their antiparticles as they 
became non-relativistic; moreover, that annihilation must have frozen out so as to leave the remnant density
observed today. To achieve this outcome, the WIMP annihilation cross-section must be comparable to a 
weak-interaction cross-section, which explains the popularity of the WIMP proposal. As of writing (2015), 
laboratory attempts to detect WIMPs experimentally have led only to upper limits. 


31.1 Conditions around the epoch of recombination 


Two key quantities around the time of recombination were the photon temperature $T$ and the baryon number
density $n_b$. Because the baryon-to-photon ratio $n_b/n_\gamma \sim 10^{-9}$, equation (10.103), was so small, the photon
distribution was essentially unaffected by the baryons. Photons remained in thermodynamic equilibrium at
a temperature $T$ that evolved with cosmic scale factor $a$ (normalized to $a_0 = 1$) as


$$ T = \frac{T_0}{a} , \qquad (31.1) $$


where $T_0 = 2.725$ K is the CMB temperature today. Equation (31.1) held from after electron-positron annihilation
at $T \sim 1$ MeV down to the present time. The baryon number density $n_b$ was (again normalized to
$a_0 = 1$)


$$ n_b = \frac{3 \Omega_b H_0^2}{8 \pi G m_b a^3} , \qquad (31.2) $$


where $m_b = 939$ MeV was the approximate mean mass per baryon.
The electron fraction $X_e$ may be defined to be the ratio of the electron density $n_e$ to the nuclear proton
density $n_+$, including all protons in all nuclei,

$$ X_e \equiv \frac{n_e}{n_+} . \qquad (31.3) $$

The definition (31.3) is chosen so that $X_e = 1$ when the plasma is fully ionized. The nuclear proton density
$n_+$ is

$$ n_+ = f_+ n_b , \qquad (31.4) $$

where $f_+ \equiv n_+/n_b$ is the proton fraction. To a good approximation, baryons comprised H and $^4$He nuclei,
and $f_+ = 0.875$, Exercise 31.1.


Exercise 31.1. Proton and neutron fractions. Define the proton and neutron fractions $f_+$ and $f_n$ by
the proton- and neutron-to-baryon ratios


$$ f_+ \equiv \frac{n_+}{n_b} , \qquad f_n \equiv \frac{n_n}{n_b} . \qquad (31.5) $$

Here $n_+$ and $n_n$ are the number densities of protons and neutrons in all nuclei. The baryon number density
is their sum $n_b = n_+ + n_n$. For a H plus $^4$He composition, the nuclear proton and neutron number densities
are


$$ n_+ = n_{\rm H} + 2 n_{\rm ^4He} , \qquad n_n = 2 n_{\rm ^4He} . \qquad (31.6) $$

Show that the primordial $^4$He mass fraction defined by $Y_{\rm ^4He} \equiv \rho_{\rm ^4He} / (\rho_{\rm H} + \rho_{\rm ^4He})$ satisfies


$$ Y_{\rm ^4He} = 2 f_n . \qquad (31.7) $$


The observed primordial $^4$He abundance is $Y_{\rm ^4He} = 0.245 \pm 0.004$ (Cyburt et al., 2016), implying


$$ f_n = 0.1225 , \qquad f_+ = 1 - f_n = 0.8775 . \qquad (31.8) $$
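
The arithmetic of the exercise can be checked with a few lines. This sketch assumes that a $^4$He nucleus weighs exactly four hydrogen masses (nuclear binding energy and the neutron-proton mass difference are neglected); the number densities and function names are hypothetical, chosen only to exercise the relations between the fractions.

```python
def nucleon_fractions(y_he):
    """Neutron and nuclear-proton fractions for a pure H + 4He mix, using Y = 2 f_n."""
    f_n = y_he / 2.0
    f_p = 1.0 - f_n
    return f_n, f_p

def helium_mass_fraction(n_h, n_he):
    """Y from number densities, assuming m(4He) = 4 m(H)."""
    return 4.0 * n_he / (n_h + 4.0 * n_he)

# Consistency check for arbitrary illustrative number densities.
n_h, n_he = 12.0, 1.0
n_p = n_h + 2.0 * n_he          # nuclear protons: one per H, two per 4He
n_n = 2.0 * n_he                # neutrons: two per 4He
f_n_direct = n_n / (n_p + n_n)
y = helium_mass_fraction(n_h, n_he)
print(f_n_direct, y / 2.0)      # the two agree

print(nucleon_fractions(0.245))
```

With the observed helium abundance the function returns the neutron and proton fractions quoted at the end of the exercise.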


31.2 Overview of recombination 


The classic paper on cosmological recombination is Peebles (1968). 

The ionization state of the Universe around the time of recombination was determined largely by hy- 
drogen, the most abundant element. Recombination of hydrogen is a two-body process whose inverse is 
photoionization, 


$$ p + e \; \underset{\rm photoionization}{\overset{\rm recombination}{\rightleftarrows}} \; {\rm H} + \gamma . \qquad (31.9) $$

Helium, the next most abundant element, was largely neutral by the time of recombination; its effect on 
recombination was quite small. 

At times well before recombination, the ionization state of the baryonic gas was close to thermodynamic
equilibrium. At the temperatures of relevance, electrons and nuclei were non-relativistic, and their occupation
numbers $f$, given in thermodynamic equilibrium by equations (10.124), were much less than 1. The
occupation numbers were small in part because the asymmetry between matter (protons, neutrons, electrons)
and antimatter (antiprotons, antineutrons, positrons) is quite small, about $10^{-9}$ baryons per CMB
photon, equation (10.103). Early in the Universe when the temperature exceeded their rest-mass energy,
particles and antiparticles in thermodynamic equilibrium had number densities comparable to photons (with
a factor of $\tfrac{3}{4}$ in the number density of fermions relative to bosons, equation (10.140)). Because of the small
matter-antimatter asymmetry, the number densities of particles and antiparticles were almost equal, so their
chemical potentials were almost zero, Exercise 10.17. Relativistic fermions in thermodynamic equilibrium
had occupation numbers of order unity for energies less than of order the temperature, $f = 1/(e^{E/T} + 1) \sim 1$
for $E \lesssim T$. As matter particles annihilated with their antiparticles, their occupation number fell to $\sim 10^{-9}$,
Figure 10.16. As the Universe continued to expand, the occupation number of the now non-relativistic particles,
still in thermodynamic equilibrium with photons, fell further as $f \sim n T^{-3/2} \propto T^{3/2}$, equation (31.15).
Thus the occupation numbers of non-relativistic electrons and nuclei were

$$ f \sim 10^{-9} \left( \frac{T}{m} \right)^{3/2} \ll 1 \qquad (31.10) $$


for particle kinetic energies $p^2/(2m)$ less than of order the temperature $T$.

Because of the low occupation number, hydrogen remained ionized down to a much lower temperature,
$T \sim 0.3$ eV $\sim$ 3,000 K, than the ionization energy 13.6 eV of hydrogen.

The temperature $T \sim 0.3$ eV of recombination was much lower than the difference $E_1 - E_2 \approx 10.2$ eV
between the ground $n = 1$ and first excited $n = 2$ energy levels of hydrogen. Consequently the Boltzmann
factor strongly favoured the ground state, so that near recombination almost all the hydrogen atoms were
in their ground states, equation (31.20). The recombination temperature $T \sim 0.3$ eV was also significantly
lower than the difference $E_2 - E_3 \approx 1.9$ eV between first $n = 2$ and second $n = 3$ excited energy levels of
hydrogen, so the population of $n = 2$ substantially outnumbered higher excited states, equation (31.21). To
a good approximation, recombination involved only the first two energy levels $n = 1$ and 2 of hydrogen.

As the density and temperature decreased because of adiabatic expansion, recombination could no longer 
keep up. The large density of hydrogen atoms in the ground state meant that Lyman transitions, transi- 
tions between the ground state and other states, were optically thick. Any radiative decay to the ground 
state produced a Lyman line or continuum photon that was quickly absorbed by a nearby hydrogen atom. 
Recombination to the ground state was inhibited. The bottleneck caused the n = 2 energy level to become 
overpopulated relative to the ground state, compared to thermodynamic equilibrium. 

Recombination nevertheless proceeded via two slow processes, one from the 2p level, the other from the 2s 
level of hydrogen. The first process is that, as the Universe expands, the Lyman $\alpha$ (2p to 1s) transition redshifts,
and there is a finite probability for the photon to redshift out of the line without being absorbed. The second 
process is that the 2s level can decay by a forbidden 2-photon transition. A possible third process, collisional 
deexcitation of excited levels to the ground state, was slower than either of the first two. 


31.3 Energy levels and ionization state in thermodynamic equilibrium 


Electrons and nuclei near recombination were non-relativistic, and their occupation numbers were small, and 
therefore well described by Boltzmann statistics, with occupation number f given by equation (10.125). 


31.3.1 Number density of non-relativistic Boltzmann species in thermodynamic equilibrium


The energy $E$ of a non-relativistic particle of mass $m$ is related to its momentum $p$ by $E = m + p^2/(2m)$.
For a hydrogen atom in energy level $n$, the rest mass $m$ is less than the rest mass $m_p$ of a proton by the
binding energy $E_n$ of the atom, $m = m_p - E_n$. In thermodynamic equilibrium, the number density $n$ of a
non-relativistic Boltzmann species is


$$ n = \int f \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} = e^{(\mu - m)/T} \int e^{-p^2/(2mT)} \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} . \qquad (31.11) $$

The integral on the right hand side of equation (31.11) is

$$ \int e^{-p^2/(2mT)} \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} = g \left( \frac{m T}{2\pi\hbar^2} \right)^{3/2} , \qquad (31.12) $$

so the number density in thermodynamic equilibrium is

$$ n = g \left( \frac{m T}{2\pi\hbar^2} \right)^{3/2} e^{(\mu - m)/T} . \qquad (31.13) $$


The factor $(m T / (2\pi\hbar^2))^{3/2}$ defines a length scale $\lambda_T$ which is a characteristic thermal Compton wavelength
of the particles,


$$ \lambda_T \equiv \left( \frac{m T}{2\pi\hbar^2} \right)^{-1/2} . \qquad (31.14) $$

In terms of their number density $n$, the occupation number $f = e^{(\mu - E)/T}$ of a Boltzmann species is

$$ f = \frac{n}{g} \left( \frac{m T}{2\pi\hbar^2} \right)^{-3/2} e^{-p^2/(2mT)} = \frac{n \lambda_T^3}{g} \, e^{-p^2/(2mT)} . \qquad (31.15) $$


The condition for the validity of the Boltzmann approximation of small occupation numbers is that there be
few particles per Compton volume, $n \lambda_T^3 \ll 1$.
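
To see how comfortably the condition is met near recombination, the sketch below evaluates the thermal wavelength and the number of electrons per thermal volume in SI units, restoring Boltzmann's constant. The present-day baryon density of about 0.25 per cubic metre, the choice $T = 3000$ K, and the assumption that every electron is free are illustrative inputs that are not taken from the text.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck constant h, J s
K_BOLTZMANN = 1.380649e-23     # Boltzmann constant, J / K
M_ELECTRON = 9.1093837e-31     # electron mass, kg

def thermal_wavelength(mass_kg, temperature_k):
    """lambda_T = (m k T / (2 pi hbar^2))^(-1/2) = h / sqrt(2 pi m k T), in metres."""
    return H_PLANCK / math.sqrt(2.0 * math.pi * mass_kg * K_BOLTZMANN * temperature_k)

# Assumed inputs: baryon density today ~0.25 m^-3, scaled as a^-3 with a = T0 / T,
# and one free electron per baryon.
T0 = 2.725
temperature = 3000.0
n_electron = 0.25 * (temperature / T0) ** 3

lam = thermal_wavelength(M_ELECTRON, temperature)
degeneracy = n_electron * lam ** 3
print(lam, degeneracy)
```

The electron thermal wavelength comes out near a nanometre, and the number of electrons per thermal volume is smaller than $10^{-15}$, so Boltzmann statistics is an excellent approximation.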


31.3.2 Level populations of hydrogen in thermodynamic equilibrium 


Bound eigenstates of hydrogen are characterized by quantum numbers $n$, $l$, and $m$ associated with their
energy, total angular momentum, and projection of the angular momentum along an arbitrary direction.
Ignoring the small corrections to energy levels arising from relativistic and spin effects, the energies of the
bound eigenstates of hydrogen are


$$ - E_n = - 13.6 \, {\rm eV} / n^2 , \qquad (31.16) $$


with $n = 1, \dots, \infty$ an integer running from the ground state 1 to the continuum $\infty$. Within each energy level
$n$, the total angular momentum $l$ runs over $n$ integers $l = 0, \dots, n-1$. Within each angular momentum level
$l$ the “magnetic” quantum number $m$ runs over $2l + 1$ integers $m = -l, \dots, l$. Altogether, each hydrogenic
energy level $n$ contains $4n^2$ individual states, comprising 2 spin states of the nuclear proton, 2 spin states of
the electron, and $\sum_{l=0}^{n-1} (2l + 1) = n^2$ states of orbital angular momentum.

In thermodynamic equilibrium, the number density $n_{nl}$ in level $nl$ of hydrogen relative to the number
density $n_{1s}$ in the ground level 1s is, from equation (31.13),


$$ \frac{n_{nl}}{n_{1s}} = (2l + 1) \, e^{(E_n - E_1)/T} . \qquad (31.17) $$

Figure 31.1 Hydrogen and helium ion fractions in thermodynamic equilibrium as a function of cosmic scale factor $a$
scaled to $a_0 = 1$. The total hydrogen and helium fractions are $X_{\rm H} = 1 - f_n/f_+ = 0.86$ and $X_{\rm He} = \tfrac{1}{2} f_n/f_+ = 0.07$
where $f_+ = 1 - f_n = 0.875$ and $f_n = n_n/n_b = 0.125$ are the proton- and neutron-to-baryon ratios, Exercise 31.1.
The dashed vertical line indicates where recombination actually occurs (where the Thomson scattering optical depth
is unity), somewhat later than predicted by equilibrium.


31.3.3 Ionization state in thermodynamic equilibrium 


In thermodynamic equilibrium, the chemical potentials of protons, electrons, and neutral hydrogen atoms
are related by $\mu_p + \mu_e = \mu_{\rm H}$, equation (10.127). Inserting this equilibrium condition into equation (31.13),
valid for non-relativistic Boltzmann species, implies the relation between the number densities $n_p$, $n_e$, and
$n_{nl}$ of protons, electrons, and hydrogen atoms in level $nl$,


$$ \frac{n_p n_e}{n_{nl}} = \frac{g_p g_e}{g_{nl}} \left( \frac{m_e T}{2\pi\hbar^2} \right)^{3/2} e^{-E_n/T} . \qquad (31.18) $$


Equation (31.18) is the Saha equation for hydrogen. The $m_e$ on the right hand side of equation (31.18)
is strictly $m_p m_e / m_{nl}$ where $m_{nl}$ is the mass of the hydrogen atom in level $nl$, but $m_p \approx m_{nl}$ to a good
approximation.

More generally, the Saha equation relating the number densities of an ion X to the next-ionized ion X$^+$ is


$$ \frac{n_{X^+} n_e}{n_X} = \frac{g_{X^+} g_e}{g_X} \left( \frac{m_e T}{2\pi\hbar^2} \right)^{3/2} e^{-E_X/T} . \qquad (31.19) $$

Figure 31.1 illustrates the ionization fractions of H and $^4$He in thermodynamic equilibrium at the photon
temperature $T$ and baryon density $n_b$ given by equations (31.1) and (31.2).

Exercise 31.2. Level populations of hydrogen near recombination. Use the approximation of thermodynamic
equilibrium to estimate the relative number densities of states of hydrogen near recombination,
where $T \sim 0.3$ eV.

Solution. From equation (31.17), the ratio of excited $n = 2$ to ground $n = 1$ levels in thermodynamic
equilibrium is


$$ n_{2p} = 3 \, n_{2s} \approx 3 \, e^{(\frac{1}{4} - 1) \, 13.6 \, {\rm eV} / 0.3 \, {\rm eV}} \, n_{1s} \approx 10^{-14} \, n_{1s} , \qquad (31.20) $$


which is tiny. The equilibrium ratio depends steeply on the temperature, which is one reason why recombination
cannot keep up as the temperature falls. Similarly, the ratio of the population of the second $n = 3$ to
first $n = 2$ excited states is


$$ n_3 \approx \tfrac{9}{4} \, e^{(\frac{1}{9} - \frac{1}{4}) \, 13.6 \, {\rm eV} / 0.3 \, {\rm eV}} \, n_2 \approx 4 \times 10^{-3} \, n_2 , \qquad (31.21) $$


which is also small. Thus the ground state $n = 1$ dominates the level population, followed by the first excited
states $n = 2$,


$$ n_1 \gg n_2 \gg n_{n \geq 3} . \qquad (31.22) $$
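
The level-population estimates of Exercise 31.2 can be reproduced directly from the Boltzmann factor. The sketch below compares the total populations of whole levels, so the statistical weight of level $n$ is taken to be its $n^2$ orbital states, with the spin weights cancelling in the ratio; the function names are hypothetical and the temperature $T = 0.3$ eV is the rough value quoted in the exercise.

```python
import math

RYDBERG_EV = 13.6   # hydrogen ground-state binding energy, eV

def binding_energy(n):
    """Binding energy 13.6 eV / n^2 of level n."""
    return RYDBERG_EV / n ** 2

def level_ratio(n_upper, n_lower, temperature_ev):
    """Equilibrium ratio of the total populations of two hydrogen levels.

    Each level n holds n^2 orbital states, so the ratio is
    (n_upper / n_lower)^2 * exp((E_upper - E_lower) / T), with E the binding energy.
    """
    weight = (n_upper / n_lower) ** 2
    exponent = (binding_energy(n_upper) - binding_energy(n_lower)) / temperature_ev
    return weight * math.exp(exponent)

T = 0.3  # eV
ratio_21 = level_ratio(2, 1, T)
ratio_32 = level_ratio(3, 2, T)
print(ratio_21)   # n_2 / n_1, of order 1e-14
print(ratio_32)   # n_3 / n_2, about 4e-3
```

Because the exponent for the first ratio is about $-34$, a modest change in temperature changes the ratio by orders of magnitude, which illustrates the steep temperature dependence noted in the solution.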


Exercise 31.3. Ionization state of hydrogen near recombination. Use the approximation of thermodynamic
equilibrium to estimate the temperature at which hydrogen recombines.

Solution. Almost all the hydrogen atoms are in their ground states. In the approximation that all hydrogen
atoms are in their ground state 1s, the Saha equation (31.18) implies


$$ \frac{n_p n_e}{n_{1s}} = \left( \frac{m_e T}{2\pi\hbar^2} \right)^{3/2} e^{-E_1/T} , \qquad (31.23) $$


the statistical weight factor cancelling, $g_{1s} = g_p g_e = 4$. In the approximation of a pure hydrogen gas, in
which case the nuclear proton density equals the baryon density, $n_+ = n_b$, the Saha equation (31.23) is


$$ \frac{X_e^2}{1 - X_e} = \frac{1}{n_b} \left( \frac{m_e T}{2\pi\hbar^2} \right)^{3/2} e^{-E_1/T} = \frac{2^{3/2} G m_b T_0^3}{3 \pi^{1/2} \hbar^3 \Omega_b H_0^2} \left( \frac{m_e}{T} \right)^{3/2} e^{-E_1/T} , \qquad (31.24) $$


where $X_e$ is the electron fraction, equation (31.3), and $n_b$ the baryon density, equation (31.2). Recombination
occurs at $X_e = \tfrac{1}{2}$. Equation (31.24) is then an implicit equation for the temperature $T$. It can be solved
iteratively by guessing an initial $T$, and calculating an improved value from


$$ \frac{E_1}{T} = \ln \left[ \frac{2^{5/2} G m_b T_0^3}{3 \pi^{1/2} \hbar^3 \Omega_b H_0^2} \left( \frac{m_e}{T} \right)^{3/2} \right] . \qquad (31.25) $$

Guessing $T = 10^4$ K yields

$$ \frac{E_1}{T} \approx 40 , \qquad (31.26) $$


which gives the estimated recombination temperature of $T \approx 4{,}000$ K. Iterating a second time gives


$$ T \approx 3{,}800 \, {\rm K} . \qquad (31.27) $$

Concept question 31.4. Atomic structure notation. An eigenstate with $l = 0$ is denoted s, while one
with $l = 1$ is denoted p. Why? Answer. For historical reasons. In atomic spectroscopy, angular momentum
levels $l = 0, 1, 2, 3, 4, \dots$ are conventionally denoted s, p, d, f, g, ..., the first 4 letters standing for sharp,
principal, diffuse, and fundamental. After fundamental f, the labelling is alphabetical.
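
Returning to Exercise 31.3, the iterative solution of the Saha equation for the recombination temperature can be sketched in a few lines. The code works in SI units and assumes a baryon density parameter $\Omega_b h^2 = 0.022$, which is not given in this section; the function names are hypothetical. Each pass evaluates the logarithm at the current temperature and returns an improved one.

```python
import math

# SI constants
HBAR = 1.054571817e-34       # J s
K_B = 1.380649e-23           # J / K
G_NEWTON = 6.67430e-11       # m^3 kg^-1 s^-2
EV = 1.602176634e-19         # J
M_ELECTRON = 9.1093837e-31   # kg
M_BARYON = 939.0e6 * EV / (2.99792458e8) ** 2   # 939 MeV mean mass per baryon, kg

# Assumed cosmological inputs (not given in this section).
OMEGA_B_H2 = 0.022
H100 = 100.0e3 / 3.0856775814913673e22   # 100 km/s/Mpc expressed in s^-1
T0 = 2.725                               # CMB temperature today, K
E1_KELVIN = 13.6 * EV / K_B              # hydrogen ionization energy as a temperature

def baryon_density(temperature):
    """Baryon number density in m^-3, scaled as a^-3 with a = T0 / T."""
    n_b0 = 3.0 * OMEGA_B_H2 * H100 ** 2 / (8.0 * math.pi * G_NEWTON * M_BARYON)
    return n_b0 * (temperature / T0) ** 3

def improved_temperature(temperature, x_e=0.5):
    """One iteration: solve the Saha equation for E_1/T with the log evaluated at T."""
    thermal = (M_ELECTRON * K_B * temperature / (2.0 * math.pi * HBAR ** 2)) ** 1.5
    rhs = thermal / baryon_density(temperature) * (1.0 - x_e) / x_e ** 2
    return E1_KELVIN / math.log(rhs)

temperature = 1.0e4
history = []
for _ in range(6):
    temperature = improved_temperature(temperature)
    history.append(temperature)
print([round(t) for t in history])
```

The iteration converges quickly because the temperature enters the right hand side only through a logarithm: an error in the guess is reduced by a factor of roughly $1.5 / (E_1/T)$, a few percent, at each pass.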


31.4 Occupation numbers 


Occupation number was discussed previously in §10.26. 

Each species of energy-momentum is described by a dimensionless occupation number, or phase-space
probability distribution, a function $f(t, \boldsymbol{x}, \boldsymbol{p})$ of time $t$, comoving position $\boldsymbol{x}$, and tetrad-frame momentum
$\boldsymbol{p}$, which describes the number $dN$ of particles in a tetrad-frame element $d^3r \, d^3p / (2\pi\hbar)^3$ of phase-space,


$$ dN(t, \boldsymbol{x}, \boldsymbol{p}) = f(t, \boldsymbol{x}, \boldsymbol{p}) \, \frac{g \, d^3r \, d^3p}{(2\pi\hbar)^3} , \qquad (31.28) $$

with $g$ being the number of spin states of the particle. The tetrad-frame phase-space element $d^3r \, d^3p / (2\pi\hbar)^3$
is dimensionless and Lorentz-invariant, and the occupation number $f$ is likewise dimensionless and
Lorentz-invariant. The tetrad-frame energy-momentum 4-vector $p^m$ of a particle is

$$ p^m \equiv e^m{}_\mu \frac{dx^\mu}{d\lambda} = \{ E, \boldsymbol{p} \} = \{ E, p^a \} , \qquad (31.29) $$

where $\lambda$ is the affine parameter, related to proper time $\tau$ along the worldline of the particle by $d\lambda \equiv
d\tau/m$, which remains well-defined in the limit of massless particles, $m = 0$. The tetrad-frame energy $E$ and
momentum $p \equiv |\boldsymbol{p}|$ for a particle of rest mass $m$ are related by


$$ E^2 - p^2 = m^2 . \qquad (31.30) $$

31.5 Boltzmann equation 


The detailed evolution of the abundance of any species can be followed using the Boltzmann equation. 
The Boltzmann equation splits the evolution of the occupation number f of a species into a collisionless part 
in which each particle evolves as a test particle in the background geometry, and a collisional part in which 
particles are destroyed or created as a result of collisions with other particles. 

Collisionless evolution is described by the single-particle distribution function, the occupation number $f$.
Because phase-space volume is conserved as the system evolves, §4.22.1, conservation of particle number
along the paths of particles, $dN/d\lambda = 0$, is equivalent to conservation of the occupation number $f$ defined
by equation (31.28),

$$ \frac{df}{d\lambda} = 0 . \qquad (31.31) $$

Equation (31.31) is the collisionless Boltzmann equation. The derivative with respect to affine parameter
$\lambda$ on the left hand side of the Boltzmann equation (31.31) is a Lagrangian derivative along the (timelike or
lightlike) worldline of a particle in the fluid.

The collisionless Boltzmann equation holds without modification for particles that do not collide, such as
neutrinos or non-baryonic dark matter particles, but it fails for particles whose trajectories are substantially
modified by collisions with other particles, such as photons or baryons. Collisions are both a sink and a source
of particles, destroying particles of momentum $\boldsymbol{p}$ and creating others of momentum $\boldsymbol{p}'$ in the single-particle
distribution $f$. The effect of collisions is modelled by introducing a collision term, schematically written
$C[f]$, containing both sinks and sources,


$$ \frac{df}{d\lambda} = C[f] . \qquad (31.32) $$

Equation (31.32) is the collisional Boltzmann equation. Since $f$ is dimensionless while the affine parameter
$d\lambda \equiv d\tau/m$ has units of time/mass, the units of the collision term $C[f]$ are mass/time.


31.5.1 Boltzmann equation in the FLRW geometry 


In the FLRW geometry, homogeneity and isotropy imply that the occupation number is a function $f(t, p)$ only
of cosmic time $t$ and of the magnitude $p$ of the proper momentum. The collisional Boltzmann equation (31.32)
is then


$$ \frac{df}{d\lambda} = \frac{dt}{d\lambda} \frac{\partial f}{\partial t} + \frac{dp}{d\lambda} \frac{\partial f}{\partial p} = C[f] . \qquad (31.33) $$


To follow lots of particles simultaneously, switch the integration variable from the affine parameter $\lambda$, which
is particle-dependent, to cosmic time $t$, which is the same for all. With cosmic time $t$ as the integration
variable, the only non-vanishing vierbein coefficient that depends on $t$ in the background FLRW geometry
is $e_0{}^t = 1$. The relation between cosmic time $t$ and affine parameter $\lambda$ is


$$ \frac{dt}{d\lambda} = e_m{}^t \, p^m = e_0{}^t \, p^0 = E , \qquad (31.34) $$


where $E \equiv p^0$ is the proper energy of the particle in the tetrad rest-frame. It would be equally possible to
use conformal time $\eta$ as the integration variable, as will be done later in §33.2, in which case $e_0{}^\eta = 1/a$
and $d\eta/d\lambda = E/a$; for the present purpose however, cosmic time $t$ is slightly more convenient. As found in
Exercise 10.5, the proper momentum of a particle, massless or massive, redshifts as $p \propto 1/a$, so $d\ln p/dt =
- d\ln a/dt$. Thus the Boltzmann equation (31.33) is


$$ \frac{df}{dt} = \frac{\partial f}{\partial t} - \frac{d\ln a}{dt} \frac{\partial f}{\partial \ln p} = \frac{1}{E} \, C[f] . \qquad (31.35) $$

The proper number density $n$ is an integral (10.120) of the occupation number $f$ over momenta. Integrating
the left hand side of the Boltzmann equation (31.35) gives


$$ \begin{aligned}
\int \frac{df}{dt} \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3}
&= \int \frac{\partial f}{\partial t} \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} - \frac{d\ln a}{dt} \int \frac{\partial f}{\partial \ln p} \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} \\
&= \frac{\partial}{\partial t} \int f \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} - \frac{d\ln a}{dt} \left[ f \, \frac{g \, 4\pi p^3}{(2\pi\hbar)^3} \right]_0^\infty + 3 \, \frac{d\ln a}{dt} \int f \, \frac{g \, 4\pi p^2 dp}{(2\pi\hbar)^3} \\
&= \frac{dn}{dt} + 3 n \frac{d\ln a}{dt} = \frac{1}{a^3} \frac{d(n a^3)}{dt} . \qquad (31.36)
\end{aligned} $$

Integrated over momenta, the collisional Boltzmann equation (31.35) is thus

$$ \frac{1}{a^3} \frac{d(n a^3)}{dt} = \int C[f] \, \frac{g \, 4\pi p^2 dp}{E (2\pi\hbar)^3} . \qquad (31.37) $$


Equation (31.37) holds for both massive and massless particles. In the absence of collisions, $C[f] = 0$, the
integrated Boltzmann equation (31.37) shows that proper number density $n$ decreases as $a^{-3}$,

$$ n \propto a^{-3} . \qquad (31.38) $$


Equation (31.38) says that the number $n a^3$ of particles in a comoving volume remains constant in the absence
of collisions that destroy or create particles.
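
A numerical illustration of the collisionless case: the sketch below integrates $dn/dt = -3 \, (d\ln a/dt) \, n$ with a fourth-order Runge-Kutta scheme and checks that $n a^3$ stays constant. It assumes, purely for illustration, a matter-dominated background with $a(t) = t^{2/3}$ in arbitrary units; the function names are hypothetical.

```python
def hubble_rate(t):
    """d ln a / dt for an assumed matter-dominated background, a(t) = t^(2/3)."""
    return 2.0 / (3.0 * t)

def scale_factor(t):
    """Scale factor of the assumed background, in arbitrary units."""
    return t ** (2.0 / 3.0)

def evolve_number_density(n_start, t_start, t_end, steps=10_000):
    """Integrate dn/dt = -3 (d ln a/dt) n, the collisionless limit, by RK4."""
    def rhs(t, n):
        return -3.0 * hubble_rate(t) * n

    h = (t_end - t_start) / steps
    t, n = t_start, n_start
    for _ in range(steps):
        k1 = rhs(t, n)
        k2 = rhs(t + 0.5 * h, n + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, n + 0.5 * h * k2)
        k4 = rhs(t + h, n + h * k3)
        n += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return n

n_start, t_start, t_end = 1.0, 1.0, 8.0
n_end = evolve_number_density(n_start, t_start, t_end)
comoving_start = n_start * scale_factor(t_start) ** 3
comoving_end = n_end * scale_factor(t_end) ** 3
print(comoving_start, comoving_end)   # the comoving number is conserved
```

Adding a non-zero right hand side to `rhs` would model a collision term that creates or destroys particles, in which case $n a^3$ would no longer be constant.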


31.6 Collisions 


For a 2-body collision of the form

$$ 1 + 2 \leftrightarrow 3 + 4 , \qquad (31.39) $$


the rate per unit time and volume at which particles of type 1 leave and enter an interval $d^3p_1$ of momentum
space is, in units $c = \hbar = 1$,


$$ \begin{aligned}
C[f_1] \, \frac{g_1 \, d^3p_1}{E_1 (2\pi)^3} = \int \langle |\mathcal{M}|^2 \rangle
& \left[ - f_1 f_2 (1 \mp f_3)(1 \mp f_4) + f_3 f_4 (1 \mp f_1)(1 \mp f_2) \right]
(2\pi)^4 \delta_D^4(p_1 + p_2 - p_3 - p_4) \\
& \times \frac{g_1 \, d^3p_1}{2 E_1 (2\pi)^3} \, \frac{g_2 \, d^3p_2}{2 E_2 (2\pi)^3} \, \frac{d^3p_3}{2 E_3 (2\pi)^3} \, \frac{d^3p_4}{2 E_4 (2\pi)^3} . \qquad (31.40)
\end{aligned} $$

All factors in equation (31.40) are Lorentz scalars. On the left hand side, the collision term $C[f_1]$ and the
momentum 3-volume element $d^3p_1/E_1$ are both Lorentz scalars. On the right hand side, the mean amplitude
squared $\langle |\mathcal{M}|^2 \rangle$, the various occupation numbers $f_i$, the energy-momentum conserving 4-dimensional Dirac
delta-function $\delta_D^4(p_1 + p_2 - p_3 - p_4)$, and each of the four momentum 3-volume elements $d^3p_i/(2E_i)$, are
all Lorentz scalars. The factor of 1/2 in each momentum element $d^3p_i/(2E_i)$ has its roots in quantum field
theory, where it serves to normalize propagators of quanta correctly, equation (??).

The first ingredient in the integrand on the right hand side of the expression (31.40) is the Lorentz-invariant
mean scattering amplitude squared $\langle|\mathcal{M}|^2\rangle$, calculated using quantum field theory, §??. By convention (see
for example equation (??) in §??), the mean amplitude squared $\langle|\mathcal{M}|^2\rangle$ represents a rate averaged over initial
spin states and summed over final spin states,

$$ \langle|\mathcal{M}|^2\rangle \equiv \frac{1}{g_1 g_2} \sum_{\rm spins} |\mathcal{M}|^2 , \tag{31.41} $$

so that the mean amplitude squared represents a rate per incoming spin state. To convert the mean amplitude
squared to a net rate per unit time and volume, it is necessary to sum over particles in the initial states,
which explains why equation (31.40) includes spin factors $g_1$ and $g_2$ in the integral over initial momenta. The
average-over-incoming spins factor $1/(g_1 g_2)$ in the mean amplitude squared cancels the sum-over-incoming
spins factor $g_1 g_2$ in the integral (31.40). The convention to average over initial states when in the end they
must be summed over may seem strange, but then so are many conventions. For a process involving 4
particles such as (31.39), the mean amplitude squared $\langle|\mathcal{M}|^2\rangle$ is dimensionless (in units $c = \hbar = 1$), but it is
not dimensionless in general, equation (??).

The second ingredient in the integrand on the right hand side of expression (31.40) is the combination of
rate factors

$$ {\rm rate}(1 + 2 \rightarrow 3 + 4) \propto f_1 f_2 (1 \mp f_3)(1 \mp f_4) , \tag{31.42a} $$
$$ {\rm rate}(1 + 2 \leftarrow 3 + 4) \propto f_3 f_4 (1 \mp f_1)(1 \mp f_2) , \tag{31.42b} $$

where the $1 \mp f$ factors are blocking or stimulation factors, the choice of $\mp$ sign depending on whether the
species in question is fermionic or bosonic:

$$ 1 - f = \mbox{Fermi-Dirac blocking factor} , \tag{31.43a} $$
$$ 1 + f = \mbox{Bose-Einstein stimulation factor} . \tag{31.43b} $$

The first rate factor (31.42a) expresses the fact that the rate to lose particles from $1 + 2 \rightarrow 3 + 4$ collisions
is proportional to the occupancy $f_1 f_2$ of the initial states, modulated by the blocking/stimulation factors
$(1 \mp f_3)(1 \mp f_4)$ of the final states. Likewise the second rate factor (31.42b) expresses the fact that the
rate to gain particles from $1 + 2 \leftarrow 3 + 4$ collisions is proportional to the occupancy $f_3 f_4$ of the initial
states, modulated by the blocking/stimulation factors $(1 \mp f_1)(1 \mp f_2)$ of the final states. In thermodynamic
equilibrium, the rates (31.42) balance, Exercise 31.5, a property that is called detailed balance, or microscopic
reversibility. Microscopic reversibility is a consequence of time reversal symmetry.

The final ingredient in the integrand on the right hand side of expression (31.40) is the 4-dimensional
Dirac delta-function, which imposes energy-momentum conservation on the process $1 + 2 \leftrightarrow 3 + 4$. The
4-dimensional delta-function is a product of a 1-dimensional delta-function expressing energy conservation,
and a 3-dimensional delta-function expressing momentum conservation:

$$ (2\pi)^4 \delta_D^4(p_1 + p_2 - p_3 - p_4) = 2\pi\, \delta_D(E_1 + E_2 - E_3 - E_4)\, (2\pi)^3 \delta_D^3(\boldsymbol{p}_1 + \boldsymbol{p}_2 - \boldsymbol{p}_3 - \boldsymbol{p}_4) . \tag{31.44} $$

Exercise 31.5. Detailed balance.
1. Show that the rates balance in thermodynamic equilibrium,

$$ f_1 f_2 (1 \mp f_3)(1 \mp f_4) = f_3 f_4 (1 \mp f_1)(1 \mp f_2) . \tag{31.45} $$

2. Conclude that, if each particle type $i$ has a thermodynamic distribution with its own temperature $T_i$
and chemical potential $\mu_i$, then

$$ \begin{aligned}
& - f_1 f_2 (1 \mp f_3)(1 \mp f_4) + f_3 f_4 (1 \mp f_1)(1 \mp f_2) \\
& \quad = f_1 f_2 (1 \mp f_3)(1 \mp f_4) \left[ - 1 + \exp\left( \frac{E_1 - \mu_1}{T_1} + \frac{E_2 - \mu_2}{T_2} + \frac{- E_3 + \mu_3}{T_3} + \frac{- E_4 + \mu_4}{T_4} \right) \right] .
\end{aligned} \tag{31.46} $$

Solution.
1. Equation (31.45) is true if and only if

$$ \frac{f_1}{1 \mp f_1}\, \frac{f_2}{1 \mp f_2} = \frac{f_3}{1 \mp f_3}\, \frac{f_4}{1 \mp f_4} . \tag{31.47} $$

But

$$ \frac{f}{1 \mp f} = e^{(-E + \mu)/T} , \tag{31.48} $$

so (31.47) is true if and only if

$$ \frac{- E_1 + \mu_1}{T} + \frac{- E_2 + \mu_2}{T} = \frac{- E_3 + \mu_3}{T} + \frac{- E_4 + \mu_4}{T} , \tag{31.49} $$

which is true in thermodynamic equilibrium because

$$ E_1 + E_2 = E_3 + E_4 , \quad \mu_1 + \mu_2 = \mu_3 + \mu_4 . \tag{31.50} $$
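
The detailed balance of Exercise 31.5 can also be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample energies, chemical potentials, and temperature are illustrative assumptions, chosen only so that energy and chemical potential are conserved across $1 + 2 \leftrightarrow 3 + 4$.

```python
import math

def occupation(E, mu, T, kind):
    """Equilibrium occupation number; kind = +1 for fermions, -1 for bosons."""
    return 1.0 / (math.exp((E - mu) / T) + kind)

def rate_factors(E, mu, T, kinds):
    """Return the (loss, gain) rate factors of equations (31.42) for 1+2 <-> 3+4."""
    f = [occupation(E[i], mu[i], T, kinds[i]) for i in range(4)]
    # 1 - f is the Fermi-Dirac blocking factor, 1 + f the Bose-Einstein stimulation factor.
    s = [1.0 - kinds[i] * f[i] for i in range(4)]
    loss = f[0] * f[1] * s[2] * s[3]
    gain = f[2] * f[3] * s[0] * s[1]
    return loss, gain

# Illustrative values in arbitrary units (assumptions, not from the text).
E = [1.3, 2.1, 0.9, 2.5]       # E1 + E2 = E3 + E4
mu = [0.2, -0.5, 0.1, -0.4]    # mu1 + mu2 = mu3 + mu4
kinds = [+1, -1, +1, -1]       # fermion, boson, fermion, boson
loss, gain = rate_factors(E, mu, 0.7, kinds)

# Breaking chemical-potential conservation spoils the balance.
loss_off, gain_off = rate_factors(E, [0.2, -0.5, 0.1, -0.1], 0.7, kinds)
```

With conserved energy and chemical potential the two factors agree to rounding error; with the second set of chemical potentials they differ, as equation (31.46) implies.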


31.7 Non-equilibrium recombination 


At times well before recombination, the ionization state of the baryonic gas was well described by
thermodynamic equilibrium. However, as recombination approached, the recombination rate could not keep up
with the adiabatic decrease in density and temperature. Consequently recombination was delayed slightly
compared to what would be expected in thermodynamic equilibrium. To model the CMB precisely, it is
necessary to worry about the details of non-equilibrium recombination.

Although the ionization state was out of equilibrium, elastic collisions between electrons, ions, and neutrals
kept the velocity distributions of electrons and baryons in mutual thermodynamic equilibrium at a common
kinetic temperature $T_e = T_b$.

Recombination to and photoionization out of bound state $i$ of hydrogen destroys and creates a free electron.
The electron collision integral $C_i[f_e]$ corresponding to this process is given by, from equation (31.40) with
stimulated processes from protons, electrons, and hydrogen atoms neglected because of their small occupation
numbers,

$$ \begin{aligned}
C_i[f_e]\, \frac{g_e d^3p_e}{m_e (2\pi)^3} = \int & \langle|\mathcal{M}|^2\rangle_i \left[ - f_p f_e (1 + f_\gamma) + f_i f_\gamma \right] \\
& \times \frac{g_p d^3p_p}{2m_p(2\pi)^3}\, \frac{g_e d^3p_e}{2m_e(2\pi)^3}\, \frac{d^3p_i}{2m_p(2\pi)^3}\, \frac{d^3p_\gamma}{2p_\gamma(2\pi)^3}\,
 (2\pi)^4 \delta_D^4(p_p + p_e - p_i - p_\gamma) .
\end{aligned} \tag{31.51} $$

The $- f_p f_e$ term in the integrand corresponds to direct recombination, the $- f_p f_e f_\gamma$ term to stimulated
recombination, and the $f_i f_\gamma$ term to photoionization. Because the proton and hydrogen atom are so massive,
they remain essentially at rest during a recombination or photoionization, so the mean squared amplitude
$\langle|\mathcal{M}|^2\rangle_i$ for photoionization out of and recombination into bound state $i$ is essentially independent of the
proton and hydrogen momenta. Integrating the collision integral (31.51) over the proton and hydrogen
momenta yields

$$ C_i[f_e] = \frac{1}{8 m_p^2} \int \langle|\mathcal{M}|^2\rangle_i\, 2\pi\, \delta_D(E_p + E_e - E_i - E_\gamma) \left[ - n_p f_e (1 + f_\gamma) + n_i (g_p/g_i) f_\gamma \right] \frac{d^3p_\gamma}{2p_\gamma(2\pi)^3} , \tag{31.52} $$

one of the integrations over momenta being swallowed by the momentum-conserving Dirac delta-function
$(2\pi)^3 \delta_D^3(\boldsymbol{p}_p + \boldsymbol{p}_e - \boldsymbol{p}_i - \boldsymbol{p}_\gamma)$. Again because the proton and hydrogen atom are so massive, the photon is emitted
and absorbed isotropically. Integrating over directions $\hat{\boldsymbol{p}}_\gamma$ of the photon momentum yields $4\pi$. Integrating
over the photon energy $p_\gamma$ swallows the energy-conserving delta-function, yielding

$$ C_i[f_e] = \frac{p_\gamma \langle|\mathcal{M}|^2\rangle_i}{16\pi m_p^2} \left[ - n_p f_e (1 + f_\gamma) + n_i (g_p/g_i) f_\gamma \right] . \tag{31.53} $$

If the hydrogenic state $i$ is in energy level $n$, then energy conservation requires that the energy $E_\gamma = p_\gamma$ of
the photon be the sum of the electron kinetic energy and the binding energy (ionization energy) of the level,

$$ \frac{p_e^2}{2m_e} + E_n = p_\gamma . \tag{31.54} $$


In the situation of cosmological recombination under consideration, the photons, whose numbers overwhelm
those of electrons, have a thermal (Planckian) momentum distribution at temperature $T_\gamma$. Elastic collisions
between electrons keep their distribution close to thermal (Maxwellian). Since electron energies redshift
faster than photon energies, $p^2/(2m) \propto a^{-2}$ versus $p_\gamma \propto a^{-1}$, the electron temperature is slightly below
that of photons. However, electron-photon collisions keep the electron temperature closely equal to the
photon temperature, $T_e \approx T_\gamma$, up to and through recombination. After recombination, electron-photon
collisions become rare enough that the electron kinetic temperature drops below the photon temperature
(Scott and Moss, 2009). For completeness, the treatment in this section allows different electron and photon
temperatures, although the two temperatures will be set equal in subsequent sections.

Substituting the Boltzmann distribution (31.15) at temperature $T_e$ for the electron occupation number $f_e$,
and the Planckian distribution (10.129) at temperature $T_\gamma$ for the photon occupation number $f_\gamma$, brings the
electron collision integral (31.53) to

$$ C_i[f_e] = \frac{p_\gamma \langle|\mathcal{M}|^2\rangle_i}{16\pi m_p^2} \left[ \frac{ - n_p n_e (1/g_e) (m_e T_e/2\pi)^{-3/2} e^{- p_e^2/(2 m_e T_e)} + n_i (g_p/g_i)\, e^{- p_\gamma/T_\gamma} }{ 1 - e^{- p_\gamma/T_\gamma} } \right] . \tag{31.55} $$

Finally, integrating the collision integral (31.55) over electron momenta gives

$$ \int C_i[f_e]\, \frac{g_e d^3p_e}{m_e (2\pi)^3} = - n_p n_e \left[ \alpha_i(T_e) + \alpha_i^{\rm stim}(T_e, T_\gamma) \right] + n_i \beta_i(T_\gamma) , \tag{31.56} $$

where $\alpha_i(T_e)$ and $\alpha_i^{\rm stim}(T_e, T_\gamma)$ are averaged direct and stimulated recombination rate coefficients
to state $i$, and $\beta_i(T_\gamma)$ is the photoionization rate coefficient out of bound state $i$. The direct recombination rate
$\alpha_i(T_e)$ depends only on the electron temperature $T_e$, while the photoionization rate $\beta_i(T_\gamma)$ depends only on
the photon temperature $T_\gamma$. The stimulated recombination rate $\alpha_i^{\rm stim}(T_e, T_\gamma)$ depends on both temperatures.

In cosmological recombination, stimulated recombination is a small correction of order $e^{-E_n/T}$, which can
be neglected. If stimulated recombination is neglected, then detailed balance imposes

$$ \beta_i(T_\gamma) = \alpha_i(T_\gamma) \left( \frac{n_p n_e}{n_i} \right)_{\rm TE} = \alpha_i(T_\gamma)\, \frac{g_p g_e}{g_i} \left( \frac{m_e T_\gamma}{2\pi} \right)^{3/2} e^{- E_n/T_\gamma} . \tag{31.57} $$


The Boltzmann equation for electrons, equation (31.37), is a sum over recombinations to and
photoionizations out of bound states $i$,

$$ \frac{1}{a^3} \frac{d(n_e a^3)}{dt} = - n_p n_e \sum_i \left[ \alpha_i(T_e) + \alpha_i^{\rm stim}(T_e, T_\gamma) \right] + \sum_i n_i \beta_i(T_\gamma) . \tag{31.58} $$

Let $X_i$ denote the ratio of the number density of species $i$ to the nuclear proton density $n_+$,

$$ X_i \equiv \frac{n_i}{n_+} . \tag{31.59} $$

The ratio is defined so that the electron fraction is unity, $X_e = 1$, when the plasma is fully ionized. Since
$n_+ a^3$ is constant as the Universe expands, the Boltzmann equation (31.58) can be written as an equation
for the evolution of the electron fraction,

$$ \frac{dX_e}{dt} = - X_p X_e n_+ \sum_i \left[ \alpha_i(T_e) + \alpha_i^{\rm stim}(T_e, T_\gamma) \right] + \sum_i X_i \beta_i(T_\gamma) . \tag{31.60} $$

Equation (31.60) gives the rate of change of the electron fraction for a pure hydrogen gas. If other elements
are included, notably helium, additional processes of recombination to and ionization out of bound states of
those elements should be adjoined.


31.8 Recombination: Peebles approximation 


Recombination is dominated by hydrogen, the dominant chemical element. The second most abundant
element is helium, which is largely neutral by the time of recombination. Peebles (1968) argued that the overall
hydrogen density and the predominance of hydrogen atoms in the ground state, equation (31.22), would have
the consequence that the gas would be optically thick to Lyman transitions, that is, to transitions to the
ground state, but optically thin in transitions to excited states. Consequently any continuum or line Lyman
photon emitted as a result of a recombination or transition to the ground state would be quickly absorbed.
On the other hand radiative transitions between excited levels $n \geq 2$ would proceed rapidly without
hindrance, leading to a thermal distribution among the excited levels. Since the dominant excited level would
be $n = 2$, equation (31.22), Peebles (1968) argued that recombination could be approximated by a 3-level
system consisting of protons and of $n = 2$ and $n = 1$ levels of hydrogen.

Since transitions from the continuum to $n = 1$ were ineffective, the rate of change of the proton fraction
$X_p \equiv n_p/n_+$ was dominated by recombinations to and photoionizations out of the $n = 2$ level,

$$ \frac{dX_p}{dt} = - X_p X_e n_+ \alpha_2 + X_2 \beta_2 . \tag{31.61} $$

Equation (31.61) ignores stimulated recombination, which is an $e^{-E_2/T} \ll 1$ correction to the rate.


Peebles (1968) argued that successful recombination to the $n = 1$ ground state would be dominated by
slow leakage out of the $n = 2$ level, which occurred by two processes. The first process is 2-photon decay out
of the 2s state, which occurs at a rate $A_{2s-1s} = 8.22458\,{\rm s}^{-1}$. The second process is that, although most decays
out of the 2p state produced a Lyman $\alpha$ photon that was immediately absorbed by a nearby hydrogen atom,
the expansion of the Universe redshifted the emitted photon, and a small fraction $P_S$ of the emitted Lyman $\alpha$
photons succeeded in redshifting out of the line without being reabsorbed. The fraction $P_S$ of emitted photons
that escape in an expanding medium can be approximated using the Sobolev formalism, §31.10. Thus the
rate of change of the fraction $X_1 \equiv n_1/n_+$ of hydrogen atoms in the ground $n = 1$ level is

$$ \frac{dX_1}{dt} = X_2 A_{21} - X_1 B_{12} , \tag{31.62} $$

where the effective spontaneous decay rate $A_{21}$ from the $n = 2$ levels to the ground $n = 1$ level is

$$ A_{21} = \frac{g_{2s}}{g_2} A_{2s-1s} + \frac{g_{2p}}{g_2} A_{2p-1s} P_S , \tag{31.63} $$

with $P_S$ the Sobolev escape probability given by equation (31.101). Equation (31.63) assumes that 2s and
2p are populated in the ratio $(g_{2s}/g_2) : (g_{2p}/g_2) = \tfrac{1}{4} : \tfrac{3}{4}$ of their statistical weights. The value of the
spontaneous decay coefficient $A_{2p-1s}$ itself is not actually needed since it cancels in the Sobolev approximation,
equation (31.102),

$$ \frac{g_{2p}}{g_2} A_{2p-1s} P_S = \frac{1}{X_1 n_+}\, \frac{g_1}{g_2}\, \frac{8\pi H}{\lambda_{2p-1s}^3} , \tag{31.64} $$

where $H$ is the Hubble parameter. The statistical weight factor is $g_1/g_2 = \tfrac{1}{4}$.
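
Equations (31.63) and (31.64) translate directly into a short function. The sketch below is illustrative only: the function name, argument names, and the sample inputs are assumptions, SI units are assumed throughout, and the statistical weight factors are taken as $g_{2s}/g_2 = g_1/g_2 = \tfrac{1}{4}$ as in the text.

```python
import math

A_2S_1S = 8.22458            # 2-photon decay rate of the hydrogen 2s state, 1/s (from the text)
LAMBDA_LYA = 121.5682e-9     # Lyman alpha wavelength in metres, equation (31.89)

def effective_decay_rate(X1, n_plus, hubble):
    """Effective n=2 -> n=1 decay rate A21 in 1/s, equations (31.63)-(31.64).

    X1     : fraction of nuclear protons in ground-state hydrogen
    n_plus : nuclear proton number density in 1/m^3
    hubble : Hubble parameter in 1/s
    """
    two_photon = 0.25 * A_2S_1S    # (g_2s/g_2) A_2s-1s
    # (g_2p/g_2) A_2p-1s P_S = (1/(X1 n+)) (g_1/g_2) 8 pi H / lambda^3
    lyman_escape = 0.25 * 8.0 * math.pi * hubble / (X1 * n_plus * LAMBDA_LYA**3)
    return two_photon + lyman_escape

# Placeholder inputs of roughly the right order of magnitude near recombination
# (assumed for illustration, not taken from the text).
A21 = effective_decay_rate(X1=0.5, n_plus=3.0e8, hubble=5.0e-14)
```

For inputs of this order the two leakage channels contribute comparably, which is why both terms of equation (31.63) must be retained.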
Detailed balance requires that $dX_p/dt$, equation (31.61), must vanish in thermodynamic equilibrium (TE),
so the ratio of photoionization to recombination rate coefficients must be

$$ \frac{\beta_2}{\alpha_2} = \left( \frac{n_p n_e}{n_2} \right)_{\rm TE} = \frac{g_p g_e}{g_2} \left( \frac{m_e T}{2\pi\hbar^2} \right)^{3/2} e^{- E_2/T} , \tag{31.65} $$

the statistical weight factor being $g_p g_e/g_2 = \tfrac{1}{4}$. Hence equation (31.61) may be written

$$ \frac{dX_p}{dt} = X_p X_e n_+ \alpha_2 \left( - 1 + \frac{1}{b_{p2}} \right) , \tag{31.66} $$

where the departure coefficient $b_{p2}$ is the value of $n_p n_e/n_2$ relative to its value in thermodynamic equilibrium,

$$ b_{p2} \equiv \frac{n_p n_e}{n_2} \bigg/ \left( \frac{n_p n_e}{n_2} \right)_{\rm TE} = \frac{X_p X_e n_+}{X_2}\, \frac{g_2}{g_p g_e} \left( \frac{m_e T}{2\pi\hbar^2} \right)^{-3/2} e^{E_2/T} . \tag{31.67} $$

Similarly, detailed balance requires that $dX_1/dt$, equation (31.62), must vanish in thermodynamic
equilibrium, so the ratio of radiative excitation to decay rate coefficients must be

$$ \frac{B_{12}}{A_{21}} = \left( \frac{X_2}{X_1} \right)_{\rm TE} = \frac{g_2}{g_1}\, e^{- E_{12}/T} , \tag{31.68} $$

with $E_{12} \equiv E_1 - E_2$, and $g_2/g_1 = 4$. Thus equation (31.62) may be written

$$ \frac{dX_1}{dt} = X_1 B_{12} (b_{21} - 1) , \tag{31.69} $$

where the departure coefficient $b_{21}$ is the value of $n_2/n_1$ relative to its value in thermodynamic equilibrium,

$$ b_{21} \equiv \frac{n_2}{n_1} \bigg/ \left( \frac{n_2}{n_1} \right)_{\rm TE} = \frac{X_2}{X_1}\, \frac{g_1}{g_2}\, e^{E_{12}/T} . \tag{31.70} $$


Since the population of the $n = 2$ level was so much smaller than the populations either of protons or of
the ground $n = 1$ level of hydrogen, Peebles (1968) argued that the rate of change of $X_2$ must be negligible
relative to the rates of change of $X_p$ and $X_1$,

$$ \frac{dX_2}{dt} = - \frac{dX_p}{dt} - \frac{dX_1}{dt} = X_p X_e n_+ \alpha_2 - X_2 \beta_2 - X_2 A_{21} + X_1 B_{12} \approx 0 . \tag{31.71} $$

The approximation (31.71) of vanishing $dX_2/dt$ allows $X_2$ to be eliminated in favour of $X_1$,

$$ \frac{X_2}{X_1} = \frac{X_p X_e n_+ \alpha_2 / X_1 + B_{12}}{\beta_2 + A_{21}} . \tag{31.72} $$

Given the detailed balance relations (31.65) between $\beta_2$ and $\alpha_2$, and (31.68) between $B_{12}$ and $A_{21}$, the
relation (31.72) may also be written as an expression for the departure coefficient $b_{21}$,

$$ b_{21} = \frac{\beta_2 b_{p1} + A_{21}}{\beta_2 + A_{21}} , \tag{31.73} $$

in terms of the departure coefficient $b_{p1}$, the value of $n_p n_e/n_1$ relative to its value in thermodynamic
equilibrium,

$$ b_{p1} \equiv \frac{n_p n_e}{n_1} \bigg/ \left( \frac{n_p n_e}{n_1} \right)_{\rm TE} = \frac{X_p X_e n_+}{X_1}\, \frac{g_1}{g_p g_e} \left( \frac{m_e T}{2\pi\hbar^2} \right)^{-3/2} e^{E_1/T} . \tag{31.74} $$

The statistical weight factor is $g_1/(g_p g_e) = 1$.

Figure 31.2 Non-equilibrium hydrogen ion fractions as a function of cosmic scale factor $a$ scaled to $a_0 = 1$. The total
hydrogen fraction, $X_H = f_H = 0.86$, is the fraction of nuclear protons that are hydrogen nuclei, equation (31.83).

Equation (31.73) allows the departure coefficients $b_{21}$ and
$b_{p2} = b_{p1}/b_{21}$ in the recombination equations (31.66) and (31.69) to be eliminated in favour of $b_{p1}$, yielding

$$ \frac{d\ln X_p}{dt} = X_e n_+ \alpha_2\, \frac{A_{21}}{\beta_2 + A_{21}} \left( \frac{1}{b_{p1}} - 1 \right) , \tag{31.75a} $$
$$ \frac{d\ln X_1}{dt} = B_{12}\, \frac{\beta_2}{\beta_2 + A_{21}} \left( b_{p1} - 1 \right) . \tag{31.75b} $$

Equations (31.75a) and (31.75b) combine to give $d(X_p + X_1)/dt = 0$ in accordance with the condition (31.71),
but each of equations (31.75) is written in a form that remains finite as respectively $X_p \rightarrow 0$ and $X_1 \rightarrow 0$.
Equations (31.75) combine to give the rate of change of the logarithmic departure coefficient $\ln b_{p1}$,

$$ \frac{d\ln b_{p1}}{dt} = \frac{d\ln X_p}{dt} + \frac{d\ln X_e}{dt} - \frac{d\ln X_1}{dt} + \frac{d}{dt} \ln\left( n_+ T^{-3/2} e^{E_1/T} \right) . \tag{31.76} $$

Given that helium is largely neutral by the time of recombination, charge conservation in the pure hydrogen
gas implies that

$$ X_e = X_p , \tag{31.77} $$

so the time derivative of $\ln X_e$ in equation (31.76) is the same as that for $\ln X_p$. The time derivatives of the
temperature and density follow from $T \propto a^{-1}$ and $n_+ \propto a^{-3}$, equations (31.1) and (31.4). The differential
equation (31.76) is stiff. Near thermodynamic equilibrium, the logarithmic departure coefficient $\ln b_{p1}$ is near
zero, and then the factors involving $b_{p1}$ on the right hand sides of equations (31.75) become small,

$$ \frac{1}{b_{p1}} - 1 = e^{- \ln b_{p1}} - 1 \approx - \ln b_{p1} , \quad b_{p1} - 1 = e^{\ln b_{p1}} - 1 \approx \ln b_{p1} . \tag{31.78} $$

Figure 31.3 Logarithmic ionized-to-bound level departure coefficients $\ln b_{p1}$ and $\ln b_{p2}$, equations (31.74) and (31.67).
Logarithmic departure coefficients are zero in thermodynamic equilibrium (the departure coefficients themselves are
unity). The increasingly positive values of $\ln b_{p1}$ and $\ln b_{p2}$ mean that protons become over-abundant compared to
thermodynamic equilibrium, that is, recombination is not keeping pace with the cosmological decrease in
temperature. The $n = 1$ ground level is further from thermodynamic equilibrium with protons than the $n = 2$ and other
excited levels. When first coming out of thermodynamic equilibrium, the logarithmic departure coefficient $\ln b_{p1}$ is
approximated by its steady state value $\lambda/\kappa$, equation (31.81), indicated by the dotted line.

The $d\ln X_i/dt$ derivatives in equation (31.76) are then proportional to $\ln b_{p1}$ with a negative coefficient $-\kappa$,

$$ \frac{d\ln X_p}{dt} + \frac{d\ln X_e}{dt} - \frac{d\ln X_1}{dt} \approx - \kappa \ln b_{p1} . \tag{31.79} $$

Thus the differential equation (31.76) takes the form

$$ \frac{d\ln b_{p1}}{dt} \approx - \kappa \ln b_{p1} + \lambda , \tag{31.80} $$

where the forcing term $\lambda$ is the remaining, last, term on the right hand side of the differential equation (31.76).
The $\kappa$ term tends to drive $\ln b_{p1}$ exponentially to zero, that is, into thermodynamic equilibrium, while the
forcing term $\lambda$ drives $\ln b_{p1}$ away from zero. The differential equation (31.80) is stiff when $\kappa$ is much larger
than the absolute value of $\lambda$.

A solution to the stiffness problem is to evaluate the thermodynamic equilibrium value of $\kappa/|\lambda|$, and if it
exceeds some threshold (say 10), then set $\ln b_{p1}$ to the steady state solution of equation (31.80), which is

$$ \ln b_{p1} \approx \frac{\lambda}{\kappa} . \tag{31.81} $$

A differential equation solver that can cope with stiff equations will in effect impose the solution (31.81).
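
The stiffness can be seen in a toy version of equation (31.80). The sketch below is an illustration under simplifying assumptions, not the method used for the figures: $\kappa$ and $\lambda$ are held constant (in the real problem both vary with time), and the values and function names are invented for the example. An explicit Euler step that is long compared to $1/\kappa$ blows up, whereas an implicit (backward) Euler step relaxes to the steady state $\lambda/\kappa$ of equation (31.81).

```python
def forward_euler(y0, kappa, lam, dt, steps):
    """Explicit Euler for dy/dt = -kappa*y + lam, where y stands for ln b_p1."""
    y = y0
    for _ in range(steps):
        y += dt * (-kappa * y + lam)
    return y

def backward_euler(y0, kappa, lam, dt, steps):
    """Implicit Euler: solve y_new = y + dt*(-kappa*y_new + lam) for y_new."""
    y = y0
    for _ in range(steps):
        y = (y + dt * lam) / (1.0 + dt * kappa)
    return y

kappa, lam = 1000.0, 1.0    # illustrative constants with kappa/|lam| much greater than 1
steady = lam / kappa        # steady-state value, equation (31.81)
dt = 0.01                   # time step much longer than the relaxation time 1/kappa
y_explicit = forward_euler(0.0, kappa, lam, dt, 20)
y_implicit = backward_euler(0.0, kappa, lam, dt, 20)
```

Here each explicit step multiplies the deviation from the steady state by $1 - \kappa\,dt = -9$, while each implicit step multiplies it by $1/(1 + \kappa\,dt) = 1/11$.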
Given the logarithmic departure coefficient $\ln b_{p1}$, the neutral $X_1$ and ionized $X_p$ hydrogen fractions follow,
in a form that remains numerically well-behaved even when $X_1$ or $X_p$ is tiny, as

$$ X_1 = \frac{4 f_H^2 e^q}{\left( 1 + \sqrt{1 + 4 f_H e^q} \right)^2} , \quad X_p = \frac{2 f_H}{1 + \sqrt{1 + 4 f_H e^q}} , \tag{31.82} $$

where $f_H$,

$$ f_H \equiv n_H/n_+ = 1 - f_n/f_+ , \tag{31.83} $$

is the fraction of nuclear protons that are hydrogen nuclei, and $q$ is

$$ q \equiv \ln\frac{X_1}{X_p X_e} = \ln\left[ n_+ \frac{g_1}{g_p g_e} \left( \frac{m_e T}{2\pi\hbar^2} \right)^{-3/2} \right] + \frac{E_1}{T} - \ln b_{p1} , \tag{31.84} $$

as follows from the definition (31.74) of $b_{p1}$. The statistical weight factor is $g_1/(g_p g_e) = 1$.

Only when $\kappa/|\lambda|$ falls below the threshold is it necessary to start solving the differential equation
numerically. It is better to solve directly for the proton fraction $X_p$ rather than the departure coefficient $b_{p1}$, since
as recombination freezes out, $X_p$ changes slowly, whereas $b_{p1}$ continues to evolve, and solving for $X_p$ from
$b_{p1}$ becomes numerically unstable. The differential equation governing $X_p$ is, from equation (31.75a),

$$ \frac{dX_p}{dt} = \left( X_1 \frac{g_2}{g_1}\, e^{(E_2 - E_1)/T} \beta_2 - X_e X_p n_+ \alpha_2 \right) \frac{A_{21}}{\beta_2 + A_{21}} . \tag{31.85} $$

The statistical weight factor is $g_2/g_1 = 4$. Figure 31.2 shows the resulting non-equilibrium H ion fractions,
and Figure 31.3 shows the logarithmic departure coefficients $\ln b_{p1}$ and $\ln b_{p2}$. Exercise 31.6 asks you to write
code to solve equation (31.85).
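
As a possible starting point for the equilibrium stage of that calculation, equations (31.82) and (31.84) can be sketched as below. This is a hedged sketch, not the author's code: the function names are invented, SI units with an explicit Boltzmann constant are assumed, the binding energy is taken as 13.598 eV, the sign of the $\ln b_{p1}$ term is the one implied by the definition (31.74), and the density in the sample call is a placeholder.

```python
import math

# Physical constants in SI units (rounded CODATA values).
M_E = 9.1093837e-31       # electron mass, kg
K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
EV = 1.602176634e-19      # one electronvolt, J
E1 = 13.598 * EV          # hydrogen ground-state binding energy, J (assumed value)

def log_ratio_q(T, n_plus, ln_b_p1=0.0):
    """q = ln[X1/(Xp Xe)], equation (31.84), with g1/(gp ge) = 1.

    T is the temperature in kelvin, n_plus the nuclear proton density in 1/m^3.
    """
    thermal = (M_E * K_B * T / (2.0 * math.pi * HBAR**2)) ** 1.5    # units 1/m^3
    return math.log(n_plus / thermal) + E1 / (K_B * T) - ln_b_p1

def hydrogen_fractions(f_H, q):
    """Return (X1, Xp) from equation (31.82), assuming Xe = Xp and X1 + Xp = f_H.

    math.exp overflows for q above roughly 709; a production code would guard that case.
    """
    root = math.sqrt(1.0 + 4.0 * f_H * math.exp(q))
    X_p = 2.0 * f_H / (1.0 + root)
    X_1 = 4.0 * f_H**2 * math.exp(q) / (1.0 + root) ** 2
    return X_1, X_p

# Illustrative call: f_H = 0.86 as in Figure 31.2; the density is a placeholder.
X_1, X_p = hydrogen_fractions(0.86, log_ratio_q(3500.0, 3.0e8))
```

Writing the fractions in terms of $1 + \sqrt{1 + 4 f_H e^q}$ avoids the cancellation that the textbook quadratic formula would suffer when $4 f_H e^q$ is tiny.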


Exercise 31.6. Recombination. Write code that implements the recombination of hydrogen.

1. Well before recombination, the ionization state is near ionization equilibrium. As suggested in the text,
calculate the coefficients $\kappa$ and $\lambda$ that go into equation (31.80) in thermodynamic equilibrium. If $\kappa/|\lambda|$
exceeds some threshold, then set the logarithmic departure coefficient to $\ln b_{p1} = \lambda/\kappa$, equation (31.81).
Thence deduce the ionization fractions $X_1$ and $X_p$, equation (31.82).

2. Once $\kappa/|\lambda|$ falls below the threshold, solve the evolution equation (31.85) for the proton fraction $X_p$
numerically.

Solution. See Figures 31.2 and 31.3.

Seager, Sasselov, and Scott (1999) provide an improved approximation for recombination based on the Peebles
(1968) approximation, but with the inclusion of helium, and with the $n = 2$ recombination coefficients
adjusted to fit the results of a detailed calculation of recombination by Seager, Sasselov, and Scott (2000)
that includes explicit treatment of up to 300 levels of H, 200 levels of He, and 100 levels of He$^+$, plus one
each of e, p, H$^-$, and He$^{++}$, plus the ground levels of molecular hydrogen species H$_2$ and H$_2^+$.
Seager et al.’s approximation has been refined by Wong, Moss, and Scott (2008) to include the
semi-forbidden decay He $2p\,{}^3P_1 \rightarrow 1s\,{}^1S_0$ from the triplet 2p state of helium, and the scattering of He $2p \rightarrow 1s$
photons by neutral hydrogen. Chluba and Thomas (2011) have developed an even more comprehensive
approach to recombination. The various refinements affect the electron fraction $X_e$ at the percent level. The
present section follows the simpler work of Seager, Sasselov, and Scott (1999).

Seager, Sasselov, and Scott (1999) adjoin to the hydrogenic recombination equation (31.85) an equivalent
equation for helium, protons p being replaced by singly-ionized helium He$^+$ in its ground state. The effective
spontaneous decay rate $A_{\rm He21}$ from the singlet $n = 2$ levels to the ground $n = 1$ level of neutral He is, analogous
to the hydrogenic equation (31.63),

$$ A_{\rm He21} = \frac{g_{\rm He2s}}{g_{\rm He2}} A_{{\rm He}2s-1s} + \frac{g_{\rm He2p}}{g_{\rm He2}} A_{{\rm He}2p-1s} P_S\, e^{(E_{\rm He2s} - E_{\rm He2p})/T} . \tag{31.86} $$

The extra factor of $e^{(E_{\rm He2s} - E_{\rm He2p})/T}$ takes into account that the 2p state lies slightly but appreciably above the
2s state in energy, so its population in thermodynamic equilibrium is reduced by a corresponding Boltzmann
factor. The statistical weight factors are $g_{\rm He2s}/g_{\rm He2} = \tfrac{1}{4}$ and $g_{\rm He2p}/g_{\rm He2} = \tfrac{3}{4}$. As in the hydrogenic case,
equation (31.64), the value of $A_{{\rm He}2p-1s}$ cancels against the Sobolev probability $P_S$, equation (31.102),

$$ \frac{g_{\rm He2p}}{g_{\rm He2}} A_{{\rm He}2p-1s} P_S = \frac{1}{X_{\rm He} n_+}\, \frac{g_{\rm He1}}{g_{\rm He2}}\, \frac{8\pi H}{\lambda_{{\rm He}2p-1s}^3} , \tag{31.87} $$

the statistical weight factor being $g_{\rm He1}/g_{\rm He2} = \tfrac{1}{4}$.

In thermodynamic equilibrium, He$^{++}$ combines to He$^+$ at a redshift of $z \approx 6{,}000$, a factor of 6 higher
than recombination, Figure 31.1. By the time recombination approaches, little He$^{++}$ remains. He$^{++}$ is
well-approximated throughout as being in thermodynamic equilibrium with He$^+$.

Charge conservation implies that the electron fraction $X_e$ is

$$ X_e = X_p + X_{{\rm He}^+} + 2 X_{{\rm He}^{++}} . \tag{31.88} $$


The relevant atomic physics is as follows. The wavelengths of the $2 \rightarrow 1$ transitions of hydrogen and helium
are

$$ \lambda_{{\rm H}2p-1s} = 121.5682\,{\rm nm} , \quad \lambda_{{\rm He}2p-1s} = 58.4334\,{\rm nm} , \quad \lambda_{{\rm He}2s-1s} = 60.1404\,{\rm nm} . \tag{31.89} $$

Ionization energies of hydrogen and helium, expressed as wavenumbers, are given here in units of ${\rm m}^{-1}$ (they are
more commonly quoted in ${\rm cm}^{-1}$, in which the numerical values are a factor of 100 smaller):

$$ \chi_{\rm H} = 10{,}967{,}877.17\,{\rm m}^{-1} , \quad \chi_{\rm He} = 19{,}831{,}066.9\,{\rm m}^{-1} , \quad \chi_{{\rm He}^+} = 43{,}890{,}887.89\,{\rm m}^{-1} . \tag{31.90} $$

Figure 31.4 Non-equilibrium hydrogen and helium ion fractions as a function of cosmic scale factor $a$ scaled to
$a_0 = 1$. Dotted lines show $X_p$ and $X_{{\rm He}^+}$ in thermodynamic equilibrium. The total hydrogen and helium fractions are
$X_{\rm H} = 1 - f_n/f_+ = 0.86$ and $X_{\rm He} = \tfrac{1}{2} f_n/f_+ = 0.07$.
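
The units of the ionization wavenumbers in equation (31.90) can be checked by converting them to energies, $E = h c \chi$. The short sketch below assumes the tabulated numbers are per metre; the dictionary and function names are illustrative. Read per centimetre, the same numbers would give energies 100 times too large.

```python
HC_EV_M = 1.239841984e-6    # Planck constant times speed of light, in eV m

# Ionization wavenumbers of equation (31.90), read as inverse metres (an assumption).
CHI = {"H": 10967877.17, "He": 19831066.9, "He+": 43890887.89}

def ionization_energy_eV(species):
    """Ionization energy in eV from the wavenumber chi, using E = h c chi."""
    return HC_EV_M * CHI[species]

energies = {name: ionization_energy_eV(name) for name in CHI}
```

The hydrogen value comes out close to the familiar 13.6 eV, which supports reading the wavenumbers as inverse metres.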


The spontaneous 2-photon $2s \rightarrow 1s$ transition rates of hydrogen and neutral helium are

$$ A_{{\rm H}2s-1s} = 8.22458\,{\rm s}^{-1} , \quad A_{{\rm He}2s-1s} = 51.3\,{\rm s}^{-1} . \tag{31.91} $$

The effective recombination rates to $n = 2$ levels of hydrogen and neutral helium are

$$ \alpha_{\rm H2}(T) = 1.14 \times 10^{-19}\, \frac{4.309\, (T/10^4\,{\rm K})^{-0.6166}}{1 + 0.6703\, (T/10^4\,{\rm K})^{0.5300}}\ {\rm m}^3\,{\rm s}^{-1} , \tag{31.92a} $$
$$ \alpha_{\rm He2}(T) = 10^{-16.744} \left[ \sqrt{T/3\,{\rm K}}\, \left( 1 + \sqrt{T/3\,{\rm K}} \right)^{1-p} \left( 1 + \sqrt{T/10^{5.114}\,{\rm K}} \right)^{1+p} \right]^{-1} {\rm m}^3\,{\rm s}^{-1} , \tag{31.92b} $$

with $p = 0.711$. The factor of 1.14 in the hydrogenic recombination rate (31.92a) is a fudge factor introduced
by Seager, Sasselov, and Scott (1999) that adjusts Hummer’s (1994) calculated rate coefficient to achieve
agreement with the multi-level numerical computation of Seager, Sasselov, and Scott (2000). The helium
recombination rate (31.92b) is from Hummer and Storey (1998). The statistical weight factor that goes
into the ratio $\beta_{\rm He2}/\alpha_{\rm He2}$ of photoionization to recombination rates for He, analogous to the hydrogenic
ratio (31.65), is $g_{{\rm He}^+} g_e/g_{\rm He2} = 2 \times 2/4 = 1$.
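
The two fitting formulae (31.92) are straightforward to code. The sketch below is illustrative: the function names are invented, and the fit constants are the ones commonly used in the Seager, Sasselov, and Scott (1999) approximation (exponents $-0.6166$ and $0.5300$, coefficient $0.6703$, helium temperatures 3 K and $10^{5.114}$ K), which should be treated as assumptions to be checked against that paper because the digits in this copy of the text are damaged.

```python
import math

def alpha_H2(T):
    """Effective recombination coefficient to n=2 of H in m^3/s, equation (31.92a); T in kelvin."""
    t4 = T / 1.0e4
    return 1.14e-19 * 4.309 * t4**(-0.6166) / (1.0 + 0.6703 * t4**0.5300)

def alpha_He2(T, p=0.711):
    """Effective recombination coefficient to n=2 of neutral He in m^3/s, equation (31.92b)."""
    s0 = math.sqrt(T / 3.0)               # sqrt(T / 3 K)
    s1 = math.sqrt(T / 10.0**5.114)       # sqrt(T / 10^5.114 K)
    return 10.0**(-16.744) / (s0 * (1.0 + s0)**(1.0 - p) * (1.0 + s1)**(1.0 + p))

# Both coefficients fall with increasing temperature.
rates_3000K = (alpha_H2(3000.0), alpha_He2(3000.0))
rates_10000K = (alpha_H2(1.0e4), alpha_He2(1.0e4))
```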

Figure 31.4 shows the recombination of hydrogen and helium in the Seager, Sasselov, and Scott (1999)
approximation. The Figure shows that the recombination of singly-ionized helium is, like the recombination
of protons, delayed compared to thermodynamic equilibrium. Even so, helium is almost entirely neutral by
the time of recombination, so in practice helium has little effect on recombination.

The Sobolev escape probability formalism applies to a uniformly expanding medium such as a FLRW
universe. Suppose that a photon is emitted in a transition $2 \rightarrow 1$ between two atomic levels 2 and 1. The
line is narrow, but not infinitely narrow. As a result of natural and Doppler broadening (the specifics are
unimportant here), the line is emitted with some line profile $\phi_\lambda$, which can be taken to be normalized to

$$ \int_{\lambda = 0}^{\infty} \phi_\lambda\, d\ln\lambda = 1 . \tag{31.93} $$
The emitted photon travels through the medium, and has some probability of being absorbed by other atoms
in level 1 before the photon is redshifted out of the line. Since the line is narrow, the photon is either absorbed
nearby, or else it escapes the line completely. In the approximation that the properties of the medium change
little over the small distance between emission and absorption, detailed balance implies that the line profile
for absorption is the same as that for emission. The cross-section $\sigma_\lambda$ for absorption at wavelength $\lambda$ is

$$ \sigma_\lambda = \sigma \phi_\lambda , \tag{31.94} $$

where $\sigma \equiv \int \sigma_\lambda\, d\ln\lambda$ is the cross-section integrated over the line profile. By detailed balance, the integrated
cross-section is related to the Einstein coefficient $A_{21}$ for spontaneous emission by

$$ \sigma = \frac{1}{8\pi c}\, \frac{g_2}{g_1}\, \lambda_{21}^3 A_{21} . \tag{31.95} $$

The optical depth $d\tau_\lambda$, the differential probability for the photon to be absorbed, as the photon passes
through a distance $dl = c\, dt$ is

$$ d\tau_\lambda = n_1 \sigma_\lambda\, dl = n_1 c \sigma \phi_\lambda\, dt . \tag{31.96} $$
The medium is expanding with Hubble parameter $H$, and the photon wavelength $\lambda$ redshifts by $d\ln\lambda = H dt$
in time $dt$. Therefore the optical depth to absorption as the photon redshifts through an interval $d\ln\lambda$ of
wavelength is

$$ d\tau_\lambda = \tau_S \phi_\lambda\, d\ln\lambda , \tag{31.97} $$

where $\tau_S$ is the Sobolev optical depth

$$ \tau_S \equiv \frac{n_1 c \sigma}{H} = n_1 \frac{g_2}{g_1}\, \frac{\lambda_{21}^3 A_{21}}{8\pi H} . \tag{31.98} $$

The optical depth $\tau_\lambda$ for the photon to redshift from an emitted wavelength $\lambda$ to infinite wavelength is

$$ \tau_\lambda = \tau_S \int_\lambda^\infty \phi_{\lambda'}\, d\ln\lambda' . \tag{31.99} $$

The probability for a photon emitted at wavelength $\lambda$ to escape from the line without being reabsorbed is the
exponential $e^{-\tau_\lambda}$ of the optical depth. The escape probability averaged over the emitted line profile defines
the Sobolev escape probability $P_S$,

$$ P_S \equiv \int_0^\infty e^{-\tau_\lambda} \phi_\lambda\, d\ln\lambda = \int_0^\infty \exp\left( - \tau_S \int_\lambda^\infty \phi_{\lambda'}\, d\ln\lambda' \right) \phi_\lambda\, d\ln\lambda = \frac{1 - e^{-\tau_S}}{\tau_S} , \tag{31.100} $$

which is evidently independent of the shape of the line profile (just so long as the line is narrow). The Sobolev
escape probability $P_S$ varies from 0 as $\tau_S \rightarrow \infty$ to 1 as $\tau_S \rightarrow 0$.
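
The independence of the Sobolev escape probability from the line shape can be checked numerically. The sketch below is illustrative, with assumed names and parameters: it compares the closed form of equation (31.100) with a direct midpoint-rule average of $e^{-\tau_\lambda}$ over a narrow Gaussian profile in $x = \ln(\lambda/\lambda_0)$, the Gaussian being just one convenient choice of normalized profile.

```python
import math

def sobolev_escape(tau_S):
    """Sobolev escape probability (1 - exp(-tau_S)) / tau_S, equation (31.100)."""
    if tau_S == 0.0:
        return 1.0
    # expm1 keeps precision when tau_S is small.
    return -math.expm1(-tau_S) / tau_S

def escape_by_quadrature(tau_S, width=1.0e-3, n=20000):
    """Average exp(-tau_lambda) over a Gaussian line profile of the given width in ln(lambda)."""
    span = 8.0 * width                   # integrate over +/- 8 standard deviations
    dx = 2.0 * span / n
    total = 0.0
    for i in range(n):
        x = -span + (i + 0.5) * dx       # midpoint rule
        phi = math.exp(-0.5 * (x / width) ** 2) / (width * math.sqrt(2.0 * math.pi))
        # Fraction of the profile at longer wavelength than x, as in equation (31.99).
        redward = 0.5 * math.erfc(x / (width * math.sqrt(2.0)))
        total += math.exp(-tau_S * redward) * phi * dx
    return total

closed_form = sobolev_escape(5.0)
numerical = escape_by_quadrature(5.0)
```

Changing `width` leaves the numerical result unchanged to within the quadrature error, consistent with the profile-independence of equation (31.100).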
For large Sobolev optical depth $\tau_S$, the Sobolev escape probability approximates the reciprocal of the
Sobolev optical depth (31.98),

$$ P_S \approx \frac{1}{\tau_S} \quad (\tau_S \gg 1) . \tag{31.101} $$

The rate per unit time and volume at which photons are emitted and escape is then

$$ n_2 A_{21} P_S = \frac{n_2}{n_1}\, \frac{g_1}{g_2}\, \frac{8\pi H}{\lambda_{21}^3} . \tag{31.102} $$


32 


Cosmological perturbations: the hydrodynamic approximation


The simple model in Chapter 30 of the evolution of cosmological perturbations misses some processes that 
affect in observationally distinctive ways the power spectra of fluctuations both of the CMB and of the 
distribution of matter. 

The most important missing element is baryons, which were neglected in Chapter 30 on the grounds
that baryons are gravitationally sub-dominant, having a density $\Omega_b/\Omega_c \approx 1/5$ of the non-baryonic dark
matter density. Photons and baryons are coupled by electron-photon scattering, which causes the photons
and baryons to behave effectively as a single photon-baryon fluid prior to recombination. Baryons add mass
density but no pressure to the photon-baryon fluid, reducing the sound speed of the photon-baryon fluid
below its relativistic limit of $\sqrt{1/3}$, §32.4. The reduction in sound speed becomes greater as the ratio of
matter to radiation density increases after matter-radiation equality. The baryon mass loading enhances
compression (odd) peaks and weakens rarefaction (even) peaks in the power spectrum of the CMB, §32.10.
The change in sound speed modifies the relation between the sound horizon and physical distance, resulting
in observationally distinctive shifts in the locations of peaks as a function of harmonic number in the power
spectrum of the CMB, Figure 34.7. After recombination, baryons decouple from the photons and behave
like matter. Oscillations in the photon-baryon fluid at recombination produce an imprint, called baryon
acoustic oscillations, in the matter power spectrum, Figure 32.4, analogous to the acoustic oscillations in
the CMB power spectrum.

A second important effect missing from the simple model of Chapter 30 is dissipation that results from the 
finite mean free path of electron-photon scattering, which causes photons and baryons not to be perfectly 
coupled, §32.7. Dissipation damps oscillations of the baryon-photon fluid at smaller scales, reducing power 
in higher order peaks in the CMB. 

A third modification is to treat neutrinos separately from photons, §32.11. Like photons, neutrinos are 
relativistic, but unlike photons, neutrinos stream freely. 

A varying sound speed, dissipation, and freely-streaming neutrinos, can all be modelled in a hydrodynamic
approximation that treats the photon-baryon fluid, and the neutrinos, as imperfect fluids. An imperfect
fluid is characterized by the first three moments of its momentum distribution, the monopole, dipole, and
quadrupole, or equivalently the density, bulk velocity, and pressure, but unlike a perfect fluid the pressure is
allowed to be anisotropic. Equations governing the anisotropy can be derived by appealing to a Boltzmann
treatment, Chapter 33. Given the anisotropy, the evolution of the density and bulk velocity of an imperfect
fluid is governed by the equations of conservation of its energy and momentum.

Figure 32.1 (Left) Overdensities $\delta - 3\Phi$, and (right) bulk velocities $v$ in the hydrodynamic approximation as a function
of cosmic scale factor $a/a_{\rm eq}$, at wavenumber $k/(a_{\rm eq} H_{\rm eq}) = 10$, for non-baryonic dark matter (c), baryons (b), photons
($\gamma$), and neutrinos ($\nu$). The cosmological model is the standard model adopted in this book, a flat $\Lambda$CDM model
with concordance parameters $\Omega_\Lambda = 0.69$ and $\Omega_m = 0.31$, and adiabatic initial conditions, §32.3. The overdensities
and velocities of relativistic species are related to their monopole and dipole moments by $\delta_\gamma - 3\Phi = 3(\Theta_0 - \Phi)$,
$\delta_\nu - 3\Phi = 3(\mathcal{N}_0 - \Phi)$, $v_\gamma = 3\Theta_1$, $v_\nu = 3\mathcal{N}_1$. The results may be compared to those in the simple approximation,
Figure 30.2, and from a Boltzmann computation, Figure 33.1.

The approximate anisotropic pressure in the hydrodynamic approximation is not sufficiently accurate to
provide a reliable source for the difference $\Psi - \Phi$ in scalar gravitational potentials. Thus in the hydrodynamic
approximation, as in the simple approximation, the two scalar potentials are set equal, $\Psi = \Phi$.


Figure 32.1 shows the overdensity and bulk velocity of the 4 species, non-baryonic dark matter, baryons, 
photons, and neutrinos, calculated in the hydrodynamic treatment of this Chapter, as a function of cosmic 
scale factor, in a flat $\Lambda$CDM cosmological model at an illustrative wavenumber $k/(a_{\rm eq} H_{\rm eq}) = 10$. Figure 32.2 
shows photon and neutrino multipoles up to the quadrupole $\ell = 2$, the largest multipole computed in the 
hydrodynamic approximation. The hydrodynamic approach yields a fair approximation to more accurate 
calculations that follow higher order multipole moments of the photon and neutrino distributions using the 
Boltzmann equation, Chapter 33. 


This Chapter starts, §32.2, with a summary of the equations in the hydrodynamic approximation. The 
remainder of the Chapter is concerned with finding approximations to the hydrodynamic system of 
equations (32.6)–(32.13), so as to gain a physical understanding of their solutions. 

Figure 32.2 (Left) Photon and (right) neutrino multipoles in the hydrodynamic approximation as a function of cosmic 
scale factor $a/a_{\rm eq}$, at wavenumber $k/(a_{\rm eq} H_{\rm eq}) = 10$. The cosmological model is the same as in Figures 32.1-33.1, 
§32.3. The multipoles may be compared to those from a Boltzmann computation, Figure 33.2. 

Section 32.4 presents the tight-coupling approximation, which effectively treats photons and baryons as a 
single fluid with a common bulk velocity. The tight-coupling approximation, valid well before recombination, 
treats the photon-baryon fluid as a perfect fluid, as in the simple approximation of Chapter 30, but the mass 
density contributed by baryons reduces the sound speed of the fluid. 

Sections 32.6—32.10 examine the consequences of allowing quadrupole anisotropy in the photon distribution 
(shear viscosity), and a small velocity difference between photons and baryons (heat conduction), both of 
which lead to dissipation. 

Section 32.11 considers neutrinos, which stream freely. After recombination, photons also stream freely, as 
do baryons. 


32.1 Electron-photon (Thomson) scattering 


For some time before and after recombination, photons and baryons were coupled principally by nonrelativistic 
electron-photon (Thomson) scattering. The inverse comoving mean free path $l_T^{-1}$ to Thomson scattering 
is 

\[ l_T^{-1} = \bar{n}_e \sigma_T a , \tag{32.1} \]

where $\sigma_T$ is the Thomson cross-section. The Thomson cross-section is proportional to the square of the 
classical electron radius $r_e$, 

\[ \sigma_T = \frac{8\pi}{3} r_e^2 = \frac{8\pi}{3} \left( \frac{e^2}{m_e c^2} \right)^2 . \tag{32.2} \]


The inverse comoving mean free path $l_T^{-1}$ is evaluated in Exercise 32.1. In calculating fluctuations in the 
CMB, Chapter 34, it is convenient to introduce the (dimensionless) Thomson scattering optical depth $\tau$, 
which starts from zero, $\tau = 0$, at the present time, and increases going backwards in time $\eta$ to higher 
redshift, 

\[ \tau \equiv \int_\eta^{\eta_0} \bar{n}_e \sigma_T a \, d\eta . \tag{32.3} \]

The conformal time derivative of the Thomson optical depth $\tau$ equals minus the inverse comoving mean free 
path, 

\[ \dot\tau = - \bar{n}_e \sigma_T a . \tag{32.4} \]


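Exercise 32.1. Thomson scattering rate. Let $f_+$ be the proton fraction (31.5), and $X_e$ be the ionization 
fraction (31.3). Show that the (dimensionless) ratio of the inverse comoving electron-photon (Thomson) mean 
free path $l_T^{-1} = \bar{n}_e \sigma_T a$ to the inverse comoving Hubble distance $a_{\rm eq} H_{\rm eq}/c$ at matter-radiation equality is 

\[ \frac{c \bar{n}_e \sigma_T a}{a_{\rm eq} H_{\rm eq}} = \frac{3 c \sigma_T f_+ X_e H_{\rm eq} \Omega_b}{16\pi G m_p \Omega_m} \left( \frac{a}{a_{\rm eq}} \right)^{-2} = 0.033 \, h f_+ X_e \frac{H_{\rm eq}}{H_0} \frac{\Omega_b}{\Omega_m} \left( \frac{a}{a_{\rm eq}} \right)^{-2} \approx 500 \, X_e \left( \frac{a}{a_{\rm eq}} \right)^{-2} , \tag{32.5} \]

the Hubble parameter $H_{\rm eq}$ at matter-radiation equality being related to the present-day Hubble parameter 
$H_0$ by equation (30.42). 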


32.2 Summary of equations in the hydrodynamic approximation 


The hydrodynamic approximation is derived by suitably truncating the full set of Boltzmann equations, 
§33.1, at the quadrupole moment. The equations governing the evolution of scalar fluctuations in non- 
baryonic cold dark matter, baryons, photons, and neutrinos at comoving wavenumber k in the hydrodynamic 
approximation are as follows (compare to the equations in the simple approximation, §30.7, and in a full 
Boltzmann treatment, §33.1). The equations for non-baryonic cold dark matter (c) follow from conservation 
of energy-momentum, and are the same as those (30.53) in the simple approximation (recall that overdot 
signifies the derivative $d/d\eta$ with respect to conformal time), 

\[ \dot\delta_c - k v_c - 3 \dot\Phi = 0 , \tag{32.6a} \]
\[ \dot{v}_c + \frac{\dot{a}}{a} v_c + k \Psi = 0 . \tag{32.6b} \]

Equations for baryons (b) are similar to those (32.6) for the non-baryonic dark matter, except that 
photon-electron scattering causes a transfer of momentum between photons and baryons when their bulk velocities 
are not equal, 

\[ \dot\delta_b - k v_b - 3 \dot\Phi = 0 , \tag{32.7a} \]
\[ \dot{v}_b + \frac{\dot{a}}{a} v_b + k \Psi = - \frac{|\dot\tau|}{R} ( v_b - 3 \Theta_1 ) , \tag{32.7b} \]

where $R$ is $\tfrac{3}{4}$ the baryon-to-photon density ratio, equation (32.46). The equations of conservation of energy 
and momentum of photons ($\gamma$) are 

\[ \dot\Theta_0 - k \Theta_1 - \dot\Phi = 0 , \tag{32.8a} \]
\[ \dot\Theta_1 + \frac{k}{3} ( \Theta_0 - 2 \Theta_2 ) + \frac{k}{3} \Psi = \frac{1}{3} |\dot\tau| ( v_b - 3 \Theta_1 ) . \tag{32.8b} \]


The photon quadrupole moment $\Theta_2$ can be approximated by an expression that interpolates between the 
tight-coupling limit $|\dot\tau| \gg k_s$, equation (33.83), and the free-streaming limit $|\dot\tau| \ll k_s$, equation (33.84), 
where $k_s$ is an interpolation constant, which numerical comparison to full Boltzmann computations indicates 
is adequately approximated by twice the inverse Hubble distance at recombination, $k_s \approx 2 a_{\rm rec} H_{\rm rec}$ (or 
$k_s \approx a_{\rm eq} H_{\rm eq}$, for standard $\Lambda$CDM cosmological parameters), 


o _ 1 |+|? otisht r ofree 32 9 
2 IAE A a a a) 
i 8k reg 3 
o$ ght == Ae ; of = (Qo H Vv) m . (32.9b) 


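As commented after equation (32.67), the factor $\tfrac{8}{15}$ in equation (32.9b) includes the effect of polarization; 
without polarization, the factor is $\tfrac{4}{9}$. Energy-momentum conservation of neutrinos ($\nu$) implies 

\[ \dot{\mathcal{N}}_0 - k \mathcal{N}_1 - \dot\Phi = 0 , \tag{32.10a} \]
\[ \dot{\mathcal{N}}_1 + \frac{k}{3} ( \mathcal{N}_0 - 2 \mathcal{N}_2 ) + \frac{k}{3} \Psi = 0 . \tag{32.10b} \]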


The neutrino quadrupole $\mathcal{N}_2$ may be approximated by, equation (34.50), 


M =- (Mo + Y) N. (32.11) 

The Einstein energy equation is 

\[ - k^2 \Phi - 3 \frac{\dot{a}}{a} F = 4\pi G a^2 \left( \bar\rho_c \delta_c + \bar\rho_b \delta_b + 4 \bar\rho_\gamma \Theta_0 + 4 \bar\rho_\nu \mathcal{N}_0 \right) , \tag{32.12} \]

where $F$ is defined by equation (30.56). The non-vanishing photon and neutrino quadrupoles $\Theta_2$ and $\mathcal{N}_2$ 
are a source for the difference $\Psi - \Phi$ in scalar gravitational potentials, equation (29.49d). However, the 
hydrodynamic approximations (32.9) and (32.11) are not sufficiently accurate to serve as a reliable source 
for $\Psi - \Phi$. Therefore in the hydrodynamic approximation the two potentials are set equal, as in the simple 
approximation (30.58), 

\[ \Psi = \Phi . \tag{32.13} \]


Exercise 32.2. Program the equations in the hydrodynamic approximation. Upgrade the code you 
wrote in Exercise 30.11 to implement the hydrodynamic approximation, equations (32.6)–(32.13). Explore 
the evolution of the gravitational potential $\Phi$, and of the 4 species of mass-energy, non-baryonic dark matter, 
baryons, photons, and neutrinos. You will upgrade this code to a Boltzmann code in Exercise 33.1. 
Solution. See Figures 32.1, 32.2, and 32.3. 

Figure 32.3 (Left) Overdensities $\delta_c - 3\Phi$ and $\delta_b - 3\Phi$ of non-baryonic dark matter (brown) and baryonic matter (green), 
and (right) radiation monopole $\Theta_0 - \Phi$ (blue), and minus twice the scalar potential, $-2\Psi$ (black), as a function 
of cosmic scale factor $a$ in the hydrodynamic approximation. Curves are labelled with the comoving wavenumber 
$k/(a_{\rm eq} H_{\rm eq})$ in units of the Hubble distance at matter-radiation equality. The results may be compared to those in the 
simple approximation, Figure 30.1, and using a Boltzmann computation, Figure 33.3. 

Figure 32.4 Model matter power spectrum computed in the hydrodynamic approximation, compared to observations 
from the North (N) and South (S) Galactic Caps of the Sloan Digital Sky Survey IV (Gil-Marín et al., 2020). The 
predicted power spectrum has been multiplied, arbitrarily, by a bias factor of $b^2 = 1.32^2$. The model power spectrum 
may be compared to those computed in the simple approximation, Figure 30.15, and from a Boltzmann computation, 
Figure 33.5. 


Exercise 32.3. Power spectrum of matter fluctuations: hydrodynamic approximation. Upgrade 
the code you wrote in Exercise 30.16 to compute the power spectrum of matter fluctuations in the hydrody- 
namic approximation. Comment on how the power spectrum differs from that in the simple approximation. 
Solution. See Figure 32.4. The cosmological model is the standard flat $\Lambda$CDM model described in §32.3. 
The model power spectrum differs from that in the simple approximation, Figure 30.15, firstly in that power 
is slightly reduced at smaller scales (larger wavenumbers k), and secondly in that the power spectrum shows 
wiggles, commonly called baryon acoustic oscillations, or BAO. Both effects arise from the finite contribution 
of baryons to the matter power spectrum. 

The possibility of scale-dependent bias between galaxies and matter, coupled with the effects of nonlinear 
growth of power, complicates the relation between the observed galaxy power spectrum and the linear matter 
power spectrum. BAO persist in the presence of scale-dependent bias, providing a cosmic ruler that links the 
comoving scale of distance in galaxy clustering to that in the CMB. Gil-Marín et al. (2020) were interested 
primarily in the scale of the BAO. In Figure 10.4 they allowed for possible scale-dependent bias and incipient 
non-linearity by applying a more or less arbitrary polynomial correction to the model power spectrum. 


Exercise 32.4. Effect of massive neutrinos on the matter power spectrum. 
1. Incorporate 1 or more species of massive neutrino into the Friedmann equations describing the evolution 
of the background FLRW geometry. 
2. Compute the effect of 1 or more species of massive neutrino on the matter power spectrum. For simplicity, 
assume an abrupt transition of the neutrino evolution equations from relativistic to non-relativistic. 
Solution 
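1. Neutrinos decouple while relativistic, at around $e\bar{e}$-annihilation, and inherit a relativistic thermodynamic 
distribution from that time. Since decoupling, neutrinos have free-streamed, with particle momenta $p$ and 
temperature $T$ redshifting as $p \propto T \propto a^{-1}$. The energy density $\rho(m,T)$ of a single species of neutrino 
of mass $m$ at temperature $T$ is (units $c = 1$) 

\[ \rho(m,T) = \int \frac{\sqrt{p^2 + m^2}}{e^{p/T} + 1} \, \frac{4\pi p^2 dp}{(2\pi\hbar)^3} = \frac{7\pi^2 T^4}{240 \hbar^3} \, R(m/T) , \tag{32.14} \]

where $R(\mu)$ is the integral 

\[ R(\mu) \equiv \frac{120}{7\pi^4} \int_0^\infty \frac{\sqrt{x^2 + \mu^2} \, x^2 dx}{e^x + 1} \rightarrow \begin{cases} 1 & \mu \rightarrow 0 , \\ \alpha\mu & \mu \rightarrow \infty , \end{cases} \tag{32.15} \]

with 

\[ \alpha \equiv \frac{180 \, \zeta(3)}{7\pi^4} = 0.3173 . \tag{32.16} \]

The neutrino pressure $p(m,T)$ (not to be confused with neutrino particle momentum $p$) is 

\[ p(m,T) = \int \frac{p^2}{3\sqrt{p^2 + m^2}} \, \frac{1}{e^{p/T} + 1} \, \frac{4\pi p^2 dp}{(2\pi\hbar)^3} , \tag{32.17} \]

which can be expressed in terms of the neutrino density $\rho(m,T)$ as 

\[ p(m,T) = \frac{1}{3} \left[ \rho(m,T) - m \frac{\partial\rho(m,T)}{\partial m} \right] . \tag{32.18} \]

An approximation good to 1% for the density $\rho(m,T)$, and which yields an approximation good to 4% 
for the pressure $p(m,T)$ given in terms of $\rho(m,T)$ by formula (32.18), is 

\[ R(\mu) \approx \sqrt{ \frac{1 + \beta\mu^2 + \gamma\alpha^2\mu^4}{1 + \gamma\mu^2} } , \tag{32.19} \]

where the constants $\alpha$ (equation (32.16)), $\beta$, $\gamma$ are chosen such that both density $\rho$ and pressure $p$ have 
the correct asymptotic behaviour at both $\mu \rightarrow 0$ and $\mu \rightarrow \infty$, 

\[ \beta = \frac{10}{7\pi^2} + \gamma = 0.2902 , \quad \gamma = \frac{10 \left[ -7\pi^6 + 3240 \, \zeta(3)^2 \right]}{49\pi^8 - 486000 \, \zeta(3)\zeta(5)} = 0.1454 . \tag{32.20} \]

A simple approximation that reproduces the correct asymptotic behaviour of the density $\rho(m,T)$ at large 
and small temperature is to adopt an abrupt change from relativistic to non-relativistic at $T = \alpha m$, 

\[ \rho(m,T) \approx \frac{7\pi^2 T^3}{240 \hbar^3} \begin{cases} T & T > \alpha m , \\ \alpha m & T < \alpha m . \end{cases} \tag{32.21} \]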


2. The approximation (32.21) for the neutrino density suggests adopting an abrupt transition of the 
neutrino evolution equations from relativistic, equations (32.10) and (32.11), to non-relativistic at $T = \alpha m$, 
with $\alpha$ from equation (32.16). The non-relativistic equations are as for non-baryonic cold dark matter, 
equations (32.6). Conservation of energy and momentum at the transition requires that the neutrino 
overdensity $\delta_\nu$ and bulk velocity $v_\nu$ are 

\[ \delta_\nu = 3 \mathcal{N}_0 , \tag{32.22a} \]
\[ v_\nu = 3 \mathcal{N}_1 . \tag{32.22b} \]
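
The fit (32.19) and the abrupt-transition approximation (32.21) can be checked numerically. The following is a minimal sketch, not the code used to produce the figures of this Chapter: the function names `r_exact`, `r_fit`, and `r_abrupt`, the Simpson integration, and the cutoff `x_max` are illustrative choices, and the argument is taken to be $\mu = m/T$.

```python
import math

ZETA3 = 1.2020569031595943   # Riemann zeta(3)
ZETA5 = 1.0369277551433699   # Riemann zeta(5)

# Constants of equations (32.16) and (32.20).
ALPHA = 180.0 * ZETA3 / (7.0 * math.pi**4)
GAMMA = (10.0 * (-7.0 * math.pi**6 + 3240.0 * ZETA3**2)
         / (49.0 * math.pi**8 - 486000.0 * ZETA3 * ZETA5))
BETA = 10.0 / (7.0 * math.pi**2) + GAMMA


def r_exact(mu, x_max=60.0, steps=6000):
    """R(mu) of equation (32.15), by Simpson's rule on [0, x_max] (steps must be even)."""
    def integrand(x):
        ex = math.exp(-x)                      # 1/(e^x + 1) = e^-x / (1 + e^-x)
        return math.sqrt(x * x + mu * mu) * x * x * ex / (1.0 + ex)

    h = x_max / steps
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 120.0 / (7.0 * math.pi**4) * total * h / 3.0


def r_fit(mu):
    """Rational-function fit of equation (32.19)."""
    return math.sqrt((1.0 + BETA * mu**2 + GAMMA * ALPHA**2 * mu**4)
                     / (1.0 + GAMMA * mu**2))


def r_abrupt(mu):
    """Abrupt transition of equation (32.21): relativistic for T > alpha m, else non-relativistic."""
    return max(1.0, ALPHA * mu)


if __name__ == "__main__":
    for mu in (0.3, 1.0, 3.0, 10.0, 30.0):
        print(mu, r_exact(mu), r_fit(mu), r_abrupt(mu))
```

The abrupt form is continuous at $\mu = 1/\alpha$ but, unlike the fit, it misses the smooth changeover around $\mu \sim 1/\alpha$, which is the price of its simplicity.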


32.3 Standard cosmological parameters 


Unless otherwise stated, all computations of cosmological perturbations carried out in this book are for 
a standard flat $\Lambda$CDM cosmological model with parameters consistent with those reported by the Planck 
collaboration (Aghanim et al., 2018). This section gives the standard parameters adopted in this book. 
The CMB power spectrum constrains the physical density $\Omega h^2$ of dark matter and baryonic components 
more precisely than the density $\Omega$ relative to the critical density. The physical matter densities $\Omega_c h^2$ of 
non-baryonic cold dark matter and $\Omega_b h^2$ of baryonic matter today are taken to be, in the standard model, 

\[ \Omega_c h^2 = 0.12 , \quad \Omega_b h^2 = 0.022 . \tag{32.23} \]
The conversion factor between $\Omega h^2$ and mass density $\rho$ today is 


_ 30H? 
— 8rGe? 
The matter density $\Omega_m$ in non-baryonic cold dark matter and baryonic components today is taken to be, in 
the standard model, 


p = 6.44932 x 107” Oh? kgm? . (32.24) 


\[ \Omega_m = 0.31 . \tag{32.25} \]

The individual non-baryonic cold dark matter and baryonic densities are then 

\[ \Omega_c = 0.262 , \quad \Omega_b = 0.048 . \tag{32.26} \]

Together, equations (32.23) and (32.25) yield a Hubble parameter $H_0$ today of 

\[ H_0 = 100 \, h \ {\rm km \, s^{-1} \, Mpc^{-1}} = 67.7 \ {\rm km \, s^{-1} \, Mpc^{-1}} . \tag{32.27} \]

The CMB temperature $T_0$ today is (Fixsen, 2009) 

\[ T_0 = 2.7255 \ {\rm K} , \tag{32.28} \]

implying a physical photon density of, Exercise 10.2, 

\[ \Omega_\gamma h^2 = 2.4728 \times 10^{-5} . \tag{32.29} \]


The standard model adopted here assumes $N_{\rm eff} = 3$ species of massless neutrino that decouple just 
before electron-positron annihilation, implying that the neutrino temperature after $e\bar{e}$-annihilation is $T_\nu/T_\gamma = 
(4/11)^{1/3}$, Exercise 10.20. The energy-weighted effective number of relativistic particle species at 
recombination is then, equation (10.152b), 

\[ g_\rho = 2 \left[ 1 + \frac{7}{8} \left( \frac{4}{11} \right)^{4/3} N_{\rm eff} \right] = 3.36 . \tag{32.30} \]


In reality, neutrinos are not quite decoupled by $e\bar{e}$-annihilation. In a more accurate treatment, the neutrino 
temperature after $e\bar{e}$-annihilation is slightly larger than $T_\nu/T_\gamma = (4/11)^{1/3}$, and the effective number $g_\rho$ of 
relativistic species at recombination is correspondingly slightly larger. It is conventional to quote the increase 
in $g_\rho$ as if it were an increase in the effective number of neutrino types in equation (32.30), $N_{\rm eff} = 3.04$ 
(Mangano et al., 2002). In the approximation of $N_{\rm eff} = 3$ massless neutrinos adopted here, the physical 
density of neutrinos today is 

\[ \Omega_\nu h^2 = 1.68 \times 10^{-5} . \tag{32.31} \]


The ratio of the physical matter density from equations (32.23) to the physical radiation density implied by 
equations (32.29) and (32.31) implies a redshift of matter-radiation equality of 

\[ 1 + z_{\rm eq} = 3415 . \tag{32.32} \]
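
The derived numbers quoted so far follow from the three inputs (32.23), (32.25), and (32.29) by simple arithmetic. The sketch below reproduces that arithmetic; the function name `derived_parameters` and the dictionary keys are illustrative, not taken from the book's code.

```python
# Inputs of the standard model of this section.
OMEGA_C_H2 = 0.12            # physical cold dark matter density, equation (32.23)
OMEGA_B_H2 = 0.022           # physical baryon density, equation (32.23)
OMEGA_M = 0.31               # total matter density, equation (32.25)
OMEGA_GAMMA_H2 = 2.4728e-5   # physical photon density, equation (32.29)
N_EFF = 3                    # number of massless neutrino species


def derived_parameters():
    """Return h, Omega_c, Omega_b, g_rho, Omega_nu h^2 and 1 + z_eq."""
    omega_m_h2 = OMEGA_C_H2 + OMEGA_B_H2
    h2 = omega_m_h2 / OMEGA_M                       # since Omega_m h^2 = omega_m_h2
    # Neutrino-to-photon energy density ratio for T_nu/T_gamma = (4/11)^(1/3).
    nu_per_photon = (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0) * N_EFF
    omega_nu_h2 = OMEGA_GAMMA_H2 * nu_per_photon
    return {
        "h": h2 ** 0.5,
        "omega_c": OMEGA_C_H2 / h2,
        "omega_b": OMEGA_B_H2 / h2,
        "g_rho": 2.0 * (1.0 + nu_per_photon),       # equation (32.30)
        "omega_nu_h2": omega_nu_h2,                 # equation (32.31)
        "one_plus_zeq": omega_m_h2 / (OMEGA_GAMMA_H2 + omega_nu_h2),  # equation (32.32)
    }


if __name__ == "__main__":
    for name, value in derived_parameters().items():
        print(name, value)
```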


If neutrinos have masses as indicated by neutrino oscillation data, §42.4.15, then at least 2 of the 3 neutrino 
species are non-relativistic today, even though they were relativistic at recombination. If the third species is 
taken to be massless, then the neutrino masses are 

\[ m_{\nu_1} = 0 \ {\rm eV} , \quad m_{\nu_2} = 0.01 \ {\rm eV} , \quad m_{\nu_3} = 0.05 \ {\rm eV} . \tag{32.33} \]
The corresponding physical neutrino mass density today is, in place of equation (32.31), 

\[ \Omega_\nu h^2 = 1.3 \times 10^{-3} . \tag{32.34} \]

The assumption of spatial flatness implies vanishing spatial curvature, 

\[ \Omega_k = 0 . \tag{32.35} \]

In the standard $\Lambda$CDM model, the remaining density is taken to be vacuum energy, equivalent to a 
cosmological constant, with density 

\[ \Omega_\Lambda = 1 - \Omega_k - \Omega_m - \Omega_r = 0.69 . \tag{32.36} \]

Recombination is affected by the helium mass fraction $Y_{\rm ^4He} \equiv \rho_{\rm ^4He}/(\rho_{\rm H} + \rho_{\rm ^4He})$, taken to be (Cyburt et al., 
2016) 

\[ Y_{\rm ^4He} = 0.25 . \tag{32.37} \]

Given the helium fraction (32.37) and the Peebles approximation to recombination, §31.8, along with the 
standard parameters adopted here, the redshift of recombination, where the Thomson optical depth is unity, 
is 

\[ 1 + z_{\rm rec} = 1092 . \tag{32.38} \]


In integrating the simple or hydrodynamic or Boltzmann equations, §30.7 or §32.2 or §33.1, I find it 
convenient to work in units where the scale factor and Hubble parameter are one at matter-radiation equality, 
$a_{\rm eq} = H_{\rm eq} = 1$. With the standard parameters adopted here, the scale factor and Hubble parameter today 
are related to those at matter-radiation equality by 

\[ \frac{a_0}{a_{\rm eq}} = 3415 , \quad \frac{H_0}{H_{\rm eq}} = 6.363 \times 10^{-6} . \tag{32.39} \]

The scale factor and Hubble parameter at recombination, where the Thomson optical depth $\tau$ is one, are 
related to those at matter-radiation equality by 

\[ \frac{a_{\rm rec}}{a_{\rm eq}} = 3.13 , \quad \frac{H_{\rm rec}}{H_{\rm eq}} = 0.147 . \tag{32.40} \]


The Hubble distance today is related to those at matter-radiation equality and at recombination by 

\[ \frac{c}{a_0 H_0} = 46.0 \, \frac{c}{a_{\rm eq} H_{\rm eq}} = 21.1 \, \frac{c}{a_{\rm rec} H_{\rm rec}} . \tag{32.41} \]

In cosmology, distances are commonly reported in units of $h^{-1}$ Mpc, or, if the Hubble parameter $h$ today 
is considered to be known, in Mpc. The Hubble distance today is 

\[ \frac{c}{a_0 H_0} = 2997.92458 \, h^{-1} \ {\rm Mpc} = 4.43 \ {\rm Gpc} . \tag{32.42} \]

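The horizon distance today is 

\[ \eta_0 = 147 \, \frac{c}{a_{\rm eq} H_{\rm eq}} = 3.20 \, \frac{c}{a_0 H_0} = 9600 \, h^{-1} \ {\rm Mpc} = 14.2 \ {\rm Gpc} . \tag{32.43} \]

The age of the Universe today is, equation (10.15), 

\[ t_0 = 0.955 \, H_0^{-1} = 13.8 \ {\rm Gyr} . \tag{32.44} \]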
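
The ratios (32.39)–(32.41) and the age (32.44) can be recovered to the quoted precision with a few lines of arithmetic. The sketch below makes two simplifying assumptions that are not part of the text: the vacuum energy is neglected at and before recombination, so that $H^2 \propto a^{-3} + a^{-4}$ in units $a_{\rm eq} = H_{\rm eq} = 1$, and radiation is neglected in the age, for which the closed-form flat matter-plus-$\Lambda$ expression is used. All function names are illustrative.

```python
import math

OMEGA_M = 0.31
OMEGA_LAMBDA = 0.69
ONE_PLUS_ZEQ = 3415.0
ONE_PLUS_ZREC = 1092.0
H_LITTLE = 0.677
HUBBLE_TIME_100_GYR = 9.77792    # 1 / (100 km/s/Mpc) expressed in Gyr


def h0_over_heq():
    """H_0/H_eq: at equality the density is twice the matter density (Lambda neglected)."""
    return 1.0 / math.sqrt(2.0 * OMEGA_M * ONE_PLUS_ZEQ**3)


def hubble_in_eq_units(a):
    """H/H_eq at scale factor a/a_eq = a, for matter plus radiation only."""
    return math.sqrt(0.5 * (a**-3 + a**-4))


def hubble_distance_today_in_eq_units():
    """(c / a_0 H_0) divided by (c / a_eq H_eq)."""
    return 1.0 / (ONE_PLUS_ZEQ * h0_over_heq())


def age_in_hubble_times():
    """t_0 H_0 for a flat matter + Lambda model, neglecting radiation."""
    return (2.0 / (3.0 * math.sqrt(OMEGA_LAMBDA))
            * math.asinh(math.sqrt(OMEGA_LAMBDA / OMEGA_M)))


if __name__ == "__main__":
    a_rec = ONE_PLUS_ZEQ / ONE_PLUS_ZREC
    print("H_0/H_eq         ", h0_over_heq())
    print("a_rec/a_eq       ", a_rec)
    print("H_rec/H_eq       ", hubble_in_eq_units(a_rec))
    print("Hubble dist ratio", hubble_distance_today_in_eq_units())
    print("age [Gyr]        ", age_in_hubble_times() * HUBBLE_TIME_100_GYR / H_LITTLE)
```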


32.4 The photon-baryon fluid in the tight-coupling approximation 


Prior to recombination, non-relativistic electron-photon (Thomson) scattering kept photons tightly coupled 
to electrons, and Coulomb scattering kept electrons tightly coupled to baryons (nuclei, mostly protons and 
helium ions). Thus photons and baryons/electrons behaved as a single tightly-coupled fluid. The baryonic 
fluid contributed negligible pressure to the combined baryon-photon fluid, but it contributed a finite energy 
density that became increasingly important as recombination approached. The density loading decreased 
the sound speed of the photon-baryon fluid, equation (32.50), to the point that at recombination the sound 
speed was about 80% of the negligible-baryon sound speed of $\sqrt{1/3}$. 

Electron-photon scattering transfers momentum between the baryonic fluid and photons, but it does 
not transfer energy, since the baryons and electrons, being non-relativistic, have negligible pressure, so their 
energy is just that of their rest mass. Consequently the energy conservation equation (30.13a) holds separately 
for each of the photon and baryon fluids. However, the exchange of momentum means that the momentum 
equation (30.13b) does not hold separately for each fluid. Rather, electron-photon scattering couples the 
fluids so that their bulk velocities are the same to a good approximation, 


\[ v_b = v_\gamma , \tag{32.45} \]

the photon bulk velocity being related to the photon dipole by $v_\gamma = 3\Theta_1$, equation (30.15). The 
approximation (32.45) is called the tight-coupling approximation. The right panel of Figure 32.1 illustrates that the 
equality (32.45) of baryon and photon bulk velocities holds up to recombination, but then breaks down as 
the scattering mean free path becomes large, and baryons and photons are released from each other’s grasp. 

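Define $R$ to be $\tfrac{3}{4}$ the baryon-to-photon density ratio, 

\[ R \equiv \frac{3 \bar\rho_b}{4 \bar\rho_\gamma} = R_{\rm eq} \frac{a}{a_{\rm eq}} , \quad R_{\rm eq} \equiv \frac{3 g_\rho \Omega_b}{8 \Omega_m} \approx 0.2 , \tag{32.46} \]

with $g_\rho = 3.36$ being the energy-weighted effective number of relativistic particle species at around the 
time of recombination, equation (10.152b). The energy flux of the combined photon-baryon fluid is, from 
equation (30.9b), 

\[ f_\gamma + f_b = ( \bar\rho_\gamma + \bar{p}_\gamma ) v_\gamma + \bar\rho_b v_b = \tfrac{4}{3} \bar\rho_\gamma ( 1 + R ) v , \tag{32.47} \]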


where $v$ is the common bulk velocity of the photon-baryon fluid. The equation of momentum conservation of 
the combined baryon-photon fluid is then a sum of the photon-velocity equation (30.13b) with $w = 1/3$, and 
$R$ times the baryon-velocity equation (30.13b) with $w = 0$. The resulting momentum conservation equation 
is 

\[ ( 1 + R ) \dot{v} + R \frac{\dot{a}}{a} v + \frac{k}{3} \delta_\gamma = - k ( 1 + R ) \Psi . \tag{32.48} \]


Combining the photon energy conservation equation (30.13a) with the momentum conservation equation (32.48), 
and substituting $\delta_\gamma = 3\Theta_0$, equation (30.15), yields 

\[ \left[ \frac{d^2}{d\eta^2} + \frac{R}{1 + R} \frac{\dot{a}}{a} \frac{d}{d\eta} + \frac{k^2}{3 ( 1 + R )} \right] ( \Theta_0 - \Phi ) = - \frac{k^2}{3 ( 1 + R )} \left[ ( 1 + R ) \Psi + \Phi \right] , \tag{32.49} \]

which coincides with equation (30.14) for $w = 1/[3(1 + R)]$, and which goes over to the earlier radiation 
equation (30.48) in the limit $R \rightarrow 0$ of negligible baryons. The term proportional to the first derivative $d/d\eta$ 
on the left hand side of equation (32.49) is an adiabatic damping term. In the absence of this term, and in 
the absence of a driving potential, equation (32.49) would reduce to a wave equation with sound speed 

\[ c_s = \sqrt{ \frac{1}{3 ( 1 + R )} } . \tag{32.50} \]

The coefficient of the adiabatic damping term in equation (32.49) is, given that $R \propto a$, equation (32.46), 

\[ \frac{R}{1 + R} \frac{\dot{a}}{a} = \frac{\dot{R}}{1 + R} = - 2 \frac{\dot{c}_s}{c_s} . \tag{32.51} \]


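The sound horizon distance $\eta_s$ is defined to be the distance travelled by a sound wave since the initial 
time $\eta = 0$, 

\[ \eta_s \equiv \int_0^\eta c_s \, d\eta . \tag{32.52} \]

Recast in terms of the sound horizon distance $\eta_s$, the differential equation (32.49) is 

\[ \left( \frac{d^2}{d\eta_s^2} - \frac{c_s'}{c_s} \frac{d}{d\eta_s} + k^2 \right) ( \Theta_0 - \Phi ) = - k^2 \left[ ( 1 + R ) \Psi + \Phi \right] , \tag{32.53} \]

where prime ′ denotes derivatives with respect to the sound horizon distance, $c_s' = dc_s/d\eta_s$. 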
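
The sound speed (32.50) and sound horizon (32.52) are easily evaluated numerically. The following sketch works in units $a_{\rm eq} = H_{\rm eq} = 1$ and assumes a background of matter plus radiation only, for which $d\eta = \sqrt{2} \, da / \sqrt{1 + a}$ and hence $\eta = 2\sqrt{2} \, ( \sqrt{1 + a} - 1 )$; that assumption, the value $R_{\rm eq} = 0.195$ obtained from equation (32.46) with the standard parameters of §32.3, and the function names are illustrative choices rather than part of the text.

```python
import math

R_EQ = 0.195   # R at equality, 3 g_rho Omega_b / (8 Omega_m), equation (32.46)


def sound_speed(a):
    """Sound speed (32.50) of the photon-baryon fluid at scale factor a/a_eq = a."""
    return 1.0 / math.sqrt(3.0 * (1.0 + R_EQ * a))


def conformal_time(a):
    """Conformal time eta(a) for matter plus radiation, in units of 1/(a_eq H_eq)."""
    return 2.0 * math.sqrt(2.0) * (math.sqrt(1.0 + a) - 1.0)


def sound_horizon(a, steps=2000):
    """Sound horizon (32.52), integrating c_s d(eta) over scale factor by Simpson's rule."""
    def integrand(x):
        return sound_speed(x) * math.sqrt(2.0) / math.sqrt(1.0 + x)

    h = a / steps                       # steps must be even
    total = integrand(0.0) + integrand(a)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    a_rec = 3.13
    print("c_s(rec) relative to sqrt(1/3):", sound_speed(a_rec) * math.sqrt(3.0))
    print("eta(rec), eta_s(rec):", conformal_time(a_rec), sound_horizon(a_rec))
```

Because $c_s$ decreases monotonically with $a$, the sound horizon is bracketed by $c_s(a) \, \eta < \eta_s < \eta / \sqrt{3}$, which gives a quick sanity check on any numerical result.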


32.5 WKB approximation 


Equation (32.53) is an equation for a forced, damped harmonic oscillator. The forcing terms are those on 
the right hand side of the equation, while the damping term is the first derivative term on the left hand 
side. There is a general method, called the WKB approximation (Wentzel, 1926; Kramers, 1926; Brillouin, 
1926), to obtain the homogeneous solutions for a damped harmonic oscillator when the damping rate is small 
compared to the frequency. 

Denote the coefficient of the damping term by $2\kappa$. The homogeneous version of equation (32.53) is then 

\[ \left( \frac{d^2}{d\eta_s^2} + 2\kappa \frac{d}{d\eta_s} + k^2 \right) ( \Theta_0 - \Phi ) = 0 . \tag{32.54} \]

In the case being considered, the damping rate $\kappa$ is the adiabatic rate 

\[ \kappa = - \frac{1}{2} \frac{c_s'}{c_s} , \tag{32.55} \]

but the WKB method works for more general $\kappa$, provided that $\kappa$ is small compared to the wavenumber of the 
sound wave, $\kappa \ll k$. The homogeneous wave equation (32.54) can be solved approximately by introducing a 
frequency $\omega$ defined by 

\[ \Theta_0 - \Phi \propto e^{\int \omega \, d\eta_s} . \tag{32.56} \]

The homogeneous wave equation (32.54) is then equivalent to 

\[ \omega' + \omega^2 + 2\kappa\omega + k^2 = 0 . \tag{32.57} \]

To the extent that the damping parameter is much smaller than the frequency, $\kappa \ll k \sim \omega$, the frequency $\omega$ 
is approximately constant, so that $\omega'$ can be neglected in equation (32.57). With $\omega'$ neglected, the solution 
of equation (32.57) is 

\[ \omega = - \kappa \pm i \sqrt{k^2 - \kappa^2} \approx - \kappa \pm i k , \tag{32.58} \]

where the last approximation holds because $\kappa \ll k$. Equation (32.58) is called the WKB approximation. 

Thus the homogeneous solutions of the wave equation (32.54) are approximately 

\[ \Theta_0 - \Phi \propto e^{- \int \kappa \, d\eta_s \pm i k \eta_s} . \tag{32.59} \]
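
The quality of the WKB solution (32.59) is easy to test numerically. The sketch below is an illustration under a simplifying assumption that is not made in the text: the damping rate $\kappa$ is taken to be constant, in which case $\int \kappa \, d\eta_s = \kappa \eta_s$. It integrates the homogeneous equation (32.54) with a fourth-order Runge-Kutta scheme and compares the result with the WKB form; the function names and parameter values are illustrative.

```python
import math


def integrate_oscillator(k, kappa, eta_max, steps):
    """RK4 solution of X'' + 2 kappa X' + k^2 X = 0 with X(0) = 1, X'(0) = -kappa."""
    def accel(x, v):
        return -2.0 * kappa * v - k * k * x

    h = eta_max / steps
    x, v = 1.0, -kappa
    xs = [x]
    for _ in range(steps):
        k1x, k1v = v, accel(x, v)
        k2x, k2v = v + 0.5 * h * k1v, accel(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = v + 0.5 * h * k2v, accel(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = v + h * k3v, accel(x + h * k3x, v + h * k3v)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        xs.append(x)
    return xs


def wkb_solution(k, kappa, eta):
    """Real part of the WKB solution (32.59) for constant kappa."""
    return math.exp(-kappa * eta) * math.cos(k * eta)


def max_wkb_error(k=10.0, kappa=0.1, eta_max=5.0, steps=5000):
    """Largest absolute difference between the numerical and WKB solutions."""
    xs = integrate_oscillator(k, kappa, eta_max, steps)
    h = eta_max / steps
    return max(abs(x - wkb_solution(k, kappa, i * h)) for i, x in enumerate(xs))


if __name__ == "__main__":
    print("max |numerical - WKB| =", max_wkb_error())
```

For constant $\kappa$ the only error of the WKB form is the small frequency shift $k - \sqrt{k^2 - \kappa^2} \approx \kappa^2 / (2k)$, so the discrepancy grows slowly with $\eta_s$ and shrinks rapidly as $\kappa / k$ decreases.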


32.5.1 Radiation in the tight-coupling approximation 
In the tight-coupling approximation, the damping rate $\kappa$ in the differential equation (32.54) is the adiabatic 
damping rate (32.55). The integral of the adiabatic damping term is $\int \kappa \, d\eta_s = - \tfrac{1}{2} \ln c_s$, whose exponential 
is 

\[ e^{- \int \kappa \, d\eta_s} = \sqrt{c_s} . \tag{32.60} \]

In the WKB approximation, the homogeneous solutions to the wave equation (32.54) are 

\[ \Theta_0 - \Phi \propto \sqrt{c_s} \, e^{\pm i k \eta_s} . \tag{32.61} \]


This shows that, as the sound speed decreased thanks to the increasing baryon-to-photon density in the 
expanding Universe, the amplitude of a sound wave decreased as the square root of the sound speed. 


32.6 Including quadrupole pressure in the momentum conservation equation 


The tight-coupling approximation treats the photon-baryon fluid as a perfect fluid, that is, the pressure is 
taken to be isotropic in the fluid frame. A better approximation is to allow the photons a small quadrupole 
anisotropy, which allows diffusive dissipation, §32.7. 

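The scalar part of the momentum conservation equation (29.44b) in general depends not only on the 
isotropic pressure $p$, but also on a traceless quadrupole pressure. Let the dimensionless scalar quadrupole $q$ 
be defined by its relation to the trace-free quadrupole component of the energy-momentum tensor, 

\[ T^{ab}_{\rm quad} = \tfrac{3}{2} ( \bar\rho + \bar{p} ) \, q \left( \hat{k}_a \hat{k}_b - \tfrac{1}{3} \delta_{ab} \right) , \quad ( \bar\rho + \bar{p} ) \, q = \left( \hat{k}_a \hat{k}_b - \tfrac{1}{3} \delta_{ab} \right) T^{ab} . \tag{32.62} \]
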
For relativistic species such as photons, the dimensionless quadrupole $q$ is related to the quadrupole moment 
$\Theta_2$ by, equation (33.53d), 

\[ q = - 2 \Theta_2 . \tag{32.63} \]

In the presence of a quadrupole pressure, the momentum conservation equation (29.44b) includes a term 

\[ D_m T^{ma}_{\rm quad} = \frac{1}{a} ( \bar\rho + \bar{p} ) \nabla_a q . \tag{32.64} \]

The net effect is to modify all momentum conservation equations by replacing $\Psi \rightarrow \Psi + q$. The scalar bulk 
velocity equation (30.13b) is thus modified to 

\[ \dot{v} + ( 1 - 3w ) \frac{\dot{a}}{a} v + w k \delta = - k ( \Psi + q ) . \tag{32.65} \]


32.7 Photon diffusion (Silk damping) 


The tight coupling between photons and baryons is not perfect, because the mean free path for electron- 
photon scattering is finite, not zero. The imperfect coupling causes sound waves to damp at scales comparable 
to and below the mean free path. The damping is greater at smaller scales, leading to a systematic reduction in 
CMB power at smaller scales by an approximately Gaussian factor, equation (32.84). The damping reduces 
power, but it does not smooth out the acoustic oscillation structure of the CMB power spectrum, which 
remains intact. 

For photon multipoles $\ell \geq 2$, the electron-photon scattering term on the right hand side of the photon 
Boltzmann hierarchy (33.81) acts as a damping term that tends to drive the multipoles exponentially into 
equilibrium (the solution to the homogeneous equation $\dot\Theta_\ell + |\dot\tau| \Theta_\ell = 0$ is a decaying exponential). As seen 
in §32.4, in the tight-coupling approximation the monopole and dipole oscillate with a natural frequency of 
$\omega = c_s k$, where $c_s$ is the sound speed. These oscillations provide a source that propagates upward to higher 
harmonic numbers $\ell$. For scales much larger than a mean free path, $k/|\dot\tau| \ll 1$, the time derivative is small 
compared to the scattering term, $|\dot\Theta_\ell| \sim c_s k |\Theta_\ell| \ll |\dot\tau \Theta_\ell|$, reflecting the near-equilibrium response of the higher 
harmonics. For multipoles $\ell \geq 2$, the dominant term on the left hand side of the Boltzmann hierarchy (33.81) 
is the lowest order multipole, which acts as a driver. Solution of the Boltzmann equations (33.81) then requires 
that 

\[ \Theta_\ell \sim \frac{k}{|\dot\tau|} \Theta_{\ell - 1} \quad \mbox{for } \ell \geq 2 . \tag{32.66} \]

The relation (32.66) implies that higher order photon multipoles are successively smaller than lower orders, 
$|\Theta_\ell| \ll |\Theta_{\ell - 1}|$, for scales much larger than a mean free path, $k/|\dot\tau| \ll 1$. This accords with the physical 
expectation that electron-photon scattering tends to drive the photon distribution to near isotropy. 

To lowest order, dissipation can be taken into account by including the photon quadrupole $\Theta_2$ in the 
Boltzmann hierarchy (33.81) of photon multipole equations, but still neglecting the higher multipoles, $\Theta_\ell = 0$ 
for $\ell \geq 3$. According to the estimate (32.66), this approximation is valid for scales much larger than a 
mean free path, $k/|\dot\tau| \ll 1$. The approximation of truncating at the quadrupole is equivalent to a diffusion 
approximation. In the diffusion approximation, the photon quadrupole equation (33.81c) reduces to 

\[ \Theta_2 = - \frac{4 k}{9 |\dot\tau|} \Theta_1 . \tag{32.67} \]

Substituted into the photon momentum equation (32.8b), the photon quadrupole $\Theta_2$ (32.67) acts as a source 
of friction on the photon dipole $\Theta_1$. In hydrodynamics of near-equilibrium fluids, such a quadrupole moment 
is called shear viscosity. When polarization is included, which modifies the factor on the right hand side 
of the photon quadrupole equation (33.81c), the factor $\tfrac{4}{9}$ in equation (32.67) is increased by a factor of $\tfrac{6}{5}$ to 
$\tfrac{8}{15}$, equation (35.72), as already adopted in equations (32.9). 

The diffusive damping resulting from a small photon quadrupole $\Theta_2$ conserves the energy and momentum 
of the photon fluid (by itself, irrespective of baryons), so that covariant momentum conservation $D_m T^{mn} = 0$ 
continues to hold true within the photon fluid. The contribution of a quadrupole pressure to the momentum 
conservation equation was discussed in §32.6. 


32.8 Viscous baryon drag damping 


A second source of damping of sound waves, distinct from the photon diffusion of §32.7, arises from the 
viscous drag on photons that results from a small difference $v_b - 3\Theta_1$ between the baryon and photon bulk 
velocities. In contrast to photon diffusion, viscous baryon drag transfers momentum between photons and 
baryons. In hydrodynamics of near-equilibrium fluids, this effect is called heat conduction. 

An expression for the bulk velocity difference $v_b - 3\Theta_1$ follows from either of the momentum conservation 
equations (32.8b) or (32.7b) for photons or baryons, 

\[ v_b - 3\Theta_1 = \frac{3}{|\dot\tau|} \left( \dot\Theta_1 + \frac{k}{3} \Theta_0 + \frac{k}{3} \Psi \right) = - \frac{R}{|\dot\tau|} \left( \dot{v}_b + \frac{\dot{a}}{a} v_b + k \Psi \right) \approx - \frac{3R}{|\dot\tau|} \left( \dot\Theta_1 + \frac{\dot{a}}{a} \Theta_1 + \frac{k}{3} \Psi \right) . \tag{32.68} \]

The bulk velocity difference $v_b - 3\Theta_1$ is small because the scattering factor $|\dot\tau|$ is large. The final approximation 
of equations (32.68) follows from replacing $v_b$ with $3\Theta_1$ to lowest order, which is valid because the expression is 
already of linear order. Taking a linear combination of the second and fourth expressions in equations (32.68) 
so as to eliminate $\dot\Theta_1$ gives 

\[ v_b - 3\Theta_1 = \frac{3R}{( 1 + R ) |\dot\tau|} \left( \frac{k}{3} \Theta_0 - \frac{\dot{a}}{a} \Theta_1 \right) . \tag{32.69} \]

On the right hand side of equation (32.69), the wavenumber $k$ is large compared to $\dot{a}/a$ at the subhorizon 
scales where dissipation is important, so the bulk velocity difference reduces to 

\[ v_b - 3\Theta_1 \approx \frac{R}{1 + R} \frac{k}{|\dot\tau|} \Theta_0 . \tag{32.70} \]

It is tempting to insert the approximation (32.70) directly into the right hand sides of the photon and 
baryon momentum conservation equations (32.8b) and (32.7b), but the result is not of the desired precision, 
since the right hand sides of the momentum equations are multiplied by the large factor $|\dot\tau|$, amplifying 
imprecision in the approximation (32.70). A precise approach is to start with the equation of conservation of 
total momentum of the photon-baryon fluid, which is a sum of the momentum conservation equations (32.8b) 
and (32.7b) for photons and baryons, 

\[ \dot\Theta_1 + \frac{k}{3} ( \Theta_0 - 2 \Theta_2 ) + \frac{k}{3} \Psi + \frac{R}{3} \left( \dot{v}_b + \frac{\dot{a}}{a} v_b + k \Psi \right) = 0 . \tag{32.71} \]

Rewriting the baryon velocity as the photon velocity plus a small difference, $v_b = 3\Theta_1 + (v_b - 3\Theta_1)$, brings 
the momentum conservation equation (32.71) to 

\[ \dot\Theta_1 + \frac{k}{3} ( \Theta_0 - 2 \Theta_2 ) + \frac{k}{3} \Psi + R \left( \dot\Theta_1 + \frac{\dot{a}}{a} \Theta_1 + \frac{k}{3} \Psi \right) + \frac{R}{3} \left( \frac{d}{d\eta} + \frac{\dot{a}}{a} \right) ( v_b - 3\Theta_1 ) = 0 . \tag{32.72} \]

The term in equation (32.72) involving the velocity difference $v_b - 3\Theta_1$ is proportional to 

\[ \left( \frac{d}{d\eta} + \frac{\dot{a}}{a} \right) ( v_b - 3\Theta_1 ) \approx \left( \frac{d}{d\eta} + \frac{\dot{a}}{a} \right) \frac{R k}{( 1 + R ) |\dot\tau|} \Theta_0 \approx \frac{R k}{( 1 + R ) |\dot\tau|} \dot\Theta_0 \approx \frac{R k^2}{( 1 + R ) |\dot\tau|} \Theta_1 , \tag{32.73} \]

in which the approximation (32.70) is inserted, and the last two steps of which retain only the 
dominant term at the subhorizon scales $k\eta \gg 1$ where dissipation is important. 

Substituting the approximation (32.73), and the diffusive approximation (32.67) for the radiation 
quadrupole $\Theta_2$, brings the photon-baryon momentum conservation equation (32.72) to 

\[ ( 1 + R ) \dot\Theta_1 + R \frac{\dot{a}}{a} \Theta_1 + \frac{k}{3} \Theta_0 + ( 1 + R ) \frac{k}{3} \Psi + \frac{k^2}{3 |\dot\tau|} \left( \frac{8}{9} + \frac{R^2}{1 + R} \right) \Theta_1 = 0 . \tag{32.74} \]

The final terms proportional to the comoving Thomson mean free path $1/|\dot\tau|$ on the left hand side of 
equation (32.74) are the dissipative terms. The $8/9$ term is from photon diffusion, while the $R^2/(1 + R)$ term is 
from baryon drag. 


32.9 Photon-baryon wave equation with dissipation 


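Eliminating the dipole $\Theta_1$ in equation (32.74) in favour of the monopole $\Theta_0$ using the photon monopole 
equation (32.8a) yields a second order differential equation for $\Theta_0 - \Phi$, 

\[ \left\{ \frac{d^2}{d\eta^2} + \left[ \frac{R}{1 + R} \frac{\dot{a}}{a} + \frac{k^2}{3 ( 1 + R ) |\dot\tau|} \left( \frac{8}{9} + \frac{R^2}{1 + R} \right) \right] \frac{d}{d\eta} + \frac{k^2}{3 ( 1 + R )} \right\} ( \Theta_0 - \Phi ) = - \frac{k^2}{3 ( 1 + R )} \left[ ( 1 + R ) \Psi + \Phi \right] . \tag{32.75} \]

Recast in terms of the sound horizon distance $\eta_s$ defined by equation (32.52), the wave equation (32.75) 
becomes 

\[ \left\{ \frac{d^2}{d\eta_s^2} + \left[ - \frac{c_s'}{c_s} + \frac{k^2 c_s}{|\dot\tau|} \left( \frac{8}{9} + \frac{R^2}{1 + R} \right) \right] \frac{d}{d\eta_s} + k^2 \right\} ( \Theta_0 - \Phi ) = - k^2 \left[ ( 1 + R ) \Psi + \Phi \right] , \tag{32.76} \]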

where prime ′ denotes derivative with respect to sound horizon distance, $c_s' = dc_s/d\eta_s$. Equations (32.75) 
and (32.76) differ from the earlier dissipation-free equations (32.49) and (32.53) by the inclusion of dissipation 
terms proportional to the Thomson scattering mean free path $l_T = 1/|\dot\tau|$. WKB solution of equations such 
as (32.76) was discussed in §32.5. 

In Exercise 35.7 it is found that polarization increases the photon diffusion contribution in equation (32.76) by a factor of $6/5$, from $8/9$ to $16/15$,

\[
\frac{8}{9} \rightarrow \frac{16}{15} . \tag{32.77}
\]


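The terms proportional to the linear derivative $d/d\eta_s$ in equation (32.76) are damping terms, which may be collected into an overall damping coefficient $\kappa$,

\[
\left(\frac{d^2}{d\eta_s^2} + 2\kappa\,\frac{d}{d\eta_s} + k^2\right)\left(\Theta_0 - \Phi\right) = -k^2\left[(1+R)\Psi + \Phi\right] . \tag{32.78}
\]

The damping coefficient $\kappa$ is a sum of adiabatic $\kappa_a$ and dissipative $\kappa_d$ parts,

\[
\kappa = \kappa_a + \kappa_d , \qquad \kappa_a = -\frac{1}{2}\,\frac{d\ln c_s}{d\eta_s} , \qquad \kappa_d = \frac{k^2 c_s}{2|\dot\tau|}\left(\frac{16}{15} + \frac{R^2}{1+R}\right) . \tag{32.79}
\]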


In the dissipative damping coefficient $\kappa_d$, the $16/15$ term arises from photon diffusion, while the $R^2/(1+R)$ term arises from baryon drag. At recombination, where $R \approx 0.6$, dissipation by photon diffusion and baryon drag are in the ratio $(16/15)/[R^2/(1+R)] \approx 5$. Thus photon diffusion dominates the dissipation, but baryon drag contributes non-negligibly.
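The ratio quoted above is simple arithmetic and can be checked directly. The sketch below assumes only the two bracketed coefficients of the dissipative damping coefficient in equation (32.79); the function name and the `polarized` flag are illustrative choices, not notation from the text.

```python
def dissipation_ratio(R, polarized=True):
    """Ratio of photon-diffusion to baryon-drag damping at baryon-to-photon ratio R."""
    # Photon-diffusion coefficient: 16/15 with polarization, 8/9 without.
    diffusion = 16.0 / 15.0 if polarized else 8.0 / 9.0
    # Baryon-drag coefficient R^2 / (1 + R).
    drag = R**2 / (1.0 + R)
    return diffusion / drag


if __name__ == "__main__":
    print(dissipation_ratio(0.6))                    # with polarization
    print(dissipation_ratio(0.6, polarized=False))   # without polarization
```

At $R = 0.6$ the polarized ratio comes out slightly below 5, and the unpolarized ratio slightly below 4, so the conclusion that photon diffusion dominates does not depend on including polarization.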

In the WKB approximation, §32.5, the homogeneous solutions of equation (32.78) are 


\[
\Theta_0 - \Phi \propto \sqrt{c_s}\; e^{-\int \kappa_d \, d\eta_s}\, e^{\pm i k \eta_s} . \tag{32.80}
\]


The dissipative factor $e^{-\int \kappa_d \, d\eta_s}$ involves an integral of the dissipative damping coefficient over the sound horizon distance, which may be written

\[
\int \kappa_d \, d\eta_s = \frac{k^2}{k_d^2} , \tag{32.81}
\]

where $k_d^{-1}$ is the damping scale defined by, from equation (32.79) along with the definition (32.52) of $\eta_s$ and the relation (32.50) between $c_s$ and $R$,

\[
\frac{1}{k_d^2} \equiv \int \frac{c_s}{2|\dot\tau|}\left(\frac{16}{15} + \frac{R^2}{1+R}\right) d\eta_s = \int \frac{1}{6|\dot\tau|(1+R)}\left(\frac{16}{15} + \frac{R^2}{1+R}\right) d\eta . \tag{32.82}
\]

The damping scale $k_d^{-1}$ is roughly the geometric mean of the scattering mean free path $l_T$ and the horizon distance $\eta$, as might be expected for a random walk by increments $l_T$ over a time $\eta$,

\[
k_d^{-1} \sim \sqrt{l_T \, \eta} . \tag{32.83}
\]

The resulting dissipative damping factor is

\[
e^{-\int \kappa_d \, d\eta_s} = e^{-k^2/k_d^2} . \tag{32.84}
\]


Thus the effect of dissipation is to damp temperature fluctuations exponentially at scales smaller than the diffusion scale $k_d^{-1}$. The diffusion scale $k_d^{-1}$ is evaluated in Exercise 32.6.
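The WKB form of the damped wave can be tested numerically in the simplest setting. The sketch below is an illustration under stated assumptions, not the computation used in the text: it takes the homogeneous equation $y'' + 2\kappa y' + k^2 y = 0$ with a constant damping coefficient $\kappa \ll k$ (so that the adiabatic part is absent and $\int\kappa\,d\eta_s = \kappa\eta_s$), integrates it with a fourth-order Runge-Kutta step, and compares with the WKB expression $e^{-\kappa x}\cos kx$. All function names are hypothetical.

```python
import math


def integrate_damped_wave(k, kappa, x_end, h=1.0e-3):
    """RK4 integration of y'' + 2*kappa*y' + k^2*y = 0 with y(0)=1, y'(0)=-kappa."""
    def deriv(y, v):
        # First-order system: y' = v, v' = -2*kappa*v - k^2*y.
        return v, -2.0 * kappa * v - k * k * y

    y, v = 1.0, -kappa
    for _ in range(int(round(x_end / h))):
        k1y, k1v = deriv(y, v)
        k2y, k2v = deriv(y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = deriv(y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = deriv(y + h * k3y, v + h * k3v)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return y


def wkb_damped_wave(k, kappa, x):
    """WKB approximation: damping envelope times an oscillation at wavenumber k."""
    return math.exp(-kappa * x) * math.cos(k * x)


if __name__ == "__main__":
    k, kappa, x_end = 50.0, 0.5, 2.0
    print(integrate_damped_wave(k, kappa, x_end), wkb_damped_wave(k, kappa, x_end))
```

For $\kappa/k = 0.01$ the two values agree closely; the residual difference comes from the small shift of the true oscillation frequency, $\sqrt{k^2 - \kappa^2}$ rather than $k$, which the WKB form neglects.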
The driving potential on the right hand side of the wave equation (32.78) causes $\Theta_0 - \Phi$ to oscillate not around zero, but rather around the offset $-[(1+R)\Psi + \Phi]$. At scales well inside the sound horizon, $k\eta_s \gg 1$, this driving potential also varies slowly compared to the wave frequency. To the extent that the driving potential is slowly varying, the complete solution of the inhomogeneous wave equation (32.78) well inside the horizon is

\[
\Theta_0 + (1+R)\Psi \propto \sqrt{c_s}\; e^{-k^2/k_d^2}\, e^{\pm i k \eta_s} . \tag{32.85}
\]


As will be seen in Chapter 34, equation (34.17), the monopole contribution to CMB fluctuations is not the photon monopole $\Theta_0$ by itself, but rather $\Theta_0 + \Psi$, which is the monopole redshifted by the potential $\Psi$. This redshifted monopole is

\[
\Theta_0 + \Psi = -R\Psi + A\sqrt{c_s}\; e^{-k^2/k_d^2}\, e^{\pm i k \eta_s} , \tag{32.86}
\]

with some constant amplitude $A$. Thus the redshifted monopole $\Theta_0 + \Psi$ oscillates about the offset $-R\Psi$. Physically, the gravity of baryons enhances sound wave compressions while weakening rarefactions. The offset of the redshifted temperature monopole translates into an amplification of compression (odd) peaks in the CMB, and a weakening of rarefaction (even) peaks, as is observed.
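The alternation of peak heights follows directly from an oscillation about a nonzero offset. The sketch below is a toy illustration: it drops the damping and $\sqrt{c_s}$ factors (absorbing them into the amplitude), takes the real cosine mode, and assumes that the amplitude $A$ has the same sign as $\Psi$, as would be the case for a mode that starts in a potential well. These simplifications and the function names are assumptions made for the example.

```python
import math


def redshifted_monopole(phase, R, Psi, A):
    """Toy form of Theta_0 + Psi = -R*Psi + A*cos(phase), phase = k * eta_s."""
    return -R * Psi + A * math.cos(phase)


def peak_heights(R, Psi, A, n_peaks=4):
    """Absolute value of the redshifted monopole at the extrema phase = n*pi."""
    return [abs(redshifted_monopole(n * math.pi, R, Psi, A))
            for n in range(1, n_peaks + 1)]


if __name__ == "__main__":
    # Potential well (Psi < 0) with baryon loading R = 0.6.
    for n, height in enumerate(peak_heights(R=0.6, Psi=-1.0, A=-1.0), start=1):
        print(n, height)
```

With these inputs the odd extrema have magnitude $|A| + R|\Psi|$ and the even extrema $|A| - R|\Psi|$, so the contrast between odd and even peaks grows with the baryon loading $R$.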


Exercise 32.5. Behaviour of radiation in the presence of damping. Confirm that, for $\kappa \ll k$, the homogeneous solutions of equation (32.78) are approximately $\Theta_0 - \Phi \propto e^{-\int \kappa \, d\eta_s \pm i k \eta_s}$. Hence find the retarded Green’s function, and write down the general solution to equation (32.78). Convince yourself that $\Theta_0 - \Phi$ is a decaying wave that oscillates around $-[(1+R)\Psi + \Phi]$.

Solution. The general solution of equation (32.78) is, with $y \equiv k\eta_s$ and $\beta \equiv \int \kappa \, d\eta_s$,

\[
\Theta_0(y) - \Phi(y) = e^{-\beta}\left(A_0 \cos y + A_1 \sin y\right) - \int_0^y \left\{[1 + R(y')]\Psi(y') + \Phi(y')\right\} e^{-(\beta - \beta')} \sin(y - y') \, dy' , \tag{32.87}
\]

where $A_0$ and $A_1$ are constants.


Exercise 32.6. Diffusion scale. Show that the dimensionless ratio of the damping scale $k_d^{-1}$ defined by (32.82) to the comoving Hubble distance $c/(a_{\rm eq}H_{\rm eq})$ at matter-radiation equality is given by


nee _ 8V2rGmy On A (a/acq)? & R? 
0 Xe 


E d(a/aeq) - 32.88 
cek? 9cor f+Heq Ob 1+ (a/aeq)(1 + R) 15 =z) ( / a) ( ) 


If hydrogen is taken to be fully ionized and helium neutral, which is a reasonable approximation in the run-up 
to recombination, then Xe = fy. For constant Xe, the integral on the right hand side of equation (32.88) 
can be done analytically. With a normalized to aeq = 1, 


16 (1 4 R) + FR? a ad 
gee ee dax f a (32.89) 
0 


FOS aa ay Ek 
The last approximation is correct to order unity for any $a$. Conclude that, neglecting the effect of recombination on the electron fraction $X_e$,


aegHea  6.83h-* Hy Om 
ake fa fa Heq Qp 


the final value being the approximate value at recombination. 


f(a/aeq) = 6 x 1074 f (a/aeq) = 0.0035 , (32.90) 


32.11 Neutrinos 


Before electron-positron annihilation at temperature $T \sim 1$ MeV, weak interactions were fast enough that scattering between neutrinos, antineutrinos, electrons, and positrons kept neutrinos and antineutrinos in thermodynamic equilibrium with baryons. After $e\bar e$ annihilation, neutrinos and antineutrinos decoupled, rather like photons decoupled at recombination. After decoupling, neutrinos streamed freely. In Exercise 32.7 you will show that, in an approximation developed in §34.6.2, the effective sound speed in neutrinos was about the speed of light, in contrast to photons where collisional isotropization leads to a sound speed about $1/\sqrt{3}$ the speed of light.


Exercise 32.7. Generic behaviour of neutrinos. Insert the approximate value (34.50) of the neutrino quadrupole $\mathcal{N}_2$ into the neutrino energy and momentum conservation equations (32.10) to obtain the differential equation

\[
\left(\frac{d^2}{d\eta^2} + \frac{2}{\eta}\,\frac{d}{d\eta} + k^2\right)\left(\mathcal{N}_0 - \Phi\right) = -k^2\left(\Psi + \Phi\right) . \tag{32.91}
\]


What kind of equation is this? What are its solutions? Find the Green’s function solution driven by a
prescribed potential $\Psi + \Phi$, subject to the initial condition that $\mathcal{N}_0 - \Phi$ = ¢,. Convince yourself that $\mathcal{N}_0 - \Phi$
is a decaying wave that oscillates around $-(\Psi + \Phi)$. Exercise 35.8 generalizes this exercise to the case of
vector and tensor fluctuations.

Solution. The Green’s function solution of equation (32.91) is with y = kn, 


/ 


Sh f wy) ou sinty = Ea. oe) 


No -®=4, 
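A quick numerical check helps with the question of what kind of equation (32.91) is. The sketch below assumes that, in the variable $y = k\eta$, the homogeneous part of (32.91) takes the form $f'' + (2/y) f' + f = 0$, the spherical Bessel equation of order zero, and confirms by finite differences that $\sin y / y$ and $\cos y / y$ satisfy it while a plain sine does not. The function names are illustrative.

```python
import math


def j0(y):
    """Regular solution sin(y)/y (spherical Bessel function of order zero)."""
    return math.sin(y) / y


def y0(y):
    """Irregular solution -cos(y)/y (spherical Neumann function of order zero)."""
    return -math.cos(y) / y


def residual(f, y, h=1.0e-3):
    """Central-difference estimate of f'' + (2/y) f' + f at the point y."""
    d1 = (f(y + h) - f(y - h)) / (2.0 * h)
    d2 = (f(y + h) - 2.0 * f(y) + f(y - h)) / (h * h)
    return d2 + 2.0 * d1 / y + f(y)


if __name__ == "__main__":
    for name, f in (("sin(y)/y", j0), ("-cos(y)/y", y0), ("sin(y)", math.sin)):
        print(name, residual(f, 3.0))
```

Both Bessel-type solutions carry a $1/y$ envelope, which is the decay referred to in the exercise: free-streaming spreads the neutrino monopole into higher multipoles, so the monopole oscillates with an amplitude that falls as $1/(k\eta)$.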
Cosmological perturbations: Boltzmann treatment


Chapters 30 and 32 treated cosmological perturbations in the approximations that matter and radiation behaved as respectively perfect and imperfect fluids. The fluid approximation truncates the momentum distribution at the quadrupole momentum moment. However, higher order multipole moments of the photon distribution become important near recombination, and a fully satisfactory treatment of the CMB requires following these moments. The evolution of the complete set of multipole moments is governed by the collisional Boltzmann equation.

Figure 33.1 (Left) Overdensities $\delta - 3\Phi$, and (right) bulk velocities $v$ in a Boltzmann treatment as a function of cosmic scale factor $a/a_{\rm eq}$, at wavenumber $k/(a_{\rm eq}H_{\rm eq}) = 10$, for non-baryonic dark matter ($c$), baryons ($b$), photons ($\gamma$), and neutrinos ($\nu$). The cosmological model is the standard model adopted in this book, a flat $\Lambda$CDM model with concordance parameters $\Omega_\Lambda = 0.69$ and $\Omega_m = 0.31$ and adiabatic initial conditions, §32.3. The overdensities and velocities of relativistic species are related to their monopole and dipole moments by $\delta_\gamma - 3\Phi = 3(\Theta_0 - \Phi)$, $\delta_\nu - 3\Phi = 3(\mathcal{N}_0 - \Phi)$, $v_\gamma = 3\Theta_1$, $v_\nu = 3\mathcal{N}_1$. The computation shown here includes photon and neutrino multipoles up to $\ell_{\max} = 32$. Compare these results to the simple and hydrodynamic computations, Figures 30.2 and 32.1.

Figure 33.2 (Left) Photon multipoles up to $\ell = 6$, and (right) neutrino multipoles up to $\ell = 32$, as a function of cosmic scale factor $a/a_{\rm eq}$, at wavenumber $k/(a_{\rm eq}H_{\rm eq}) = 10$. The cosmological model is the same as in Figure 33.1, §32.3. The thick (black) line shows $-\Psi - \Phi$, about which the photon and neutrino monopoles $\Theta_0 - \Phi$ and $\mathcal{N}_0 - \Phi$ mostly oscillate (except near recombination, where the photon monopole $\Theta_0 - \Phi$ oscillates about $-(1+R)\Psi - \Phi$). The computation includes photon and neutrino multipoles up to $\ell_{\max} = 32$. The unphysical jitter in the modes for $a/a_{\rm eq} \gtrsim 10$ is a symptom of the computation ceasing to be reliable once multipoles higher than those computed become significant. The multipoles may be compared to those in the hydrodynamic approximation, Figure 32.2.


A Boltzmann treatment is needed in any case to determine how the Boltzmann equations should best be 
truncated to give the hydrodynamic treatment of Chapter 32. The purpose of the present Chapter is to give 
an account of the Boltzmann equation as it applies to cosmological perturbation theory. 


Figure 33.1 shows the overdensity and bulk velocity of the 4 species, non-baryonic dark matter, baryons, photons, and neutrinos, calculated in the Boltzmann treatment of this Chapter, as a function of cosmic scale factor, at an illustrative wavenumber $k/(a_{\rm eq}H_{\rm eq}) = 10$. Figure 33.2 shows photon multipoles up to $\ell = 6$ and neutrino multipoles up to $\ell = 32$ in a Boltzmann computation that includes multipoles up to $\ell_{\max} = 32$ for both photons and neutrinos.

The Boltzmann treatment uses the Boltzmann hierarchy of equations to follow the evolution of multipole moments of relativistic species, photons and neutrinos, up to some maximum harmonic $\ell_{\max} - 1$. The hierarchy is truncated by invoking a suitable approximation for the $\ell_{\max}$-th harmonic. The Boltzmann treatment yields the hydrodynamic approximation, §32.2, when $\ell_{\max} = 2$.

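In the Boltzmann treatment, the equations for non-baryonic cold dark matter ($c$) and baryons ($b$) are the same as those in the hydrodynamic approximation, equations (32.6) and (32.7),

\[ \dot\delta_c - k v_c - 3\dot\Phi = 0 , \tag{33.1a} \]
\[ \dot v_c + \frac{\dot a}{a}\, v_c + k\Psi = 0 , \tag{33.1b} \]

and

\[ \dot\delta_b - k v_b - 3\dot\Phi = 0 , \tag{33.2a} \]
\[ \dot v_b + \frac{\dot a}{a}\, v_b + k\Psi = -\frac{|\dot\tau|}{R}\left(v_b - 3\Theta_1\right) . \tag{33.2b} \]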


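The equations for photons ($\gamma$) are given by the Boltzmann hierarchy (33.81),

\[ \dot\Theta_0 - k\Theta_1 - \dot\Phi = 0 , \tag{33.3a} \]
\[ \dot\Theta_1 + \frac{k}{3}\left(\Theta_0 - 2\Theta_2\right) + \frac{k}{3}\Psi = \frac{1}{3}|\dot\tau|\left(v_b - 3\Theta_1\right) , \tag{33.3b} \]
\[ \dot\Theta_2 + \frac{k}{5}\left(2\Theta_1 - 3\Theta_3\right) = -\frac{3}{4}|\dot\tau|\,\Theta_2 , \tag{33.3c} \]
\[ \dot\Theta_\ell + \frac{k}{2\ell+1}\left[\ell\,\Theta_{\ell-1} - (\ell+1)\,\Theta_{\ell+1}\right] = -|\dot\tau|\,\Theta_\ell \quad (\ell \geq 3) . \tag{33.3d} \]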

As commented after equations (33.81), the factor $3/4$ in equation (33.3c) includes the effect of polarization; without polarization, the factor is $9/10$. The $\ell_{\max}$-th harmonic $\Theta_{\ell_{\max}}$ may be approximated by an expression that interpolates between the tight-coupling limit $|\dot\tau| \gg k_s$, equation (33.83), and the free-streaming limit $|\dot\tau| \ll k_s$, equation (33.84),


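\[
\Theta_{\ell_{\max}} = \frac{1}{1 + |\dot\tau|^2/k_s^2}\left(\frac{|\dot\tau|^2}{k_s^2}\,\Theta^{\rm tight}_{\ell_{\max}} + \Theta^{\rm free}_{\ell_{\max}}\right) , \tag{33.4a}
\]
\[
\Theta^{\rm tight}_{\ell_{\max}} = -\left(1 + \tfrac{1}{3}\delta_{\ell_{\max} 2}\right)\frac{\ell_{\max}\, k}{(2\ell_{\max}+1)\,|\dot\tau|}\,\Theta_{\ell_{\max}-1} , \qquad
\Theta^{\rm free}_{\ell_{\max}} = -\left(\Theta_{\ell_{\max}-2} + \delta_{\ell_{\max} 2}\Psi\right) - \frac{2\ell_{\max}-1}{k\eta}\,\Theta_{\ell_{\max}-1} . \tag{33.4b}
\]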
Equations (33.4) reduce to the hydrodynamic approximation (32.9) when $\ell_{\max} = 2$. As in the hydrodynamic case, numerical experiment indicates that the interpolation constant $k_s$ is adequately approximated by $k_s \approx 2 a_{\rm rec}H_{\rm rec}$ (or $k_s \approx a_{\rm eq}H_{\rm eq}$, for standard $\Lambda$CDM cosmological parameters). The equations for neutrinos ($\nu$) are given by a Boltzmann hierarchy (33.91) which looks like that for photons, but without the scattering terms,

\[ \dot{\mathcal{N}}_0 - k\mathcal{N}_1 - \dot\Phi = 0 , \tag{33.5a} \]
\[ \dot{\mathcal{N}}_1 + \frac{k}{3}\left(\mathcal{N}_0 - 2\mathcal{N}_2\right) + \frac{k}{3}\Psi = 0 , \tag{33.5b} \]
\[ \dot{\mathcal{N}}_\ell + \frac{k}{2\ell+1}\left[\ell\,\mathcal{N}_{\ell-1} - (\ell+1)\,\mathcal{N}_{\ell+1}\right] = 0 \quad (\ell \geq 2) . \tag{33.5c} \]
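The $\ell_{\max}$-th harmonic $\mathcal{N}_{\ell_{\max}}$ may be approximated by, equation (33.92),

\[
\mathcal{N}_{\ell_{\max}} = -\left(\mathcal{N}_{\ell_{\max}-2} + \delta_{\ell_{\max} 2}\Psi\right) - \frac{2\ell_{\max}-1}{k\eta}\,\mathcal{N}_{\ell_{\max}-1} . \tag{33.6}
\]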
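The Einstein energy and quadrupole pressure equations are

\[ -k^2\Phi - 3\frac{\dot a}{a}F = 4\pi G a^2\left(\bar\rho_c\delta_c + \bar\rho_b\delta_b + 4\bar\rho_\gamma\Theta_0 + 4\bar\rho_\nu\mathcal{N}_0\right) , \tag{33.7a} \]
\[ k^2\left(\Psi - \Phi\right) = -32\pi G a^2\left(\bar\rho_\gamma\Theta_2 + \bar\rho_\nu\mathcal{N}_2\right) , \tag{33.7b} \]

where $F$ is defined by equation (30.56).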
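The structure of the neutrino hierarchy and its truncation can be illustrated with a short free-streaming test. The sketch below is not the code behind the figures: it assumes the hierarchy in the form $\dot{\mathcal N}_0 = k\mathcal N_1 + \dot\Phi$, $\dot{\mathcal N}_1 = -\tfrac{k}{3}(\mathcal N_0 - 2\mathcal N_2) - \tfrac{k}{3}\Psi$, $\dot{\mathcal N}_\ell = -\tfrac{k}{2\ell+1}[\ell\mathcal N_{\ell-1} - (\ell+1)\mathcal N_{\ell+1}]$, and it assumes the closure $\mathcal N_{\ell_{\max}} = -(\mathcal N_{\ell_{\max}-2} + \delta_{\ell_{\max}2}\Psi) - \frac{2\ell_{\max}-1}{k\eta}\mathcal N_{\ell_{\max}-1}$. With the potentials switched off, the exact solution in this normalization is $\mathcal N_\ell = (-1)^\ell j_\ell(k\eta)$, which the integration should reproduce. All function names and default parameters are hypothetical.

```python
import math


def sph_bessel_series(l, x, terms=25):
    """Spherical Bessel function j_l(x) from its power series (fine for small x)."""
    dfact = 1.0
    for j in range(1, 2 * l + 2, 2):  # builds (2l+1)!!
        dfact *= j
    term = x**l / dfact
    total = 0.0
    for m in range(terms):
        total += term
        term *= -(x * x / 2.0) / ((m + 1) * (2 * l + 2 * m + 3))
    return total


def neutrino_rhs(eta, N, k, Phi_dot=0.0, Psi=0.0):
    """Time derivatives of N = [N_0, ..., N_{lmax-1}]; requires len(N) >= 2."""
    lmax = len(N)
    # Assumed free-streaming closure for the lmax-th harmonic.
    source = N[lmax - 2] + (Psi if lmax == 2 else 0.0)
    N_top = -source - (2 * lmax - 1) / (k * eta) * N[lmax - 1]
    full = list(N) + [N_top]
    dN = [0.0] * lmax
    dN[0] = k * full[1] + Phi_dot
    dN[1] = -(k / 3.0) * (full[0] - 2.0 * full[2]) - (k / 3.0) * Psi
    for l in range(2, lmax):
        dN[l] = -(k / (2 * l + 1)) * (l * full[l - 1] - (l + 1) * full[l + 1])
    return dN


def rk4_step(eta, N, h, k):
    """One classical Runge-Kutta step of the potential-free hierarchy."""
    def shifted(base, slope, factor):
        return [b + factor * s for b, s in zip(base, slope)]

    k1 = neutrino_rhs(eta, N, k)
    k2 = neutrino_rhs(eta + 0.5 * h, shifted(N, k1, 0.5 * h), k)
    k3 = neutrino_rhs(eta + 0.5 * h, shifted(N, k2, 0.5 * h), k)
    k4 = neutrino_rhs(eta + h, shifted(N, k3, h), k)
    return [n + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for n, a, b, c, d in zip(N, k1, k2, k3, k4)]


def free_stream(k=1.0, eta0=0.1, eta1=5.0, lmax=8, h=0.005):
    """Evolve free-streaming multipoles from eta0 to eta1; returns (eta, N)."""
    N = [(-1) ** l * sph_bessel_series(l, k * eta0) for l in range(lmax)]
    eta = eta0
    for _ in range(int(round((eta1 - eta0) / h))):
        N = rk4_step(eta, N, h, k)
        eta += h
    return eta, N


if __name__ == "__main__":
    eta, N = free_stream()
    print(N[0], math.sin(eta) / eta)  # monopole against j_0(k*eta), with k = 1
```

The closure contributes a damping rate of order $\ell_{\max}/\eta$ to the highest retained multipole, so an explicit integrator needs a step small compared with $\eta_0/\ell_{\max}$ at the start of the run; that is why the example begins at $\eta_0 = 0.1$ rather than at zero.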
Figure 33.3 (Left) Overdensities $\delta_c - 3\Phi$ and $\delta_b - 3\Phi$ of non-baryonic dark matter (brown) and baryonic matter (green), and (right) radiation monopole $\Theta_0 - \Phi$ (blue), and minus the sum of the scalar potentials, $-(\Psi + \Phi)$ (black), as a function of cosmic scale factor $a$. Curves are labelled with the comoving wavenumber $k/(a_{\rm eq}H_{\rm eq})$ in units of the Hubble distance at matter-radiation equality. The cosmological model is as in §32.3. Compare these results to the simple and hydrodynamic computations, Figures 30.1 and 32.3.
Exercise 33.1. Program the Boltzmann equations. Upgrade the code you wrote in Exercise 32.2 to implement the system of Boltzmann equations (32.6)–(32.13). Initial conditions for neutrinos, and for the two scalar potentials $\Psi$ and $\Phi$, are derived in Exercise 33.5. Explore the evolution of the 2 scalar potentials and of the 4 species of mass-energy, non-baryonic dark matter, baryons, photons, and neutrinos.

Solution. See Figures 33.1, 33.2, 33.3 and 33.4. The computations here included photon and neutrino multipoles up to $\ell_{\max} = 32$.
Figure 33.4 Difference $\Psi - \Phi$ in scalar potentials as a function of cosmic scale factor $a$. The cosmological model is the same as in Figure 33.1, §32.3. Curves are labelled with the wavenumber $k/(a_{\rm eq}H_{\rm eq})$ in units of the Hubble distance at matter-radiation equality. The difference $\Psi - \Phi$ is sourced principally by neutrino anisotropy before recombination, and by photon and neutrino anisotropy after recombination. The computation includes photon and neutrino multipoles up to $\ell_{\max} = 32$.


Exercise 33.2. Power spectrum of matter fluctuations: Boltzmann treatment. Upgrade the code 
you wrote in Exercise 32.3 to compute the power spectrum of matter fluctuations in a Boltzmann computa- 
tion. 


Solution. See Figure 33.5. 
Figure 33.5 Model matter power spectrum computed from a Boltzmann computation, compared to observations from the North (N) and South (S) Galactic Caps of the Sloan Digital Sky Survey IV (Gil-Marín et al., 2020). The cosmological model is the same as in Figure 33.1, §32.3. The model power spectrum may be compared to those computed in the simple and hydrodynamic approximations, Figure 30.15 and Figure 32.4. The thin (pink) line is the model power spectrum in the hydrodynamic approximation.


33.2 Boltzmann equation in a perturbed FLRW geometry 


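The Boltzmann equation was introduced in §31.5. The left hand side of the Boltzmann equation (31.32) is, for either massless or massive particles,

\[
\frac{df}{d\lambda} = p^m \partial_m f + \frac{dp^a}{d\lambda}\,\frac{\partial f}{\partial p^a} = p^m \partial_m f + \frac{dp}{d\lambda}\,\frac{\partial f}{\partial p} + \frac{d\hat{\boldsymbol p}}{d\lambda}\cdot\frac{\partial f}{\partial \hat{\boldsymbol p}} . \tag{33.8}
\]

Here $\lambda$ is an affine parameter along the worldline of a particle, and $p^m \equiv \{E, \boldsymbol p\}$ is the tetrad-frame momentum of the particle. Both $d\hat{\boldsymbol p}/d\lambda$ and $\partial f/\partial\hat{\boldsymbol p}$ vanish in the unperturbed background, so $d\hat{\boldsymbol p}/d\lambda \cdot \partial f/\partial\hat{\boldsymbol p}$ is of second order, and can be neglected to linear order, so that

\[
\frac{df}{d\lambda} = p^0 \partial_0 f + p^a \partial_a f + \frac{dp}{d\lambda}\,\frac{\partial f}{\partial p} . \tag{33.9}
\]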
The expression (33.9) for the left hand side $df/d\lambda$ of the Boltzmann equation involves $dp/d\lambda$, which in free-fall is determined by the usual geodesic equation


Since $E^2 - p^2 = m^2$, it follows that the equation of motion for the magnitude $p$ of the tetrad-frame momentum is related to the equation of motion for the tetrad-frame energy $E$ by

\[
\frac{dp}{d\lambda} = \frac{E}{p}\,\frac{dE}{d\lambda} . \tag{33.11}
\]

The equation of motion for the tetrad-frame energy $E \equiv p^0$ is

\[
\frac{dE}{d\lambda} = -\Gamma^0{}_{mn}\, p^m p^n = \Gamma_{0a0}\, p^a E + \Gamma_{0ab}\, p^a p^b . \tag{33.12}
\]

From this it follows that

\[
\frac{d\ln p}{d\lambda} = \frac{E}{p^2}\,\frac{dE}{d\lambda} = E\left(\frac{E}{p}\,\hat p^a\,\Gamma_{0a0} + \hat p^a \hat p^b\,\Gamma_{0ab}\right) = E\left(-\frac{\dot a}{a^2} + \frac{E}{p}\,\hat p^a\,\Gamma_{0a0} + \hat p^a \hat p^b\,\overset{1}{\Gamma}_{0ab}\right) , \tag{33.13}
\]

where in the last expression the tetrad connection $\Gamma_{0ab}$, equation (29.24b), has been separated into its unperturbed and perturbed parts $-(\dot a/a^2)\,\delta_{ab}$ and $\overset{1}{\Gamma}_{0ab}$.

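In practice, the integration variable used to evolve equations is the conformal time $\eta$, not the affine parameter $\lambda$. The relation between conformal time $\eta$ and affine parameter $\lambda$ is

\[
\frac{d\eta}{d\lambda} = p^\eta = e_m{}^\eta\, p^m = \left(\delta_m^n + \varphi_m{}^n\right)\overset{0}{e}_n{}^\eta\, p^m = \frac{1}{a}\left[E\left(1 - \varphi_{00}\right) - p^a\varphi_{a0}\right] , \tag{33.14}
\]

whose reciprocal is to linear order

\[
\frac{d\lambda}{d\eta} = \frac{a}{E}\left(1 + \varphi_{00} + \frac{p^a}{E}\,\varphi_{a0}\right) . \tag{33.15}
\]

With conformal time $\eta$ as the integration variable, the equation of motion (33.13) for the magnitude $p$ of the tetrad-frame momentum becomes, to linear order,

\[
\frac{d\ln p}{d\eta} = -\frac{\dot a}{a}\left(1 + \varphi_{00} + \frac{p^a}{E}\,\varphi_{a0}\right) + \frac{E}{p}\,\hat p^a\, a\,\Gamma_{0a0} + \hat p^a \hat p^b\, a\,\overset{1}{\Gamma}_{0ab} . \tag{33.16}
\]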
With the collision term restored, the Boltzmann equation (31.32) expressed with respect to conformal time $\eta$ is

\[
\frac{df}{d\eta} = \frac{\partial f}{\partial\eta} + v^a \nabla_a f + \frac{d\ln p}{d\eta}\,\frac{\partial f}{\partial\ln p} = \frac{d\lambda}{d\eta}\, C[f] , \tag{33.17}
\]

where $v^a \equiv p^a/E$ is the tetrad-frame particle velocity, and $d\lambda/d\eta$ and $d\ln p/d\eta$ are given by equations (33.15) and (33.16). Expressions for $d\lambda/d\eta$ and $d\ln p/d\eta$ in terms of the vierbein perturbations in a general gauge are left as Exercise 33.3. In conformal Newtonian gauge, the factor $d\lambda/d\eta$, equation (33.15), is

\[
\frac{d\lambda}{d\eta} = \frac{a}{E}\left(1 + \Psi\right) . \tag{33.18}
\]

In conformal Newtonian gauge, and including only scalar fluctuations, the factor $d\ln p/d\eta$, equation (33.16), is

\[
\frac{d\ln p}{d\eta} = -\frac{\dot a}{a} + \dot\Phi - \frac{E}{p}\,\hat p^a \nabla_a \Psi . \tag{33.19}
\]


To unperturbed order, the Boltzmann equation (33.17) is

\[
\frac{d\overset{0}{f}}{d\eta} = \frac{\partial\overset{0}{f}}{\partial\eta} - \frac{\dot a}{a}\,\frac{\partial\overset{0}{f}}{\partial\ln p} = \frac{a}{E}\, C[\overset{0}{f}] , \tag{33.20}
\]

where $C[\overset{0}{f}]$ is the unperturbed collision term, the factor $a/E$ coming from $d\lambda/d\eta = a/E$ to unperturbed order, equation (33.15). The second term in the middle expression of equation (33.20) reflects the fact that the tetrad-frame momentum $p$ redshifts as $p \propto 1/a$ as the Universe expands, a statement that is true for both massive and massless particles, equation (10.68).

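Subtracting off the unperturbed part (33.20) of the Boltzmann equation (33.17) gives the perturbation of the Boltzmann equation

\[
\frac{\partial\overset{1}{f}}{\partial\eta} + v^a \nabla_a \overset{1}{f} - \frac{\dot a}{a}\,\frac{\partial\overset{1}{f}}{\partial\ln p} + G\,\frac{\partial\overset{0}{f}}{\partial\ln p} = \frac{a}{E}\, C[\overset{1}{f}] + \overset{1}{\frac{d\lambda}{d\eta}}\, C[\overset{0}{f}] , \tag{33.21}
\]

where $d\ln p/d\eta$ multiplying $\partial\overset{1}{f}/\partial\ln p$ has been replaced by $-\dot a/a$ to linear order, equation (33.16), and $G$ (not to be confused with the Einstein tensor) defined by

\[
G \equiv \frac{d\ln(ap)}{d\eta} \tag{33.22}
\]

expresses the peculiar gravitational redshifting of particles. In conformal Newtonian gauge, and including only scalar fluctuations, the gravitational redshift term $G$ is

\[
G = \dot\Phi - \frac{E}{p}\,\hat p^a \nabla_a \Psi . \tag{33.23}
\]

In conformal Newtonian gauge, the perturbed part of $d\lambda/d\eta$ which appears on the right hand side of the Boltzmann equation (33.21) is

\[
\overset{1}{\frac{d\lambda}{d\eta}} = \frac{a}{E}\,\Psi . \tag{33.24}
\]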
Exercise 33.3. Boltzmann equation factors in a general gauge. Show that in a general gauge, and
including not just scalar but also vector and tensor fluctuations, equation (33.15) is
dX a 


De a ae 
T Í +Y += (Vaŭ + a) ; (33.25) 


E 


while equation (33.16) yields the gravitational redshift term 


_din(ap)_ , | Ep* (9 am CAR 
G= a =ġ4 i | Vay +4 (in (VawW + Wa) 


+ pp? [- VaVo(w — h) — 4 (Va Wo + VoWa) + Voia + has] ' (33.26) 


33.3 Non-baryonic cold dark matter 


Non-baryonic cold dark matter is by assumption non-relativistic and collisionless. The unperturbed mean density is $\bar\rho_c$, which evolves with cosmic scale factor $a$ as

\[
\bar\rho_c \propto a^{-3} . \tag{33.27}
\]

Since dark matter particles are non-relativistic, the energy of a dark matter particle is its rest-mass energy, $E_c = m_c$, and its momentum is the non-relativistic momentum $p_c^a = m_c v_c^a$.

The energy-momentum tensor T” of the dark matter is obtained from integrals over the dark matter 
phase-space distribution fe, equation (10.121). The energy and momentum moments of the distribution 
define the dark matter overdensity 6, and bulk velocity ve, while the pressure is of order v2, and can be 
neglected to linear order (note the different fonts for particle velocity v and bulk velocity v), 


edp 
7700 = JR Me oy = (L+ ô.) , (33.28a) 
Dede = 
ian Be ee =pevi , .28b 
ab — ab Jepe _ 
TC = [if Me Ve Ve (nh)? = 0. (33.28c) 


Non-baryonic cold dark matter is collisionless, so the collision term in the Boltzmann equation is zero, $C[f_c] = 0$, and the dark matter satisfies the collisionless Boltzmann equation

\[
\frac{df_c}{d\eta} = 0 . \tag{33.29}
\]

The energy and momentum moments of the Boltzmann equation (33.17) yield equations for the overdensity $\delta_c$ and bulk velocity $\boldsymbol v_c$, which in the conformal Newtonian gauge are


_ [ He Ic dpo _ zf Jo dpe J a Je Pe I å a of Je A De 
=] dn ve (27h)? feme (2rħ)3 © Ve a (27h) a = dnp ° (27h) 
(1 _ 
„Ee va (Pe ve) +3(2-4) oe (33.30a) 
_ fhe... GeO de yè ae ayb I dpe 
= dn MeVe (2rh)3 = 5 f imag cU, Th AVe | temov © (anh)3 
a. Ep Of a Ge po 
i (¢ + A Pow) Bing MeV (nhs 
OpcVe his pt 
= +4(--6) pv +p.Vab. (33.30b) 
On a 


The $\dot\Phi\,\bar\rho_c v_c^a$ term on the last line of equation (33.30b) can be dropped, since the potential $\Phi$ and the bulk velocity $v_c^a$ are both of first order, so their product is of second order. Subtracting the unperturbed part from equations (33.30a) and (33.30b) gives equations for the dark matter overdensity $\delta_c$ and bulk velocity $\boldsymbol v_c$,

\[ \dot\delta_c + \nabla\cdot\boldsymbol v_c - 3\dot\Phi = 0 , \tag{33.31a} \]
\[ \dot{\boldsymbol v}_c + \frac{\dot a}{a}\,\boldsymbol v_c + \nabla\Psi = 0 . \tag{33.31b} \]

Transformed into Fourier space, and decomposed into scalar $v_c$ and vector $\boldsymbol v_{c,\perp}$ parts, the velocity 3-vector $\boldsymbol v_c$ is

\[ \boldsymbol v_c = -i\hat{\boldsymbol k}\, v_c + \boldsymbol v_{c,\perp} . \tag{33.32} \]

For the scalar modes under consideration, only the scalar part of the dark matter equations (33.31) is relevant:

\[ \dot\delta_c - k v_c - 3\dot\Phi = 0 , \tag{33.33a} \]
\[ \dot v_c + \frac{\dot a}{a}\, v_c + k\Psi = 0 . \tag{33.33b} \]

Equations (33.33) reproduce the equations (30.53) derived previously from conservation of energy and momentum.
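The scalar dark matter equations are simple enough to integrate in a few lines. The sketch below is an illustration under assumptions that are not part of the derivation above: a matter-dominated background with $a \propto \eta^2$ (so $\dot a/a = 2/\eta$), constant potentials ($\dot\Phi = 0$), and growing-mode initial data $v_c = -k\Psi\eta/3$, $\delta_c = -k^2\Psi\eta^2/6$. Under these assumptions the equations $\dot\delta_c = k v_c$ and $\dot v_c = -(2/\eta) v_c - k\Psi$ give $\delta_c \propto \eta^2 \propto a$. Function and variable names are hypothetical.

```python
def evolve_cdm(k, Psi, eta0, eta1, steps=1000):
    """RK4 integration of the scalar dark matter equations for a ~ eta^2, constant Psi."""
    def rhs(eta, delta, v):
        # delta' = k v (Phi constant), v' = -(a'/a) v - k Psi with a'/a = 2/eta.
        return k * v, -(2.0 / eta) * v - k * Psi

    h = (eta1 - eta0) / steps
    eta = eta0
    v = -k * Psi * eta0 / 3.0               # assumed growing-mode velocity
    delta = -k * k * Psi * eta0**2 / 6.0    # assumed growing-mode overdensity
    for _ in range(steps):
        k1d, k1v = rhs(eta, delta, v)
        k2d, k2v = rhs(eta + 0.5 * h, delta + 0.5 * h * k1d, v + 0.5 * h * k1v)
        k3d, k3v = rhs(eta + 0.5 * h, delta + 0.5 * h * k2d, v + 0.5 * h * k2v)
        k4d, k4v = rhs(eta + h, delta + h * k3d, v + h * k3v)
        delta += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        eta += h
    return delta, v


if __name__ == "__main__":
    k, Psi = 1.0, -1.0e-5
    delta1, v1 = evolve_cdm(k, Psi, eta0=1.0, eta1=10.0)
    delta0 = -k * k * Psi / 6.0
    print(delta1 / delta0)  # growth factor between eta = 1 and eta = 10
```

A tenfold increase in conformal time corresponds to a hundredfold increase in scale factor in this background, and the overdensity grows by the same factor, which is the familiar linear growth $\delta_c \propto a$ in a matter-dominated universe.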


Exercise 33.4. Moments of the non-baryonic cold dark matter Boltzmann equation. Confirm 
equations (33.30). 
The full hierarchy of Boltzmann equations is needed only to describe relativistic species (photons and neutrinos); non-relativistic species (non-baryonic dark matter and baryons) are described adequately by the equations of conservation of energy and momentum. Cosmological fluctuations in relativistic species are commonly characterized in terms of a temperature fluctuation $\Theta$,

\[
\Theta(\eta, \boldsymbol x, \boldsymbol p) \equiv \frac{\delta T(\eta, \boldsymbol x, \boldsymbol p)}{T(\eta)} . \tag{33.34}
\]

In most of this book $\Theta$ refers to photons, but in this section the temperature fluctuation $\Theta$ refers to any species, bosonic or fermionic, massless or massive (neutrinos have small masses, §42.4.15). At early times, collisions drove the occupation number $f$ into thermodynamic equilibrium at each comoving position $\boldsymbol x$, so that initially the temperature fluctuation was a function $\Theta(\eta, \boldsymbol x)$ only of time and position, not of particle momentum $\boldsymbol p$. This explains why the preferred fluctuation variable is the temperature fluctuation $\Theta$, and not the perturbation $\overset{1}{f}$ of the occupation number; the latter depends on momentum $p$ even in thermodynamic equilibrium. As collisions peter out, around recombination in the case of photons, and around $e\bar e$-annihilation in the case of neutrinos, free-streaming allows the temperature fluctuation $\Theta$ to become anisotropic.
For a relativistic species, the unperturbed occupation number in thermodynamic equilibrium is

\[
\overset{0}{f} = \frac{1}{e^{p/T} \mp 1} , \tag{33.35}
\]

where the sign is $-$ for bosons (photons) and $+$ for fermions (neutrinos). The unperturbed Boltzmann equation (33.20) can be recast as an equation for the background temperature $T(\eta)$,

\[
\frac{d\ln(aT)}{d\eta} = \frac{a}{E}\, C[\overset{0}{f}] \bigg/ \frac{\partial\overset{0}{f}}{\partial\ln T} , \tag{33.36}
\]

where it follows from equation (33.35) that (the partial derivative with respect to temperature $\partial/\partial\ln T$ is at constant momentum $p$)

\[
\frac{\partial\overset{0}{f}}{\partial\ln T} = \frac{p}{T}\,\overset{0}{f}\left(1 \pm \overset{0}{f}\right) , \tag{33.37}
\]

with $+$ and $-$ for bosons and fermions respectively. Equation (33.36) shows that if the collision term $C[\overset{0}{f}]$ vanishes, then the background temperature redshifts as $T \propto a^{-1}$. In practice, the collision term $C[\overset{0}{f}]$ was negligible for both photons and neutrinos since the end of $e\bar e$-annihilation. Photons continued to exchange energy with electrons and baryons, but the effect on the photons was negligible because they overwhelmingly outnumbered electrons and baryons, equation (10.103). Although the heating term $d\ln(aT)/d\eta$ is negligible in the situation at hand, it is retained temporarily for completeness in the next paragraph.
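The derivative identity for the equilibrium occupation number is easy to confirm numerically. The sketch below assumes the Bose-Einstein and Fermi-Dirac forms $f = 1/(e^{p/T} \mp 1)$ with zero chemical potential and checks, by central differences, that $\partial f/\partial\ln T = (p/T)\, f (1 \pm f)$ and that $\partial f/\partial\ln p = -\partial f/\partial\ln T$. Function names are illustrative.

```python
import math


def occupation(p, T, boson=True):
    """Equilibrium occupation number 1/(exp(p/T) -/+ 1), zero chemical potential."""
    sign = -1.0 if boson else 1.0
    return 1.0 / (math.exp(p / T) + sign)


def dlnT_numeric(p, T, boson=True, eps=1.0e-5):
    """Central difference of f with respect to ln T at fixed p."""
    up = occupation(p, T * math.exp(eps), boson)
    down = occupation(p, T * math.exp(-eps), boson)
    return (up - down) / (2.0 * eps)


def dlnp_numeric(p, T, boson=True, eps=1.0e-5):
    """Central difference of f with respect to ln p at fixed T."""
    up = occupation(p * math.exp(eps), T, boson)
    down = occupation(p * math.exp(-eps), T, boson)
    return (up - down) / (2.0 * eps)


def dlnT_analytic(p, T, boson=True):
    """Closed form (p/T) f (1 + f) for bosons, (p/T) f (1 - f) for fermions."""
    f = occupation(p, T, boson)
    return (p / T) * f * (1.0 + f if boson else 1.0 - f)


if __name__ == "__main__":
    for boson in (True, False):
        print(boson, dlnT_numeric(2.0, 1.0, boson), dlnT_analytic(2.0, 1.0, boson))
```

Because $f$ depends on $p$ and $T$ only through the ratio $p/T$, the two logarithmic derivatives are equal and opposite; this is the property that lets the momentum redshift term in the unperturbed Boltzmann equation be traded for a statement about the background temperature.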

The definition (33.34) of the temperature fluctuation $\Theta$ is to be interpreted as meaning that the perturbation to the occupation number is

\[
\overset{1}{f} = \frac{\partial\overset{0}{f}}{\partial\ln T}\,\Theta . \tag{33.38}
\]
Two of the terms on the left hand side of the perturbed Boltzmann equation (33.21) rearrange to


af dinp of af o+ (27 ð àa ) A 


T dn dnp ônT dn OlnT aðlnpj OlnT 


0 0 
= Of [a  din(aT) Oln(Of/OlnT) 
~— OlnT | 7 dn OlnT as (aaa 
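The collision terms on the right hand side of the perturbed Boltzmann equation (33.21) are

\[
\frac{a}{E}\, C[\overset{1}{f}] + \overset{1}{\frac{d\lambda}{d\eta}}\, C[\overset{0}{f}] = \frac{\partial\overset{0}{f}}{\partial\ln T}\left[C[\Theta] + \frac{d\ln(aT)}{d\eta}\,\frac{E}{a}\,\overset{1}{\frac{d\lambda}{d\eta}}\right] , \tag{33.40}
\]

where $C[\overset{0}{f}]$ has been eliminated in favour of $d\ln(aT)/d\eta$ using equation (33.36), and $C[\Theta]$ is the scaled collision term defined by

\[
C[\Theta] \equiv \frac{a}{E}\, C[\overset{1}{f}] \bigg/ \frac{\partial\overset{0}{f}}{\partial\ln T} . \tag{33.41}
\]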
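The perturbed Boltzmann equation (33.21) thus becomes

\[
\frac{d\Theta}{d\eta} = \frac{\partial\Theta}{\partial\eta} + v^a\nabla_a\Theta - G + \frac{d\ln(aT)}{d\eta}\,\frac{\partial\ln(\partial\overset{0}{f}/\partial\ln T)}{\partial\ln T}\,\Theta = C[\Theta] + \frac{d\ln(aT)}{d\eta}\,\frac{E}{a}\,\overset{1}{\frac{d\lambda}{d\eta}} , \tag{33.42}
\]

where the gravitational redshift term $G$ gets a minus sign from $\partial\overset{0}{f}/\partial\ln p = -\partial\overset{0}{f}/\partial\ln T$.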

In practice the heating terms proportional to $d\ln(aT)/d\eta$, though important during for example electron-positron annihilation, are negligible for both photons and neutrinos during the time before and through recombination when anisotropies in the CMB are developing. The Boltzmann equation (33.42) then reduces to

\[
\frac{d\Theta}{d\eta} = \frac{\partial\Theta}{\partial\eta} + v^a\nabla_a\Theta - G = C[\Theta] . \tag{33.43}
\]
As long as the particles are relativistic, the particle velocity is one, $v = 1$; but equation (33.43) allows for a general non-unit velocity $v$ to accommodate neutrinos, which have small masses.

Fourier transforming the Boltzmann equation (33.43) over spatial position $\boldsymbol x$ yields the Boltzmann equation for the Fourier components $\Theta(\eta, \boldsymbol k, \boldsymbol p)$ of the temperature fluctuation,

\[
\frac{d\Theta}{d\eta} = \frac{\partial\Theta}{\partial\eta} - i v k \mu\,\Theta - G = C[\Theta] , \tag{33.44}
\]

where $\mu$ is the cosine of the angle between the wavevector $\boldsymbol k$ and the photon momentum $\boldsymbol p$,

\[
\mu \equiv \hat{\boldsymbol k}\cdot\hat{\boldsymbol p} . \tag{33.45}
\]

In Fourier space, the gravitational redshift term $G$, in conformal Newtonian gauge and including only scalar fluctuations, is, equation (33.23),

\[
G = \dot\Phi + \frac{i k \mu}{v}\,\Psi . \tag{33.46}
\]
33.5 Spherical harmonics of the temperature fluctuation


It is natural to expand the (photon or neutrino) temperature fluctuation $\Theta$ in spherical harmonics. The various components of the energy-momentum tensor $T^{mn}$ are determined by the monopole, dipole, and quadrupole harmonics of the particle distribution. Scalar fluctuations are those that are rotationally symmetric about the wavevector direction $\hat{\boldsymbol k}$, which correspond to spherical harmonics with zero azimuthal quantum number, $m = 0$. Expanded in spherical harmonics $Y_{\ell m}(\hat{\boldsymbol p})$, and with only scalar terms retained, the temperature fluctuation $\Theta$ can be written

\[
\Theta(\eta, \boldsymbol k, \boldsymbol p) = \sum_{\ell=0}^{\infty} (-i)^\ell \sqrt{4\pi(2\ell+1)}\;\Theta_\ell(\eta, k, p)\, Y_{\ell 0}(\hat{\boldsymbol p}) = \sum_{\ell=0}^{\infty} (-i)^\ell (2\ell+1)\,\Theta_\ell(\eta, k, p)\, P_\ell(\hat{\boldsymbol k}\cdot\hat{\boldsymbol p}) , \tag{33.47}
\]

where $P_\ell$ are Legendre polynomials, §33.14. The choice of normalization of the scalar harmonics $\Theta_\ell$ is not the same as the traditional normalization $\Theta = \sum_\ell \Theta_{\ell 0} Y_{\ell 0}$, but is conventional in studies of the CMB. The factor of $(-i)^\ell$ makes $\Theta_\ell$ real, and the normalization factor removes square root factors in the Boltzmann hierarchy. The harmonics in the traditional and CMB conventions are related by $\Theta_{\ell 0} = (-i)^\ell \sqrt{4\pi(2\ell+1)}\;\Theta_\ell$. The scalar harmonics $\Theta_\ell$ are angular integrals of the temperature fluctuation $\Theta$ over momentum directions $\hat{\boldsymbol p}$,

\[
\Theta_\ell(\eta, k, p) = i^\ell \int \Theta(\eta, \boldsymbol k, \boldsymbol p)\, P_\ell(\hat{\boldsymbol k}\cdot\hat{\boldsymbol p})\, \frac{do_p}{4\pi} . \tag{33.48}
\]
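The projection onto scalar harmonics and the reconstruction from them can be exercised on a simple test function. The sketch below assumes the CMB-convention expansion $\Theta = \sum_\ell (-i)^\ell (2\ell+1)\Theta_\ell P_\ell(\mu)$ with inverse $\Theta_\ell = i^\ell \cdot \tfrac12\int_{-1}^{1}\Theta(\mu) P_\ell(\mu)\,d\mu$ for a fluctuation that depends on direction only through $\mu = \hat{\boldsymbol k}\cdot\hat{\boldsymbol p}$. The test function $e^{ix\mu}$, for which the harmonics are $(-1)^\ell j_\ell(x)$ and hence real, is chosen purely for illustration, as are the function names.

```python
import cmath
import math


def legendre(l, mu):
    """Legendre polynomial P_l(mu) by the standard three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, mu
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p


def harmonic(theta, l, n=2000):
    """Theta_l = i^l * (1/2) * integral of theta(mu) P_l(mu) dmu, by Simpson's rule."""
    h = 2.0 / n  # n must be even for Simpson's rule
    total = 0j
    for j in range(n + 1):
        mu = -1.0 + j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * theta(mu) * legendre(l, mu)
    return (1j ** l) * 0.5 * total * h / 3.0


def reconstruct(coeffs, mu):
    """Sum over l of (-i)^l (2l+1) Theta_l P_l(mu)."""
    return sum(((-1j) ** l) * (2 * l + 1) * c * legendre(l, mu)
               for l, c in enumerate(coeffs))


if __name__ == "__main__":
    x = 2.0
    theta = lambda mu: cmath.exp(1j * x * mu)
    coeffs = [harmonic(theta, l) for l in range(13)]
    print(coeffs[0], math.sin(x) / x)               # monopole against j_0(x)
    print(reconstruct(coeffs, 0.3), theta(0.3))     # round trip at mu = 0.3
```

The imaginary parts of the computed harmonics vanish to quadrature accuracy, which illustrates the remark that the factor $(-i)^\ell$ makes the scalar harmonics real.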


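Expanded into the scalar harmonics $\Theta_\ell(\eta, k, p)$, equation (33.47), the left hand side of the Boltzmann equation (33.44) in conformal Newtonian gauge is

\[ \frac{d\Theta_0}{d\eta} = \dot\Theta_0 - v k\,\Theta_1 - \dot\Phi , \tag{33.49a} \]
\[ \frac{d\Theta_1}{d\eta} = \dot\Theta_1 + \frac{v k}{3}\left(\Theta_0 - 2\Theta_2\right) + \frac{k}{3v}\,\Psi , \tag{33.49b} \]
\[ \frac{d\Theta_\ell}{d\eta} = \dot\Theta_\ell + \frac{v k}{2\ell+1}\left[\ell\,\Theta_{\ell-1} - (\ell+1)\,\Theta_{\ell+1}\right] \quad (\ell \geq 2) . \tag{33.49c} \]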


33.6 The Boltzmann equation for massless particles 


The Boltzmann equation (33.44) and its harmonic expansion (33.49) are valid for massive as well as massless 
particles, to allow for neutrino masses. The case of massive neutrinos will be resumed in §33.13; but for the 
next several sections, particles (photons and neutrinos) will be taken to be massless. 

Photons are massless, and neutrinos (probably) have small enough masses that they can be treated as massless through recombination. The velocities of massless particles are always one, $v = 1$. For massless particles, the Boltzmann equation for the temperature fluctuation $\Theta$ is equation (33.43) with $v = 1$. The left hand side of the Boltzmann equation expands in scalar harmonics $\Theta_\ell$ as equations (33.49), again with $v = 1$.

As remarked at the beginning of §33.4, at early times photons and neutrinos have distributions in thermodynamic equilibrium, as a result of which their initial temperature fluctuations $\Theta$ are independent of particle momentum $p$. For massless particles ($v = 1$) the left hand side of the Boltzmann equation (33.43) is independent of the magnitude $p$ of the particle momentum. As will be seen in §33.8, equation (33.68), Thomson scattering leaves the magnitude $p_\gamma$ of the photon momentum essentially unchanged. Consequently the temperature fluctuations $\Theta(\eta, \boldsymbol k, \boldsymbol p)$ of photons, and of neutrinos as long as they are relativistic, depend on the direction $\hat{\boldsymbol p}$ but not magnitude $p$ of the particle momentum,

\[
\Theta(\eta, \boldsymbol k, \boldsymbol p) = \Theta(\eta, \boldsymbol k, \hat{\boldsymbol p}) \quad \mbox{for photons and relativistic neutrinos} . \tag{33.50}
\]


33.7 Energy-momentum tensor for massless particles 


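Perturbations $\overset{1}{T}{}^{mn}$ to the energy-momentum tensor of particles involve integrals (10.121) over the perturbed occupation number $\overset{1}{f}$. For massless particles, these integrals take the form, where $F(\hat{\boldsymbol p})$ is some arbitrary function of the momentum direction $\hat{\boldsymbol p}$,

\[
\int \overset{1}{f}\, p^2 F(\hat{\boldsymbol p})\, \frac{g\, d^3p}{p\,(2\pi\hbar)^3} = \int \frac{\partial\overset{0}{f}}{\partial\ln T}\, p^2\, \frac{g\, 4\pi p^2 dp}{p\,(2\pi\hbar)^3} \int \Theta\, F(\hat{\boldsymbol p})\, \frac{do_p}{4\pi} = 4\bar\rho \int \Theta\, F(\hat{\boldsymbol p})\, \frac{do_p}{4\pi} , \tag{33.51}
\]

in which the last equality is true because

\[
\int \frac{\partial\overset{0}{f}}{\partial\ln T}\, p^2\, \frac{g\, 4\pi p^2 dp}{p\,(2\pi\hbar)^3} = 4 \int \overset{0}{f}\, p^2\, \frac{g\, 4\pi p^2 dp}{p\,(2\pi\hbar)^3} = 4\bar\rho , \tag{33.52}
\]

which follows from $\partial\overset{0}{f}/\partial\ln T = -\partial\overset{0}{f}/\partial\ln p$ and an integration by parts. The perturbation of the energy density, energy flux, monopole pressure, and quadrupole pressure of massless particles are then, with integrals over $\Theta$ converted to harmonics $\Theta_\ell$ using equations (33.48),

\[ \overset{1}{T}{}^{00} = 4\bar\rho\,\Theta_0 , \tag{33.53a} \]
\[ \hat k_a\, \overset{1}{T}{}^{0a} = -i\, 4\bar\rho\,\Theta_1 , \tag{33.53b} \]
\[ \tfrac{1}{3}\,\delta_{ab}\, \overset{1}{T}{}^{ab} = \tfrac{4}{3}\,\bar\rho\,\Theta_0 , \tag{33.53c} \]
\[ \left(\tfrac{3}{2}\,\hat k_a \hat k_b - \tfrac{1}{2}\,\delta_{ab}\right) \overset{1}{T}{}^{ab} = -4\bar\rho\,\Theta_2 . \tag{33.53d} \]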


33.8 Nonrelativistic electron-photon (Thomson) scattering 
The dominant process that couples photons and baryons is electron-photon scattering

\[
e + \gamma \leftrightarrow e' + \gamma' . \tag{33.54}
\]

The Lorentz-invariant mean amplitude squared for unpolarized non-relativistic electron-photon (Thomson) scattering from initial photon momentum $p_\gamma$ to final momentum $p_{\gamma'}$ of the same magnitude, $p_{\gamma'} = p_\gamma$, is, in units $c = \hbar = 1$, equation (??),

\[
\langle |\mathcal{M}|^2 \rangle = (8\pi\alpha)^2\, \frac{1 + (\hat{\boldsymbol p}_\gamma\cdot\hat{\boldsymbol p}_{\gamma'})^2}{2} , \tag{33.55}
\]

where $\alpha \equiv e^2/(\hbar c)$ is the fine-structure constant. The unpolarized mean amplitude squared (33.55) is the polarized amplitude squared (35.54) averaged over polarization states of the incoming photon (that’s what the adjective mean refers to here), and summed over polarization states of the scattered photon. The differential cross-section $d\sigma_T/do'$ for unpolarized Thomson scattering into an interval $do'$ of solid angle about scattered photon direction $\hat{\boldsymbol p}_{\gamma'}$ is related to the squared amplitude $\langle |\mathcal{M}|^2 \rangle$ by, equation (??),

\[
\frac{d\sigma_T}{do'} = \frac{\langle |\mathcal{M}|^2 \rangle}{(8\pi m_e)^2} = \frac{\alpha^2}{m_e^2}\, \frac{1 + (\hat{\boldsymbol p}_\gamma\cdot\hat{\boldsymbol p}_{\gamma'})^2}{2} . \tag{33.56}
\]

The coefficient of the differential cross-section is, with units $c$ and $\hbar$ restored,

\[
\left(\frac{\alpha\hbar}{m_e c}\right)^2 = \left(\frac{e^2}{m_e c^2}\right)^2 = r_e^2 , \tag{33.57}
\]

where $r_e \equiv e^2/(m_e c^2)$ is the classical electron radius. The total Thomson cross-section $\sigma_T$ is

\[
\sigma_T = \int \frac{d\sigma_T}{do'}\, do' = \frac{8\pi}{3}\, r_e^2 . \tag{33.58}
\]
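The factor $8\pi/3$ in the total cross-section comes from integrating the angular factor $[1 + \cos^2\theta]/2$ over the sphere, which can be confirmed numerically. The sketch below assumes that angular dependence and uses the CODATA value of the classical electron radius in SI units; the constant name and function names are illustrative.

```python
import math

# Classical electron radius in metres (CODATA value, used here as an input assumption).
R_E = 2.8179403262e-15


def thomson_angular_integral(n=10000):
    """Midpoint-rule integral of (1 + mu^2)/2 over the full sphere, mu = cos(theta)."""
    h = 2.0 / n
    total = 0.0
    for j in range(n):
        mu = -1.0 + (j + 0.5) * h
        total += 0.5 * (1.0 + mu * mu) * h
    return 2.0 * math.pi * total  # the azimuthal integral contributes 2*pi


def thomson_cross_section(r_e=R_E):
    """Total Thomson cross-section (8*pi/3) r_e^2, in square metres."""
    return 8.0 * math.pi / 3.0 * r_e**2


if __name__ == "__main__":
    print(thomson_angular_integral(), 8.0 * math.pi / 3.0)
    print(thomson_cross_section())
```

The resulting cross-section is about $6.65 \times 10^{-29}\,{\rm m}^2$, the number that sets the comoving mean free path $1/|\dot\tau|$ appearing throughout the photon hierarchy.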


33.9 The photon collision term for electron-photon scattering 


Electron-photon scattering keeps electrons and photons close to mutual thermodynamic equilibrium, and 
their unperturbed distributions can be taken to be in thermodynamic equilibrium. The unperturbed photon 
collision term for electron-photon scattering therefore vanishes, because of detailed balance, Exercise 31.5, 


$$ \overset{0}{C}[f_\gamma] = 0 . \tag{33.59} $$


Thanks to detailed balance, the combination of rates in the collision integral (31.40) almost cancels, so can 
be treated as being of linear order in perturbation theory. This allows other factors in the collision integral 
to be approximated by their unperturbed values. 

The photon collision term for electron-photon scattering follows from the general expression (31.40). To
unperturbed order, the energies of the electrons, which are non-relativistic, may be set equal to their rest
masses, $E_e = m_e$. Since photons are massless, their energies are just equal to their momenta, $E_\gamma = p_\gamma$. The
electron occupation number is small, $f_e \ll 1$, so the Fermi blocking factors for electrons may be neglected,
$1 - f_e \approx 1$. These considerations bring the photon collision term for electron-photon scattering to, from the
general expression (31.40),

$$ C[f_\gamma] = \frac{1}{16} \int \langle |\mathcal{M}|^2 \rangle \left[ - f_e f_\gamma ( 1 + f_{\gamma'} ) + f_{e'} f_{\gamma'} ( 1 + f_\gamma ) \right] (2\pi)^4 \delta_D^4 ( p_e + p_\gamma - p_{e'} - p_{\gamma'} ) \, \frac{2 \, d^3 p_e}{m_e (2\pi)^3} \, \frac{d^3 p_{e'}}{m_e (2\pi)^3} \, \frac{d^3 p_{\gamma'}}{p_{\gamma'} (2\pi)^3} . \tag{33.60} $$

The various integrations over momenta are most conveniently carried out as follows. The energy-conserving
integral is best done over the energy of the scattered photon $\gamma'$, which is scattered into an interval $do_{\gamma'}$ of
solid angle:

$$ \int 2\pi \, \delta_D ( E_e + E_\gamma - E_{e'} - E_{\gamma'} ) \, \frac{d^3 p_{\gamma'}}{E_{\gamma'} (2\pi)^3} = \frac{p_{\gamma'} \, do_{\gamma'}}{(2\pi)^2} \approx \frac{p_\gamma \, do_{\gamma'}}{(2\pi)^2} . \tag{33.61} $$


The approximation in the last step of equation (33.61), replacing the energy $p_{\gamma'}$ of the scattered photon by
the energy $p_\gamma$ of the incoming photon, is valid because, thanks to the smallness of the combination of rates
in the collision integral (33.60), it suffices to treat the photon energy to unperturbed order. As seen below,
equation (33.66), the energy difference $p_\gamma - p_{\gamma'}$ between the incoming and scattered photons is of linear order
in electron velocities.

The momentum-conserving integral is best done over the momentum of the scattered electron, which is $e'$
for outgoing scatterings $e + \gamma \rightarrow e' + \gamma'$, and $e$ for incoming scatterings $e + \gamma \leftarrow e' + \gamma'$. In the former case
($e + \gamma \rightarrow e' + \gamma'$),

$$ \int (2\pi)^3 \delta_D^3 ( \mathbf{p}_e + \mathbf{p}_\gamma - \mathbf{p}_{e'} - \mathbf{p}_{\gamma'} ) \, \frac{d^3 p_{e'}}{m_e (2\pi)^3} = \frac{1}{m_e} , \tag{33.62} $$

and the result is the same, $1/m_e$, in the latter case ($e + \gamma \leftarrow e' + \gamma'$). The energy- and momentum-conserving
integrals having been done, the electron $e'$ in the latter case may be relabelled $e$. So relabelled, the
combination of rate factors in the collision integral (33.60) becomes

$$ - f_e f_\gamma ( 1 + f_{\gamma'} ) + f_e f_{\gamma'} ( 1 + f_\gamma ) = f_e ( - f_\gamma + f_{\gamma'} ) . \tag{33.63} $$

Notice that the stimulated ($f_e f_\gamma f_{\gamma'}$) terms cancel. The energy- and momentum-conserving integrations (33.61)
and (33.62) bring the photon collision term (33.60) to


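$$ \overset{1}{C}[f_\gamma] = \frac{p_\gamma}{16 \pi m_e^2} \int \langle |\mathcal{M}|^2 \rangle \, f_e ( - f_\gamma + f_{\gamma'} ) \, \frac{2 \, d^3 p_e}{(2\pi)^3} \, \frac{do_{\gamma'}}{4\pi} . \tag{33.64} $$

The collision integral (33.64) involves the difference $- f_\gamma + f_{\gamma'}$ between the occupancy of the initial and
final photon states. To linear order, the difference is

$$ - f_\gamma + f_{\gamma'} = - \overset{0}{f}(p_\gamma) + \overset{0}{f}(p_{\gamma'}) - \overset{1}{f}(\mathbf{p}_\gamma) + \overset{1}{f}(\mathbf{p}_{\gamma'}) = - \frac{\partial \overset{0}{f}}{\partial \ln p_\gamma} \left[ \frac{p_\gamma - p_{\gamma'}}{p_\gamma} - \Theta(\mathbf{p}_\gamma) + \Theta(\mathbf{p}_{\gamma'}) \right] . \tag{33.65} $$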


The first term $(p_\gamma - p_{\gamma'})/p_\gamma$ arises because the incoming and scattered photon energies differ slightly. The
difference in photon energies is given by energy conservation:

$$ \begin{aligned} p_\gamma - p_{\gamma'} &= E_{e'} - E_e = \frac{p_{e'}^2 - p_e^2}{2 m_e} \\ &= \frac{( \mathbf{p}_\gamma - \mathbf{p}_{\gamma'} ) \cdot ( 2 \mathbf{p}_e + \mathbf{p}_\gamma - \mathbf{p}_{\gamma'} )}{2 m_e} \\ &\approx \frac{( \mathbf{p}_\gamma - \mathbf{p}_{\gamma'} ) \cdot \mathbf{p}_e}{m_e} , \end{aligned} \tag{33.66} $$

the last line of which follows because the photon momentum is small compared to the electron momentum,

$$ p_\gamma \sim T \sim \frac{p_e^2}{m_e} \ll p_e . \tag{33.67} $$

Because the photon energy difference is of first order, and the temperature fluctuation is already of first
order, it suffices to regard the temperature fluctuation $\Theta$ as being a function only of the direction $\hat{\mathbf p}_\gamma$ of the
photon momentum, not of its energy:

$$ \Theta(\mathbf{p}_\gamma) \approx \Theta(\hat{\mathbf p}_\gamma) . \tag{33.68} $$


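The linear approximations (33.66) and (33.68) bring the difference (33.65) between the initial and final
photon occupancies to

$$ - f_\gamma + f_{\gamma'} = - \frac{\partial \overset{0}{f}}{\partial \ln p_\gamma} \left[ ( \hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'} ) \cdot \frac{\mathbf{p}_e}{m_e} - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] . \tag{33.69} $$

Inserting this difference in occupancies into the collision integral (33.64) yields

$$ \overset{1}{C}[f_\gamma] = - \frac{p_\gamma}{16 \pi m_e^2} \frac{\partial \overset{0}{f}}{\partial \ln p_\gamma} \int \langle |\mathcal{M}|^2 \rangle \, f_e \left[ ( \hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'} ) \cdot \frac{\mathbf{p}_e}{m_e} - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] \frac{2 \, d^3 p_e}{(2\pi)^3} \, \frac{do_{\gamma'}}{4\pi} , \tag{33.70} $$

or, switching to $C[\Theta]$ defined by equation (33.41),

$$ C[\Theta] = \frac{a}{16 \pi m_e^2} \int \langle |\mathcal{M}|^2 \rangle \, f_e \left[ ( \hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'} ) \cdot \frac{\mathbf{p}_e}{m_e} - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] \frac{2 \, d^3 p_e}{(2\pi)^3} \, \frac{do_{\gamma'}}{4\pi} . \tag{33.71} $$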


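The amplitude squared $\langle |\mathcal{M}|^2 \rangle$, equation (33.55), is independent of electron momenta, so the integration
over electron momentum in the collision integral (33.71) is straightforward. The unperturbed electron density
$\bar n_e$ and the electron bulk velocity $\mathbf{v}_e$ are defined by

$$ \bar n_e \equiv \int \overset{0}{f}_e \, \frac{2 \, d^3 p_e}{(2\pi)^3} , \qquad \bar n_e \mathbf{v}_e \equiv \int f_e \, \frac{\mathbf{p}_e}{m_e} \, \frac{2 \, d^3 p_e}{(2\pi)^3} . \tag{33.72} $$

Coulomb scattering keeps electrons and ions tightly coupled, so the electron bulk velocity $\mathbf{v}_e$ equals the
baryon bulk velocity $\mathbf{v}_b$,

$$ \mathbf{v}_e = \mathbf{v}_b . \tag{33.73} $$

Integration over the electron momentum brings the collision integral (33.71) to

$$ C[\Theta] = \frac{\bar n_e a}{16 \pi m_e^2} \int \langle |\mathcal{M}|^2 \rangle \left[ ( \hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'} ) \cdot \mathbf{v}_b - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] \frac{do_{\gamma'}}{4\pi} . \tag{33.74} $$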


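Finally, the collision integral (33.74) must be integrated over the direction $\hat{\mathbf p}_{\gamma'}$ of the scattered photon. The
integration is facilitated if the angular dependence of the amplitude squared $\langle |\mathcal{M}|^2 \rangle$ given by equation (33.55)
is expanded in Legendre polynomials, $\tfrac{1}{2} ( 1 + \mu^2 ) = \tfrac{2}{3} \left[ 1 + \tfrac{1}{2} P_2(\mu) \right]$. Inserting the amplitude squared $\langle |\mathcal{M}|^2 \rangle$
into the collision integral (33.74) brings it to

$$ C[\Theta] = |\dot\tau| \int \left[ 1 + \tfrac{1}{2} P_2 ( \hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'} ) \right] \left[ ( \hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'} ) \cdot \mathbf{v}_b - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] \frac{do_{\gamma'}}{4\pi} , \tag{33.75} $$

where $\dot\tau \equiv - \bar n_e \sigma_T a$ is the scattering rate (32.4). Equation (33.75) (unlike equation (33.74)) remains
dimensionally correct even when units of $c$ and $\hbar$ are restored (both sides have units $1/\eta$). The $\hat{\mathbf p}_{\gamma'} \cdot \mathbf{v}_b$ term in the
integrand of (33.75) is odd, and vanishes on angular integration:

$$ \int \hat{\mathbf p}_{\gamma'} \cdot \mathbf{v}_b \, \frac{do_{\gamma'}}{4\pi} = 0 . \tag{33.76} $$

Similarly, the angular integral over the quadrupole of quantities independent of $\hat{\mathbf p}_{\gamma'}$ vanishes:

$$ \int P_2 ( \hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'} ) \left[ \hat{\mathbf p}_\gamma \cdot \mathbf{v}_b - \Theta(\hat{\mathbf p}_\gamma) \right] \frac{do_{\gamma'}}{4\pi} = 0 . \tag{33.77} $$

The collision integral (33.75) thus reduces to

$$ C[\Theta(\mathbf{x}, \hat{\mathbf p}_\gamma)] = |\dot\tau| \left\{ \hat{\mathbf p}_\gamma \cdot \mathbf{v}_b(\mathbf{x}) - \Theta(\mathbf{x}, \hat{\mathbf p}_\gamma) + \int \left[ 1 + \tfrac{1}{2} P_2 ( \hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'} ) \right] \Theta(\mathbf{x}, \hat{\mathbf p}_{\gamma'}) \, \frac{do_{\gamma'}}{4\pi} \right\} , \tag{33.78} $$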


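where the dependence of various quantities on comoving position $\mathbf{x}$ has been made explicit. Now transform
to Fourier space (in effect, replace comoving position $\mathbf{x}$ by comoving wavevector $\mathbf{k}$). Replace the baryon bulk
velocity by its scalar part, $\mathbf{v}_b = - i \hat{\mathbf k} v_b$, so that $\hat{\mathbf p}_\gamma \cdot \mathbf{v}_b = - i \mu v_b$. To perform the remaining angular integral
over the photon direction $\hat{\mathbf p}_{\gamma'}$, expand the Legendre polynomial $P_2 ( \hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'} )$ in the integrand in spherical
harmonics using the addition theorem (33.103), expand the temperature fluctuation $\Theta(\mathbf{k}, \hat{\mathbf p}_{\gamma'})$ in scalar multipole
moments according to equation (33.47), and invoke orthogonality of the spherical harmonics. With
$\mu \equiv \hat{\mathbf k} \cdot \hat{\mathbf p}_\gamma$, these manipulations bring the photon collision integral (33.78) at last to

$$ C[\Theta(\mathbf{k}, \hat{\mathbf p}_\gamma)] = |\dot\tau| \left[ - i \mu v_b(\mathbf{k}) - \Theta(\mathbf{k}, \hat{\mathbf p}_\gamma) + \Theta_0(\mathbf{k}) - \tfrac{1}{2} \Theta_2(\mathbf{k}) P_2(\mu) \right] . \tag{33.79} $$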
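The angular integral that leads from equation (33.78) to (33.79) can be checked numerically. The sketch below assumes the multipole convention $\Theta(\mu) = \sum_\ell (-i)^\ell (2\ell+1) \Theta_\ell P_\ell(\mu)$ (taken here to be the content of equation (33.47)), puts $\hat{\mathbf k}$ along the $z$-axis, and integrates over the scattered direction with a simple midpoint rule. The function names and the sample multipoles are hypothetical.

```python
import math


def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def theta(mu, multipoles):
    """Theta(mu) = sum_l (-i)^l (2l+1) Theta_l P_l(mu) (assumed convention)."""
    return sum((-1j) ** l * (2 * l + 1) * t * legendre(l, mu)
               for l, t in enumerate(multipoles))


def scattered_in(mu, multipoles, n=120):
    """Integral of [1 + P_2(p.p')/2] Theta(p') do'/(4 pi) over directions p'.

    k is along z; the incoming direction p lies in the x-z plane with
    cos(angle to k) = mu.
    """
    sin_t = math.sqrt(1.0 - mu * mu)
    total = 0j
    for i in range(n):
        mu_p = -1.0 + (i + 0.5) * 2.0 / n
        sin_tp = math.sqrt(1.0 - mu_p * mu_p)
        th = theta(mu_p, multipoles)
        for j in range(n):
            phi = (j + 0.5) * 2.0 * math.pi / n
            cos_angle = mu * mu_p + sin_t * sin_tp * math.cos(phi)
            total += (1.0 + 0.5 * legendre(2, cos_angle)) * th
    return total * (2.0 / n) * (2.0 * math.pi / n) / (4.0 * math.pi)


sample = [0.3, 0.1, 0.05, 0.02]          # Theta_0..Theta_3, arbitrary test values
mu = 0.4
numeric = scattered_in(mu, sample)
expected = sample[0] - 0.5 * sample[2] * legendre(2, mu)
print(numeric, expected)
```

Only the monopole and quadrupole survive the integration, which is why the scattering term in (33.79) involves $\Theta_0$ and $\Theta_2$ alone.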


33.10 Boltzmann equation for photons 


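Inserting the collision term (33.79) into equation (33.44) with unit velocity, $v = 1$, yields the Boltzmann
equation for the photon temperature fluctuation $\Theta(\eta, \mathbf{k}, \mu)$, for scalar fluctuations in conformal Newtonian
gauge,

$$ \dot\Theta - i k \mu \Theta - \dot\Phi - i k \mu \Psi = |\dot\tau| \left[ - i \mu v_b - \Theta + \Theta_0 - \tfrac{1}{2} \Theta_2 P_2(\mu) \right] . \tag{33.80} $$

Expanded into the scalar harmonics $\Theta_\ell(\eta, \mathbf{k})$, the photon Boltzmann equation (33.80) yields the hierarchy
of photon multipole equations

$$ \dot\Theta_0 - k \Theta_1 - \dot\Phi = 0 , \tag{33.81a} $$
$$ \dot\Theta_1 + \frac{k}{3} ( \Theta_0 - 2 \Theta_2 ) + \frac{k}{3} \Psi = \frac{1}{3} |\dot\tau| ( v_b - 3 \Theta_1 ) , \tag{33.81b} $$
$$ \dot\Theta_2 + \frac{k}{5} ( 2 \Theta_1 - 3 \Theta_3 ) = - \frac{9}{10} |\dot\tau| \Theta_2 , \tag{33.81c} $$
$$ \dot\Theta_\ell + \frac{k}{2\ell+1} \left[ \ell \Theta_{\ell-1} - (\ell+1) \Theta_{\ell+1} \right] = - |\dot\tau| \Theta_\ell \quad (\ell \geq 3) . \tag{33.81d} $$

When polarization is included, the factor $\tfrac{9}{10}$ on the right hand side of equation (33.81c) is decreased by a
factor $\tfrac{5}{6}$ to $\tfrac{3}{4}$, Exercise 35.7,

$$ \frac{9}{10} \rightarrow \frac{3}{4} . \tag{33.82} $$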

The Boltzmann hierarchy (33.81) shows that all the photon multipoles except the photon monopole $\Theta_0$ are
affected by electron-photon scattering, but only the photon dipole $\Theta_1$ depends directly on one of the baryon
variables, the baryon bulk velocity $v_b$. The dependence on the baryon velocity $v_b$ reflects the fact that, to
linear order, there is a transfer of momentum between photons and baryons, but no transfer of number or
of energy.
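To make the structure of the hierarchy (33.81) concrete, here is a minimal sketch of its right hand sides as a function that an ODE integrator could call. The function name, argument names, and the crude closure $\Theta_{\ell_{\max}+1} = 0$ are assumptions of this sketch, not the truncation adopted in the text.

```python
def photon_hierarchy_rhs(theta, k, tau_dot_abs, v_b, phi_dot, psi):
    """Return d(Theta_l)/d(eta) for l = 0..lmax, following equations (33.81).

    theta        list of multipoles Theta_0..Theta_lmax (lmax >= 3)
    k            comoving wavenumber
    tau_dot_abs  scattering rate |tau_dot|
    v_b          scalar baryon bulk velocity
    phi_dot      conformal-time derivative of the potential Phi
    psi          potential Psi
    Closure (assumed here, and crude): Theta_{lmax+1} = 0.
    """
    lmax = len(theta) - 1
    if lmax < 3:
        raise ValueError("need at least Theta_0..Theta_3")

    def th(l):
        return theta[l] if l <= lmax else 0.0

    d = [0.0] * (lmax + 1)
    # (33.81a): the monopole is not affected by scattering.
    d[0] = k * th(1) + phi_dot
    # (33.81b): the dipole exchanges momentum with the baryons.
    d[1] = (-(k / 3.0) * (th(0) - 2.0 * th(2)) - (k / 3.0) * psi
            + tau_dot_abs * (v_b - 3.0 * th(1)) / 3.0)
    # (33.81c): quadrupole damping factor 9/10 (no polarization).
    d[2] = -(k / 5.0) * (2.0 * th(1) - 3.0 * th(3)) - 0.9 * tau_dot_abs * th(2)
    # (33.81d): higher multipoles are damped at the full rate.
    for l in range(3, lmax + 1):
        d[l] = (-(k / (2 * l + 1)) * (l * th(l - 1) - (l + 1) * th(l + 1))
                - tau_dot_abs * th(l))
    return d


# A pure monopole with no scattering and no potentials drives only the dipole.
rates = photon_hierarchy_rhs([1.0, 0.0, 0.0, 0.0], k=1.0, tau_dot_abs=0.0,
                             v_b=0.0, phi_dot=0.0, psi=0.0)
print(rates)
```

With $v_b = 3\Theta_1$ the scattering term in the dipole equation vanishes, which is the tight-coupling statement that photons and baryons move together.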


33.10.1 Truncating the photon Boltzmann hierarchy 


Photons are tightly coupled to baryons by scattering well before recombination, and stream freely well after 
recombination. The two regimes are sufficiently different to require different truncations of the Boltzmann 
hierarchy. 

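As argued in §32.7, prior to recombination, when $|\dot\tau|$ is large, scattering keeps successive multipoles smaller
by factors of $k/|\dot\tau|$, equation (32.66). Keeping only the dominant $\Theta_{\ell-1}$ term on the left hand side of the
Boltzmann hierarchy (33.81) for $\ell \geq 2$ implies

$$ \Theta_\ell \approx - \frac{( 1 + \tfrac{1}{9} \delta_{\ell 2} ) \, \ell \, k}{(2\ell+1) |\dot\tau|} \, \Theta_{\ell-1} \quad \text{for } \ell \geq 2 . \tag{33.83} $$

When polarization is included, the factor $1 + \tfrac{1}{9} \delta_{\ell 2} = \tfrac{10}{9}$ for $\ell = 2$ on the right hand side of equation (33.83)
is changed to $1 + \tfrac{1}{3} \delta_{\ell 2} = \tfrac{4}{3}$ for $\ell = 2$, Exercise 35.7.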

After recombination, photons stream freely, allowing the photon distribution to develop higher order
multipoles comparable to lower orders. A better approximation in the free-streaming regime is the same as
that for neutrinos, equation (33.92),


22—1 
Oi Ory Peete es for £>2. (33.84) 
n 


The truncation of the photon Boltzmann hierarchy adopted in equations (33.4) is an interpolation between 
the scattering and free-streaming regimes (33.83) and (33.84). 


33.11 Baryons 


The equations governing baryonic matter are similar to those governing non-baryonic cold dark matter,
§33.3, except that baryons are collisional. Coulomb scattering between electrons and ions keeps baryons
tightly coupled to each other. Electron-photon scattering then couples baryons to photons.

Since the unperturbed distribution of baryons is in thermodynamic equilibrium, the unperturbed collision
term vanishes for each species of baryonic matter, as it did for photons, equation (33.59),

$$ \overset{0}{C}[f_b] = 0 . \tag{33.85} $$


For the perturbed baryon distribution, only the first and second moments of the phase-space distribution
are important, since these govern the baryon overdensity $\delta_b$ and bulk velocity $\mathbf{v}_b$. The relevant collision term
is the electron collision term associated with electron-photon scattering. Since electron-photon scattering
neither creates nor destroys electrons, the zeroth moment of the electron collision term vanishes,

$$ \int \overset{1}{C}[f_e] \, \frac{2 \, d^3 p_e}{m_e (2\pi)^3} = 0 . \tag{33.86} $$

The first moment of the electron collision term is most easily determined from the fact that electron-photon
collisions must conserve the total momentum of electrons and photons:

$$ \int \overset{1}{C}[f_e] \, m_e \mathbf{v}_e \, \frac{2 \, d^3 p_e}{m_e (2\pi)^3} + \int \overset{1}{C}[f_\gamma] \, \mathbf{p}_\gamma \, \frac{2 \, d^3 p_\gamma}{p_\gamma (2\pi)^3} = 0 . \tag{33.87} $$
Substituting the expression (33.78) for the photon collision integral into equation (33.87), separating out
factors depending on the magnitude $p_\gamma$ and direction $\hat{\mathbf p}_\gamma$ of the photon momentum, and taking into consideration
that the integral terms in equation (33.78), when multiplied by $\hat{\mathbf p}_\gamma$, are odd in $\hat{\mathbf p}_\gamma$, and therefore vanish
on integration over directions $\hat{\mathbf p}_\gamma$, gives


10) 
7 2d? e = Of, 2 2 4rp* dp, a A a doy 
[eta eal Me(2r)3 ada a ac pa (2a)? J [ Peter (6,)] Pi a (33.88) 


The integral over the magnitude $p_\gamma$ of the photon momentum in equation (33.88) yields $4 \bar\rho_\gamma$, in accordance
with equation (33.52). Transformed into Fourier space, and with only scalar terms retained, the collision
integral (33.88) becomes, with $\mu \equiv \hat{\mathbf k} \cdot \hat{\mathbf p}_\gamma$,


A 1 2 dp, oL : do 
k fot MeVe m27)? = ‘p,n.ova | [ivy + O] u T 
4 
= gilera (vp — 301) . (33.89) 


The result is that the equations governing the baryon overdensity $\delta_b$ and scalar bulk velocity $v_b$ look like
those (33.33) governing non-baryonic cold dark matter, except that the velocity equation has an additional
source (33.89) arising from momentum transfer with photons through electron-photon scattering:

$$ \dot\delta_b - k v_b - 3 \dot\Phi = 0 , \tag{33.90a} $$
$$ \dot v_b + \frac{\dot a}{a} v_b + k \Psi = - \frac{|\dot\tau|}{R} ( v_b - 3 \Theta_1 ) , \tag{33.90b} $$

where $R \equiv \tfrac{3}{4} \bar\rho_b / \bar\rho_\gamma$ is $\tfrac{3}{4}$ the baryon-to-photon density ratio, equation (32.46).


33.12 Boltzmann equation for relativistic neutrinos 


Neutrino oscillations indicate that at least two of the three neutrino types have mass, §42.4.15; but the
masses are (probably) small enough that all three neutrino types were relativistic until some time after
recombination, equation (10.111). As long as neutrinos are relativistic, the hierarchy of Boltzmann equations
is the same as that for photons, equations (33.81), but without the scattering terms,

$$ \dot{\mathcal N}_0 - k \mathcal{N}_1 - \dot\Phi = 0 , \tag{33.91a} $$
$$ \dot{\mathcal N}_1 + \frac{k}{3} ( \mathcal{N}_0 - 2 \mathcal{N}_2 ) + \frac{k}{3} \Psi = 0 , \tag{33.91b} $$
$$ \dot{\mathcal N}_\ell + \frac{k}{2\ell+1} \left[ \ell \mathcal{N}_{\ell-1} - (\ell+1) \mathcal{N}_{\ell+1} \right] = 0 \quad (\ell \geq 2) . \tag{33.91c} $$


The radiative transfer equation for neutrinos can be solved explicitly, equation (34.46). That solution,
which involves an integral over the line of sight, provides one way to calculate the multipoles needed in
the Einstein equations. However, computer codes that model the CMB commonly calculate the neutrino
multipoles $\mathcal{N}_\ell$ from the Boltzmann hierarchy (33.91) suitably truncated at some high harmonic $\ell_{\max}$. Since
free streaming allows high neutrino multipoles to become comparable to the monopole and dipole well
inside the horizon, it is not a good approximation simply to set neutrino multipoles to zero above some
maximum harmonic. A better approximation, which emerges from the radiative transfer solution (34.46), is
the approximation (34.49),


Qlmax = 1 
—2 + ôe 20) > ——$—_ Np 


2 ax—l 3 
max max k max 
n 


Ne x — (Ne 


max 


(33.92) 
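The way free streaming feeds power up the hierarchy (33.91) can be seen in a small numerical experiment. The sketch below sets the potentials to zero (an assumption made only to obtain a closed-form comparison), starts from a pure monopole, and integrates the hierarchy with a fourth-order Runge-Kutta step. With these assumptions the exact solution is $\mathcal{N}_\ell(\eta) = (-1)^\ell j_\ell(k\eta)$. The closure $\mathcal{N}_{\ell_{\max}+1} = 0$ used here is the crude one that the text warns against, so $\ell_{\max}$ is chosen well above $k\eta$. All names are hypothetical.

```python
import math


def hierarchy_rhs(n, k):
    """Right hand sides of (33.91) with Phi = Psi = 0 and N_{lmax+1} = 0."""
    lmax = len(n) - 1
    d = [0.0] * (lmax + 1)
    d[0] = k * n[1]
    for l in range(1, lmax + 1):
        upper = n[l + 1] if l < lmax else 0.0   # crude closure
        d[l] = -(k / (2 * l + 1)) * (l * n[l - 1] - (l + 1) * upper)
    return d


def free_stream(k, eta_end, lmax=40, steps=2000):
    """Integrate from eta = 0 with N_0 = 1 and all higher multipoles zero (RK4)."""
    n = [1.0] + [0.0] * lmax
    h = eta_end / steps
    for _ in range(steps):
        k1 = hierarchy_rhs(n, k)
        k2 = hierarchy_rhs([a + 0.5 * h * b for a, b in zip(n, k1)], k)
        k3 = hierarchy_rhs([a + 0.5 * h * b for a, b in zip(n, k2)], k)
        k4 = hierarchy_rhs([a + h * b for a, b in zip(n, k3)], k)
        n = [a + h * (b + 2 * c + 2 * d + e) / 6.0
             for a, b, c, d, e in zip(n, k1, k2, k3, k4)]
    return n


y = 5.0                                   # k * eta at the end of the run
multipoles = free_stream(k=1.0, eta_end=y)
j0 = math.sin(y) / y
j1 = math.sin(y) / y**2 - math.cos(y) / y
j2 = (3.0 / y**2 - 1.0) * math.sin(y) / y - 3.0 * math.cos(y) / y**2
print(multipoles[0], j0)
print(multipoles[1], -j1)
print(multipoles[2], j2)
```

Rerunning with a small `lmax` (comparable to or below $k\eta$) is a simple way to see why setting the multipoles to zero above the cutoff is a poor approximation.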


At superhorizon scales, the neutrino distribution was isotropic like any other species. But free-streaming
allowed neutrinos to develop significant anisotropy once the scale entered the horizon. Prior to recombination,
neutrinos provided the principal quadrupole pressure that sourced the difference $\Psi - \Phi$ of scalar potentials,
Figure 33.4. In Exercise 33.5 you will find that, surprisingly, neutrino anisotropy sourced a finite difference
$\Psi - \Phi$ even in the initial superhorizon conditions where the neutrino monopole dominated.


Exercise 33.5. Initial conditions in the presence of neutrinos. Prior to recombination, the neutrino
quadrupole pressure is the dominant source for the difference $\Psi - \Phi$ in scalar potentials, Figure 33.4. In
this problem you will find that the neutrino quadrupole leads to a finite difference $\Psi - \Phi$ even in the
initial conditions at superhorizon scales. Exercise 35.10 considers initial conditions for tensor fluctuations of
neutrinos.
1. Initially, only the neutrino monopole $\mathcal{N}_0$ is finite. In the Boltzmann hierarchy (33.91) of equations, the
lower order multipoles drive the higher multipoles, so that the equations reduce to the form $\dot{\mathcal N}_\ell \propto \mathcal{N}_{\ell-1}$.
Specifically, the Boltzmann hierarchy (33.91) reduces to, with $y \equiv k \eta$,

$$ \frac{d \mathcal{N}_0}{dy} = 0 , \tag{33.93a} $$
$$ \frac{d \mathcal{N}_1}{dy} = - \frac{1}{3} ( \mathcal{N}_0 + \Psi ) , \tag{33.93b} $$
$$ \frac{d \mathcal{N}_\ell}{dy} = - \frac{\ell}{2\ell+1} \, \mathcal{N}_{\ell-1} \quad (\ell \geq 2) . \tag{33.93c} $$

Show that the initial ($y \ll 1$) behaviour of the neutrino multipoles is

$$ \mathcal{N}_\ell = \frac{(-y)^\ell}{(2\ell+1)!!} \, ( \mathcal{N}_0 + \Psi ) \quad (\ell \geq 1) . \tag{33.94} $$


2. Let $f_\gamma$ and $f_\nu$ be the photon and neutrino fractions of the total radiation density,


5 5 62 (4) 
haz Sl- SS Ee a: (33.95) 
Py + Pv Py + Pv 2+62 (5) 
Show that the Einstein energy equation (33.7a) implies, initially, 
—Pe wore), (33.96) 


where ¢, = fy¢,+ f.¢,. Assume that the photon quadrupole is negligible (why?). Show that the Einstein 
quadrupole pressure equation (33.7b) implies, initially, 


b-O=—4f,(U+6+¢). (33.97) 


3. Conclude that, for adiabatic initial conditions Ç, = ¢,, 


10¢, 


Tas ec 
15 +F4f, * 


= (14+2f,)U. (33.98) 
33.13 Massive neutrinos


Once neutrinos become non-relativistic, they start to behave like matter, clustering gravitationally like non- 
baryonic cold dark matter and baryons. Each massive neutrino type defines a free-streaming scale, equal to 
the characteristic comoving distance that the neutrinos can travel before redshifting to a halt. This free- 
streaming scale equals approximately the comoving horizon size at the redshift when the neutrino type 
became non-relativistic, equation (10.111). Massive neutrinos tend to depress the matter power spectrum at 
scales smaller than the neutrino free-streaming scale. 

The suppression of matter power below the free-streaming scale is substantial (exponential) if massive 
neutrinos are a dominant component of matter, a scenario termed hot dark matter, or HDM. White, Frenk, 
and Davis (1983) used the absence of such suppression in the observed galaxy power spectrum to rule out 
HDM models 30 years ago. 


33.13.1 Simplified treatment of massive neutrinos 


A full treatment of massive neutrinos, §33.13.2, requires integrating a Boltzmann hierarchy of multipole
equations for each of a spectrum of neutrino momenta $p_\nu$. This is more complicated than the massless case,
where the fact that massless neutrinos follow the same null worldline regardless of the magnitude of their
momentum implies that a single Boltzmann hierarchy covers all momenta.

A simple approximate solution to the additional complexity introduced by mass is to assume an abrupt 
transition from relativistic to non-relativistic neutrinos at some time. This was the strategy suggested in 
Exercise 32.4. 

Another possible simplified strategy is to follow the Boltzmann hierarchy (33.100) for just a single
representative neutrino momentum $p_\nu$ near the peak of the distribution, $p_\nu / T_\nu \approx 1$.


33.13.2 Full treatment of massive neutrinos 


The Boltzmann equation for collisionless neutrinos with mass is equation (33.43) with zero collision term. For
massive neutrinos, the neutrino velocity $v_\nu$ depends on momentum, so the neutrino temperature fluctuation
$\mathcal{N} \equiv \delta T_\nu(\eta, \mathbf{k}, \mathbf{p}_\nu) / T_\nu(\eta)$ depends not only on the direction $\hat{\mathbf p}_\nu$ of the neutrino momentum, as in the massless
case, but also on its magnitude $p_\nu$. Since the temperature fluctuation $\mathcal{N}$ is already of first order, it suffices
to treat the particle velocity $v_\nu$ in the Boltzmann equation (33.43) to unperturbed order. To unperturbed
order, the momentum of a freely streaming neutrino redshifts as $p_\nu \propto a^{-1}$, and the temperature of the
unperturbed distribution redshifts in the same way, $T_\nu \propto a^{-1}$. Thus it is convenient to characterize the
neutrino temperature fluctuation as a function of the time-independent ratio $p_\nu / T_\nu$,

$$ \mathcal{N}(\eta, \mathbf{k}, \mathbf{p}_\nu) = \mathcal{N}(\eta, \mathbf{k}, p_\nu / T_\nu, \hat{\mathbf p}_\nu) . \tag{33.99} $$

The harmonics $\mathcal{N}_\ell(\eta, \mathbf{k}, p_\nu / T_\nu)$ of the temperature fluctuation are functions of the ratio $p_\nu / T_\nu$.
At the risk of being repetitious, the Boltzmann hierarchy of equations for a species of massive neutrino is,
equations (33.49),

$$ \dot{\mathcal N}_0 - v_\nu k \, \mathcal{N}_1 - \dot\Phi = 0 , \tag{33.100a} $$
$$ \dot{\mathcal N}_1 + \frac{v_\nu k}{3} ( \mathcal{N}_0 - 2 \mathcal{N}_2 ) + \frac{k}{3 v_\nu} \Psi = 0 , \tag{33.100b} $$
$$ \dot{\mathcal N}_\ell + \frac{v_\nu k}{2\ell+1} \left[ \ell \mathcal{N}_{\ell-1} - (\ell+1) \mathcal{N}_{\ell+1} \right] = 0 \quad (\ell \geq 2) , \tag{33.100c} $$

which differs from the massless neutrino hierarchy (33.91) in that it depends on the velocity $v_\nu \equiv p_\nu / E_\nu$ of
the neutrino. As long as neutrinos are relativistic, it suffices to follow a single hierarchy with $v_\nu = 1$. But
as neutrinos become non-relativistic, a full treatment requires following neutrinos with different momenta
$p_\nu$ separately. In due course the neutrinos become non-relativistic, and the equations re-simplify to the
non-relativistic limit.
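As a rough illustration of how the velocity $v_\nu = p_\nu / E_\nu$ entering the hierarchy (33.100) changes with cosmic scale factor, the sketch below evaluates it for one comoving momentum bin. The present-day neutrino temperature ($\approx 1.68 \times 10^{-4}$ eV), the sample mass, and the choice $p_\nu / T_\nu = 3.15$ are assumed illustrative values, and the function name is hypothetical.

```python
import math

T_NU_TODAY_EV = 1.68e-4   # assumed present-day neutrino temperature [eV]


def neutrino_velocity(a, mass_ev, p_over_t=3.15):
    """v = p/E for a neutrino whose momentum redshifts as p = (p/T) T_0 / a.

    a         cosmic scale factor (a = 1 today)
    mass_ev   neutrino mass in eV (illustrative input)
    p_over_t  time-independent ratio p_nu / T_nu labelling the momentum bin
    """
    p = p_over_t * T_NU_TODAY_EV / a
    return p / math.sqrt(p * p + mass_ev * mass_ev)


for a in (1e-6, 1e-3, 1e-2, 1.0):
    print(a, neutrino_velocity(a, mass_ev=0.05))
```

Each momentum bin $p_\nu / T_\nu$ has its own velocity history, which is why the full treatment needs a separate hierarchy per bin once $v_\nu$ departs from 1.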


Because the massive neutrino multipoles $\mathcal{N}_\ell(p_\nu / T_\nu)$ depend on neutrino momentum $p_\nu$, the perturbed
neutrino energy-momentum tensor $\overset{1}{T}{}^{mn}$ is more complicated than the massless case, equations (33.53). The
perturbed energy density, energy flux, monopole pressure, and quadrupole pressure of massive neutrinos are


te = f TE Nolo. /T) Bea , 
kote =i [Se Mlp / To) a 
bato = 4 | E Mpt) a ee 
($ kak, - 3 505) Te? = a 7 Nolpu/T,) = “eee 


33.14 Appendix: Legendre polynomials 


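The Legendre polynomials $P_\ell(\mu)$ satisfy the orthogonality relations

$$ \int_{-1}^{1} P_\ell(\mu) P_{\ell'}(\mu) \, d\mu = \frac{2}{2\ell+1} \, \delta_{\ell\ell'} , \tag{33.102} $$

the addition theorem

$$ P_\ell ( \hat{\mathbf a} \cdot \hat{\mathbf b} ) = \frac{4\pi}{2\ell+1} \sum_{m=-\ell}^{\ell} Y^{*}_{\ell m}(\hat{\mathbf a}) \, Y_{\ell m}(\hat{\mathbf b}) , \tag{33.103} $$

the recurrence relation

$$ \mu P_\ell(\mu) = \frac{1}{2\ell+1} \left[ \ell P_{\ell-1}(\mu) + (\ell+1) P_{\ell+1}(\mu) \right] , \tag{33.104} $$

and the derivative relation

$$ \frac{dP_\ell(\mu)}{d\mu} = \frac{\ell+1}{1 - \mu^2} \left[ \mu P_\ell(\mu) - P_{\ell+1}(\mu) \right] . \tag{33.105} $$

The first few Legendre polynomials are

$$ P_0(\mu) = 1 , \quad P_1(\mu) = \mu , \quad P_2(\mu) = - \tfrac{1}{2} + \tfrac{3}{2} \mu^2 . \tag{33.106} $$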
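The recurrence relation (33.104) gives a practical way to evaluate Legendre polynomials, and the orthogonality relation (33.102) can then be checked by direct quadrature. The following is a minimal sketch with hypothetical function names, using a midpoint rule rather than any specialised quadrature.

```python
def legendre(l, mu):
    """P_l(mu) by upward recurrence, rearranged from (33.104):
    (n+1) P_{n+1} = (2n+1) mu P_n - n P_{n-1}."""
    p_prev, p = 1.0, mu
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p


def overlap(l1, l2, steps=2000):
    """Midpoint-rule estimate of the integral of P_l1 P_l2 over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * h
        total += legendre(l1, mu) * legendre(l2, mu)
    return total * h


print(legendre(2, 0.5))          # P_2(1/2) = -1/2 + (3/2)(1/4)
print(overlap(2, 2), 2.0 / 5.0)  # should approach 2/(2l+1) for l = 2
print(overlap(2, 3))             # should approach zero
```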


34 


Fluctuations in the Cosmic Microwave 
Background 


Since the first definitive observation of the amplitude of the first peak of the power spectrum of temperature 
fluctuations in the CMB by the Boomerang balloon-based experiment (Bernardis et al., 2000), the observed 
power spectrum of the CMB has allowed cosmological parameters to be measured with ever-increasing 
precision, and has provided the primary basis for the Standard Model of Cosmology. It should be emphasized 
that the CMB power spectrum is by no means the only evidence supporting the Standard Model. What 
gives confidence in the Standard Model is the fact that a broad range of other astronomical observations 
are consistent with it, including the Hubble diagram of Type Ia supernovae, the clustering of matter and of
galaxies, Big Bang nucleosynthesis, and the age of the oldest stars. 

The power spectrum of the CMB depends on the harmonics $\Theta_\ell(\eta_0, \mathbf{k})$ of the CMB photon distribution at
the present time. A fast and elegant approach to calculating these harmonics was pointed out by Seljak and 
Zaldarriaga (1996). 


34.1 Radiative transfer of CMB photons 


To determine the harmonics $\Theta_\ell(\eta_0, \mathbf{k})$ of the CMB photon distribution today, return to the Boltzmann
equation (33.80) for the temperature fluctuation $\Theta(\eta, \mathbf{k}, \mu)$, where $\mu \equiv \hat{\mathbf k} \cdot \hat{\mathbf p}_\gamma$ is the cosine of the angle
between the wavevector $\mathbf{k}$ and the photon direction $\hat{\mathbf p}_\gamma$. It proves advantageous to rearrange the photon
Boltzmann equation as

$$ \left( \frac{\partial}{\partial \eta} - i k \mu + |\dot\tau| \right) ( \Theta + \Psi ) = I + |\dot\tau| S , \tag{34.1} $$

which in this context is called the radiative transfer equation. The terms on the right hand side are
source terms. The term $I$ on the right hand side of the radiative transfer equation (34.1) is a monopole term,
the Integrated Sachs-Wolfe (ISW) term,

$$ I(\eta, \mathbf{k}) \equiv \dot\Psi(\eta, \mathbf{k}) + \dot\Phi(\eta, \mathbf{k}) , \tag{34.2} $$


so-called because, as will be seen in equation (34.17), it contributes a temperature fluctuation that is an
integral along the line of sight to the CMB. The term $S$ on the right hand side of equation (34.1) embodies
source terms arising from Thomson scattering, Figure 35.1, a sum of monopole, dipole, and quadrupole
harmonics,

$$ S(\eta, \mathbf{k}, \mu) \equiv \Theta_0(\eta, \mathbf{k}) + \Psi(\eta, \mathbf{k}) - i \mu v_b(\eta, \mathbf{k}) - \tfrac{1}{2} \Theta_2(\eta, \mathbf{k}) P_2(\mu) = \sum_{n=0}^{2} (-i)^n S_n(\eta, \mathbf{k}) P_n(\mu) , \tag{34.3} $$

with harmonic coefficients $S_n(\eta, \mathbf{k})$,

$$ S_0 = \Theta_0 + \Psi , \tag{34.4a} $$
$$ S_1 = v_b , \tag{34.4b} $$
$$ S_2 = \tfrac{1}{2} \Theta_2 . \tag{34.4c} $$


The relatively simple structure of the Thomson scattering source functions (34.4), containing only monopole,
dipole, and quadrupole contributions, stems from the simple structure (33.55) of the quantum mechanical
amplitude squared $\langle |\mathcal{M}|^2 \rangle$ for non-relativistic electron-photon scattering, which contains only monopole and
quadrupole contributions. The dipole source $S_1$ is a Doppler term from the motion of the photon-baryon
fluid at velocity $v_b$.

The electron-photon (Thomson) scattering optical depth $\tau$ is defined by equation (32.3), the integral along
the line of sight of the scattering rate $\dot\tau$, equation (32.4). The optical depth is zero, $\tau_0 = 0$, at zero redshift, and
increases going backwards in time $\eta$ to higher redshift. The radiative transfer equation (34.1) can be written
(note that $\dot\tau$ is negative)

$$ e^{i k \mu \eta + \tau} \, \frac{\partial}{\partial \eta} \left[ e^{- i k \mu \eta - \tau} ( \Theta + \Psi ) \right] = I - \dot\tau S . \tag{34.5} $$

The solution for the photon distribution $\Theta(\eta_0, \mathbf{k}, \mu)$ today is obtained by integrating the radiative transfer
equation (34.5) over the line of sight from the Big Bang ($\eta = 0$) to the present time ($\eta = \eta_0$),

$$ \Theta(\eta_0, \mathbf{k}, \mu) + \Psi(\eta_0, \mathbf{k}) = \int_0^{\eta_0} \left[ I(\eta, \mathbf{k}) - \dot\tau S(\eta, \mathbf{k}, \mu) \right] e^{- i k \mu (\eta - \eta_0) - \tau} \, d\eta . \tag{34.6} $$

Notice that the left hand side of the solution (34.6) of the radiative transfer equation is not the temperature
fluctuation $\Theta(\eta_0, \mathbf{k}, \mu)$ by itself, but rather the temperature fluctuation redshifted by the potential,
$\Theta(\eta_0, \mathbf{k}, \mu) + \Psi(\eta_0, \mathbf{k})$. The potential $\Psi(\eta_0, \mathbf{k})$ is independent of the photon direction $\hat{\mathbf p}_\gamma$, so contributes only
to the monopole moment of the photon distribution.


34.1.1 Visibility function 
Introduce the visibility function $g(\eta)$ defined by

$$ g(\eta) \equiv - \dot\tau \, e^{-\tau} . \tag{34.7} $$

Figure 34.1 Visibility function $g(\eta)$ as a function of conformal time $\eta$ in units of the conformal time today, $\eta_0 = 1$.
The visibility function here is calculated from the Peebles approximation to recombination, Exercise 31.6. The dashed
vertical line marks the time $\eta_{\rm rec}$ of recombination, where the optical depth is one. The width $\sigma_{\rm rec}$ of recombination is
the standard deviation of a Gaussian fit to the core of the visibility function.

The visibility function $g(\eta)$, illustrated in Figure 34.1, acts like a smoothing window over recombination.
The visibility function is fairly narrowly peaked around recombination at $\eta = \eta_{\rm rec}$, and its integral is one,

$$ \int_0^{\eta_0} g(\eta) \, d\eta = \int_{\infty}^{0} - e^{-\tau} \, d\tau = \left[ e^{-\tau} \right]_{\infty}^{0} = 1 . \tag{34.8} $$

The visibility function $g(\eta)$ has an approximately Gaussian core, and a long tail extending past recombination.
The long tail arises because recombination leaves a finite residual electron density.
The solution (34.6) of the radiative transfer equation can be written in terms of the visibility function
$g(\eta)$ as

$$ \Theta(\eta_0, \mathbf{k}, \mu) + \Psi(\eta_0, \mathbf{k}) = \int_0^{\eta_0} \left[ e^{-\tau} I(\eta, \mathbf{k}) + g(\eta) S(\eta, \mathbf{k}, \mu) \right] e^{- i k \mu (\eta - \eta_0)} \, d\eta . \tag{34.9} $$
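The normalization of the visibility function, equation (34.8), holds for any optical depth that runs from infinity at the Big Bang to zero today. The sketch below checks this for a toy power-law optical depth. The toy model, its parameters `ETA_STAR` and `ETA_0`, and the function names are assumptions of this sketch and have nothing to do with a realistic recombination history.

```python
import math

ETA_STAR = 0.02   # hypothetical time at which the toy optical depth is of order one
ETA_0 = 1.0       # conformal time today, in units eta_0 = 1


def tau(eta):
    """Toy optical depth: infinite at eta = 0, zero at eta = ETA_0."""
    return (ETA_STAR / eta) ** 4 - (ETA_STAR / ETA_0) ** 4


def tau_dot(eta):
    """d(tau)/d(eta) for the toy model; negative, as in the text."""
    return -4.0 * ETA_STAR ** 4 / eta ** 5


def visibility(eta):
    """g(eta) = -tau_dot exp(-tau), equation (34.7)."""
    return -tau_dot(eta) * math.exp(-tau(eta))


def integrate_visibility(n=20000):
    """Midpoint-rule integral of g from eta = 0 to eta = ETA_0."""
    h = ETA_0 / n
    return sum(visibility((i + 0.5) * h) for i in range(n)) * h


norm = integrate_visibility()
grid = [(i + 0.5) * ETA_0 / 20000 for i in range(20000)]
eta_peak = max(grid, key=visibility)
print(norm, eta_peak)
```

For this toy model the peak of $g$ sits where $(\eta_*/\eta)^4 = 5/4$, slightly before the time at which the optical depth is one.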


34.2 Harmonics of the CMB photon distribution 


If the temperature fluctuations on the CMB sky are statistically isotropic, then the statistical properties of
the CMB commute with the rotation operator (the angular momentum operator), which implies that the
power spectrum of CMB fluctuations is diagonal in a basis of eigenfunctions of the rotation operator. The
eigenfunctions are spherical harmonics. Thus it is natural to expand the temperature fluctuation
$\Theta(\eta_0, \mathbf{k}, \mu) + \Psi(\eta_0, \mathbf{k})$ in harmonics, equation (33.47).
The $e^{- i k \mu (\eta - \eta_0)}$ factor in the integrand on the right hand side of equation (34.9) can be expanded in
harmonics through the general formula

$$ e^{- i \mu y} = \sum_{\ell=0}^{\infty} (-i)^\ell (2\ell+1) P_\ell(\mu) \, j_\ell(y) , \tag{34.10} $$

where $j_\ell(y) \equiv \sqrt{\pi/(2y)} \, J_{\ell+1/2}(y)$ are spherical Bessel functions, and here $y = k(\eta - \eta_0)$. The source function
$S$ that premultiplies the factor $e^{- i k \mu (\eta - \eta_0)}$ in the integrand of equation (34.9) is a sum of harmonics,
equation (34.3). It is useful to introduce modified spherical Bessel functions $j_{\ell n}(y)$ defined by an expansion
analogous to (34.10),

$$ (-i)^n P_n(\mu) \, e^{- i \mu y} = \sum_{\ell=0}^{\infty} (-i)^\ell (2\ell+1) P_\ell(\mu) \, j_{\ell n}(y) . \tag{34.11} $$

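The orthogonality relations of the Legendre polynomials, equation (33.102), imply that

$$ j_{\ell n}(y) = i^{\ell - n} \int_{-1}^{1} e^{- i \mu y} P_\ell(\mu) P_n(\mu) \, \frac{d\mu}{2} , \tag{34.12} $$

which implies that $j_{\ell n}$ is symmetric or antisymmetric in its indices $\ell n$ as their difference $\ell - n$ is even or odd,

$$ j_{\ell n}(y) = (-)^{\ell - n} j_{n \ell}(y) . \tag{34.13} $$

The Legendre functions $P_n(\mu)$ are polynomials in $\mu$, §33.14. Acting on $e^{- i \mu y}$, these polynomials can be
replaced by derivatives with respect to $y$ through

$$ P_n(\mu) \, e^{- i \mu y} = P_n \! \left( i \frac{\partial}{\partial y} \right) e^{- i \mu y} . \tag{34.14} $$

The resulting modified spherical Bessel functions $j_{\ell n}(y)$ with $n = 0, 1, 2$ relevant here are

$$ j_{\ell 0} = j_\ell , \quad j_{\ell 1} = j_\ell' , \quad j_{\ell 2} = \tfrac{1}{2} j_\ell + \tfrac{3}{2} j_\ell'' , \tag{34.15} $$

where $'$ denotes the total derivative, $j_\ell' \equiv dj_\ell(y)/dy$. The modified spherical Bessel functions are even or odd
as $j_{\ell n}(-y) = (-)^{\ell + n} j_{\ell n}(y)$. The harmonic expansion of equation (34.9) is thus

$$ \Theta_\ell(\eta_0, \mathbf{k}) + \delta_{\ell 0} \Psi(\eta_0, \mathbf{k}) = \int_0^{\eta_0} \left\{ e^{-\tau} I(\eta, \mathbf{k}) \, j_\ell [ k(\eta - \eta_0) ] + g(\eta) \sum_{n=0}^{2} S_n(\eta, \mathbf{k}) \, j_{\ell n} [ k(\eta - \eta_0) ] \right\} d\eta , \tag{34.16} $$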


where $g(\eta)$ is the visibility function defined by equation (34.7). With the ISW and scattering source terms $I$
and $S_n$ written out explicitly, equation (34.16) is an integral from the Big Bang ($\eta = 0$) to the present time

$$ \begin{aligned} \Theta_\ell(\eta_0, \mathbf{k}) + \delta_{\ell 0} \Psi(\eta_0, \mathbf{k}) = \int_0^{\eta_0} \Big( \; & e^{-\tau} \left[ \dot\Psi(\eta, \mathbf{k}) + \dot\Phi(\eta, \mathbf{k}) \right] j_\ell [ k(\eta - \eta_0) ] && \text{ISW} \\ & + g(\eta) \Big\{ \left[ \Theta_0(\eta, \mathbf{k}) + \Psi(\eta, \mathbf{k}) \right] j_\ell [ k(\eta - \eta_0) ] && \text{monopole} \\ & \qquad + v_b(\eta, \mathbf{k}) \, j_{\ell 1} [ k(\eta - \eta_0) ] && \text{dipole} \\ & \qquad + \tfrac{1}{2} \Theta_2(\eta, \mathbf{k}) \, j_{\ell 2} [ k(\eta - \eta_0) ] \Big\} \Big) \, d\eta && \text{quadrupole} . \end{aligned} \tag{34.17} $$

Figure 34.2 Illustrative example of the factors that go into the (left) monopole ($n = 0$) and (right) dipole ($n = 1$)
contributions to the integrand of the solution (34.17) of the radiative transfer equation, as a function of conformal time
$\eta$, in units $\eta_0 = 1$. The example is for a representative wavenumber $k/(a_{\rm eq} H_{\rm eq}) = 2$, and harmonic number $\ell = 200$.
The factors are the visibility function $g(\eta)$, equation (34.7), scattering source terms $S_n(\eta, \mathbf{k})$, equations (34.4), and
modified spherical Bessel functions $j_{\ell n} [ k(\eta_0 - \eta) ]$, equations (34.15). The visibility function $g(\eta)$ has been scaled to
1 at its peak, and the monopole and dipole spherical Bessel functions $j_\ell$ and $j_\ell'$ have been scaled so $j_\ell$ equals 1 at its
(first) peak. The cosmological model is as given in §32.3. The dashed vertical line marks the time $\eta_{\rm rec}$ of recombination,
where the Thomson optical depth is one.
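The modified spherical Bessel functions can be checked numerically by comparing the integral definition (34.12) with the derivative forms (34.15). The sketch below does this for $\ell = 2$, using closed forms for $j_0$, $j_1$, $j_2$, a midpoint rule for the integral, and central finite differences for the derivatives. The function names, step sizes, and test point are choices of this sketch.

```python
import cmath
import math


def legendre(l, x):
    """P_l(x) by the standard upward recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def j_sph(l, y):
    """Closed forms of the spherical Bessel functions j_0, j_1, j_2 (y != 0)."""
    s, c = math.sin(y), math.cos(y)
    if l == 0:
        return s / y
    if l == 1:
        return s / y**2 - c / y
    if l == 2:
        return (3.0 / y**2 - 1.0) * s / y - 3.0 * c / y**2
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")


def j_mod(l, n, y, steps=4000):
    """j_{ln}(y) from the integral (34.12), by the midpoint rule in mu."""
    h = 2.0 / steps
    total = 0j
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * h
        total += cmath.exp(-1j * mu * y) * legendre(l, mu) * legendre(n, mu)
    return (1j ** (l - n)) * total * h / 2.0


y, l = 3.0, 2
d1 = (j_sph(l, y + 1e-4) - j_sph(l, y - 1e-4)) / 2e-4                    # j_l'
d2 = (j_sph(l, y + 1e-3) - 2.0 * j_sph(l, y) + j_sph(l, y - 1e-3)) / 1e-6  # j_l''

print(j_mod(l, 0, y), j_sph(l, y))
print(j_mod(l, 1, y), d1)
print(j_mod(l, 2, y), 0.5 * j_sph(l, y) + 1.5 * d2)
```

The same routine can be used to confirm the index symmetry (34.13), for example that `j_mod(1, 2, y)` is minus `j_mod(2, 1, y)`.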


The term in the first line on the right hand side of equation (34.17) is an integral of the time derivative of
the gravitational potential $\Psi + \Phi$ over the line of sight, and is called the Integrated Sachs-Wolfe (ISW)
effect. The remaining terms are linear combinations of the monopole, dipole, and quadrupole scattering
source terms $S_n$, equations (34.4). Note that the monopole term (on both sides of equation (34.17)) is not
the temperature fluctuation $\Theta_0$ by itself, but rather $\Theta_0 + \Psi$, which is the temperature fluctuation redshifted
by the potential $\Psi$. In the tight-coupling approximation, the baryon velocity on the third line approximates
the photon velocity, $v_b \approx 3 \Theta_1$.

Figure 34.2 shows an illustrative example of the factors that go into the monopole and dipole contributions 
to the integrand of the solution (34.17) of the radiative transfer equation. 


34.2.1 Harmonics of the CMB with respect to observed photon directions 


A final consideration is that the observed direction $\hat{\mathbf n}$ of a photon from the CMB is opposite to the photon's
direction of motion, $\hat{\mathbf n} = - \hat{\mathbf p}_\gamma$. Photon multipoles $\Theta_\ell^{\rm obs}$ expanded with respect to the direction $\hat{\mathbf n}$ of observation
are obtained from $\Theta_\ell$ by flipping the sign of the photon direction, $\Theta^{\rm obs}(\eta, \mathbf{k}, \mu) = \Theta(\eta, \mathbf{k}, -\mu)$. Flipping the
sign of $\hat{\mathbf p}_\gamma$ is equivalent to flipping the sign of odd parity fluctuations,

$$ \Theta_\ell^{\rm obs}(\eta_0, \mathbf{k}) = (-)^\ell \, \Theta_\ell(\eta_0, \mathbf{k}) . \tag{34.18} $$

Another way to achieve the sign flip is to flip the sign of the argument $k(\eta - \eta_0) \rightarrow k(\eta_0 - \eta)$ of the modified
Bessel functions $j_{\ell n}$ in equation (34.17) and simultaneously to flip the sign of the odd source functions $S_n$,
namely $S_1 = v_b \rightarrow - v_b$. The CMB power spectrum involves products of pairs of $\Theta_\ell^{\rm obs}$ with the same $\ell$, and
is unaffected by the sign flip $\hat{\mathbf n} = - \hat{\mathbf p}_\gamma$ in the solution (34.17) of the radiative transfer equation.


34.2.2 Integrated Sachs-Wolfe (ISW) effect 


The first line of the solution (34.17) of the radiative transfer equation is an integral of the time derivative
of the potential $\Psi + \Phi$ along the line of sight to recombination. The contribution is called the Integrated
Sachs-Wolfe (ISW) effect. If matter dominates the background, then the potential $\Psi + \Phi$ is constant in time
for linear fluctuations, and there is no ISW effect. In practice, there are “early” and “late” ISW effects that
arise respectively from the contributions of radiation to the background density shortly after recombination,
and of dark energy (and possibly curvature) near the present time. The late time scalar potentials $\Psi$ and
$\Phi$ (which are equal at late times) evolve in proportion to the growth factor $g(a)$, equation (30.128) (not
to be confused with the visibility function $g(\eta)$). The ISW integrand splits accordingly into early and late
contributions,

$$ e^{-\tau} \, \frac{d(\Psi + \Phi)}{d\eta} = \underbrace{ e^{-\tau} g \, \frac{d}{d\eta} \left( \frac{\Psi + \Phi}{g} \right) }_{\text{early ISW}} + \underbrace{ e^{-\tau} \left( \frac{\Psi + \Phi}{g} \right) \frac{dg}{d\eta} }_{\text{late ISW}} . \tag{34.19} $$

The early and late time contributions to the ISW term are illustrated in Figure 34.3. 

In addition to the early and late ISW effects, there is a “nonlinear” ISW effect. Nonlinear gravitational 
clustering causes the potential ® to change in time, becoming deeper (more negative) in more highly clustered 
regions. Photons that travel through a cluster see a slightly deeper potential when they exit the cluster than 
when they entered it, causing the photon to be slightly redshifted. Figure 34.3 does not include the nonlinear 
ISW effect. 

Figure 34.3 ISW integrand $e^{-\tau} (\dot\Psi + \dot\Phi)$ in equation (34.17) as a function of conformal time $\eta$, in units $\eta_0 = 1$, for
the standard flat ΛCDM model of §32.3. Curves are labelled with the wavenumber $k/(a_{\rm eq} H_{\rm eq})$ in units of the Hubble
distance at matter-radiation equality. In a matter-dominated cosmology, the gravitational potentials are constant, and 
the curves would all be zero. The high early values following recombination result from the contribution of radiation to 
the mean mass-energy density; this is the “early” ISW effect. The turn-up at later times results from the contribution 
of a cosmological constant to the mean mass-energy density; this is the “late” ISW effect, indicated by the dashed lines. 
The late time ISW contribution causes a slight turn-up in the CMB power spectrum at the largest scales, Figure 34.8, 
a characteristic signature of a cosmological constant. 


34.2.3 CMB transfer function in Fourier space 


As seen in Chapters 30-33, during linear evolution, scalar modes of given comoving wavevector $\boldsymbol{k}$ evolve
with amplitude proportional to the initial curvature fluctuation $\zeta(\boldsymbol{k})$. The evolution of the amplitude may
be encapsulated in a CMB transfer function $T_\ell(\eta, k)$ defined by

$$ T_\ell(\eta, k) \equiv \frac{\Theta_\ell(\eta, \boldsymbol{k}) + \delta_{\ell 0} \Psi(\eta, \boldsymbol{k})}{\zeta(\boldsymbol{k})} , \tag{34.20} $$

with $\Theta_\ell(\eta, \boldsymbol{k}) + \delta_{\ell 0} \Psi(\eta, \boldsymbol{k})$ computed from equation (34.17). By isotropy, the CMB transfer function $T_\ell(\eta, k)$
is a function only of the magnitude $k$ of the wavevector $\boldsymbol{k}$.

The square of the transfer function transforms the primordial power spectrum $P_\zeta(k)$ defined by
equation (30.132) into the CMB power spectrum $C_\ell(\eta, k)$ in Fourier space, equation (34.27), which is in turn
related to the observed CMB power spectrum $C_\ell(\eta_0)$ in real space today by equation (34.34).

Figure 34.4 CMB transfer functions $T_\ell(\eta_0, k)$ for a selection of harmonics $\ell$, plotted
(left) linearly, showing the oscillating functions and their envelopes, and (right) logarithmically over a broader range
of wavenumber $k$, showing only the envelopes of the underlying oscillating functions. The total (black) is a sum of
the various contributions in equation (34.17): monopole (dark blue), dipole (light blue), quadrupole (cyan), early ISW
(purple), and late ISW (red). The total envelope (black) omits the late ISW contribution, since the late ISW is
non-oscillatory where it is important (at small $\ell$ and small $k$). The cosmological model is the flat ΛCDM concordance model
of §32.3. The computation is a Boltzmann computation including photon and neutrino multipoles up to $\ell_{\rm max} = 16$.

Figure 34.4 shows CMB transfer functions $T_\ell(\eta_0, k)$ at the present time, $\eta = \eta_0$, for a selection of harmonics,
$\ell$ = 2, 20, 200, and 2000. The CMB transfer functions are calculated by integrating numerically, for each of
many wavenumbers $k$, the solution (34.17) of the radiative transfer equation. The CMB transfer functions
shown in Figure 34.4 are from a Boltzmann computation including photon and neutrino multipoles up to
$\ell_{\rm max} = 16$.

Spherical Bessel functions $j_\ell(y)$ are small for $y \lesssim \ell$, rise to their first peak at $y \approx \ell + \ell^{1/3}$, and are then
oscillating and declining at $y \gg \ell$. This behaviour translates into a similar behaviour in the CMB transfer
functions $T_\ell(\eta_0, k)$, Figure 34.4. The transfer functions are small for $k(\eta_0 - \eta_{\rm rec}) \lesssim \ell$, peak at

$$ k(\eta_0 - \eta_{\rm rec}) \approx \ell + \ell^{1/3} , \tag{34.21} $$

and then oscillate at $k(\eta_0 - \eta_{\rm rec}) \gg \ell$ with an exponentially declining envelope, as illustrated in the right
panels of Figure 34.4,

$$ \left. T_\ell(\eta_0, k) \right|_{\rm envelope} \propto \exp(-k\eta_0/600) . \tag{34.22} $$

The exponential decline is caused in part by dissipative processes around the time of recombination, §32.7
and §32.8, and in part by the finite width of recombination, which tends to smooth over oscillating source
functions $S_n$ at large wavenumber $k$, §34.2.4.
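
The estimate that $j_\ell(y)$ first peaks near $y \approx \ell + \ell^{1/3}$ can be checked numerically. The following is a minimal sketch, not the computation used for the figures; the function names `sph_jl_upward` and `first_peak` are hypothetical, and the upward recurrence is assumed adequate only for $y \gtrsim \ell$, which is all that is needed to find the first peak.

```python
import math

def sph_jl_upward(l, y):
    """Spherical Bessel j_l(y) by upward recurrence from j_0 and j_1.

    Upward recurrence is only reliable for y >~ l, which is all that is
    needed here to locate the first peak.
    """
    j_prev = math.sin(y) / y                           # j_0
    if l == 0:
        return j_prev
    j_curr = math.sin(y) / y**2 - math.cos(y) / y      # j_1
    for n in range(1, l):
        # j_{n+1}(y) = (2n+1)/y j_n(y) - j_{n-1}(y)
        j_prev, j_curr = j_curr, (2 * n + 1) / y * j_curr - j_prev
    return j_curr

def first_peak(l, dy=0.01):
    """Scan upward from y = l until j_l stops rising; return that y."""
    y = float(l)
    value = sph_jl_upward(l, y)
    while True:
        nxt = sph_jl_upward(l, y + dy)
        if nxt < value:
            return y
        y, value = y + dy, nxt

if __name__ == "__main__":
    # Compare the numerical first peak with the estimate l + l^(1/3).
    for l in (2, 20, 200):
        print(l, round(first_peak(l), 2), round(l + l ** (1.0 / 3.0), 2))
```

The scan starts at $y = \ell$ because $j_\ell$ is still rising there, so the first local maximum found is the first peak. For the harmonics tried here the numerical peak lies within about one unit of $\ell + \ell^{1/3}$.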

Besides the total, Figure 34.4 shows the monopole, dipole, quadrupole, and early and late ISW contributions
to the CMB transfer functions. The contributions are, excepting late ISW, highly oscillatory, thanks
to the Bessel factors $j_{\ell n}[k(\eta - \eta_0)]$ in the integrand of equation (34.17). The Figure therefore shows also the
envelope of the oscillatory contributions. The envelope is computed as an integral in which the Bessel factor
$j_{\ell n}(y)$ in the integrand is replaced by the non-oscillatory absolute value of the complex Hankel factor $h_{\ell n}(y)$,

$$ h_{\ell n}(y) \equiv j_{\ell n}(y) + \begin{cases} 0 & |y| < \ell + \tfrac{1}{2} + (\ell + \tfrac{1}{2})^{1/3} , \\ i \, y_{\ell n}(y) & |y| \geq \ell + \tfrac{1}{2} + (\ell + \tfrac{1}{2})^{1/3} , \end{cases} \tag{34.23} $$

with $y_{\ell n}(y)$ the modified spherical Bessel function of the second kind (whereas $j_{\ell n}(y)$ is the modified spherical
Bessel function of the first kind). The cut at $|y| = \ell + \tfrac{1}{2} + (\ell + \tfrac{1}{2})^{1/3}$, which is roughly the location of the
first zero of $y_{\ell n}(y)$, is introduced to prevent the diverging behaviour of $y_{\ell n}(y)$ as $y \rightarrow 0$ from dominating the
integral.


34.2.4 Instantaneous and rapid recombination approximations 


At wavelengths much larger than the width of recombination, $k \sigma_{\rm rec} \ll 1$, recombination can be approximated
as instantaneous. In the instantaneous recombination approximation, the visibility function $g(\eta)$ is a
delta-function at $\eta = \eta_{\rm rec}$, and, without the ISW term, the multipoles $\Theta_\ell(\eta_0, k)$ of the temperature fluctuation
today are

$$ \Theta_\ell(\eta_0, k) + \delta_{\ell 0} \Psi(\eta_0, k) \approx \sum_{n=0}^{2} S_n(\eta_{\rm rec}, k) \, j_{\ell n}[k(\eta_{\rm rec} - \eta_0)] . \tag{34.24} $$


A better approximation that works also at larger $k$ is the rapid recombination approximation, which
replaces the source functions $S_n$ by their averages $\bar{S}_n$ over recombination. In the rapid approximation, the
temperature multipoles $\Theta_\ell(\eta_0, k)$ today are, including the early ISW, monopole, dipole, and quadrupole
contributions,

$$ \Theta_\ell(\eta_0, k) + \delta_{\ell 0} \Psi(\eta_0, k) \approx \bar{S}_{\rm early}(k) \, j_\ell[k(\eta_{\rm early} - \eta_0)] + \sum_{n=0}^{2} \bar{S}_n(k) \, j_{\ell n}[k(\eta_{\rm rec} - \eta_0)] , \tag{34.25a} $$

$$ \bar{S}_{\rm early}(k) \equiv \int e^{-\tau} g \, \frac{d}{d\eta} \left( \frac{\Psi(\eta, k) + \Phi(\eta, k)}{g} \right) d\eta , \quad \bar{S}_n(k) \equiv \int g(\eta) \, S_n(\eta, k) \, d\eta , \tag{34.25b} $$

where in the early ISW term $g$ denotes the growth factor (30.128) rather than the visibility function $g(\eta)$.
The early ISW effect peaks at a redshift $z_{\rm early} \sim 900$ slightly after the redshift $z_{\rm rec} \sim 1100$ of recombination,
Figure 34.3, but because the conformal time $\eta_0$ today is so much larger than $\eta_{\rm rec}$, the difference between
$\eta_{\rm early} - \eta_0$ and $\eta_{\rm rec} - \eta_0$ is slight enough that it is fair to approximate $\eta_{\rm early} \approx \eta_{\rm rec}$.

Figure 34.5 Rapid and instantaneous approximations to the Thomson scattering and early ISW source functions at
recombination, equations (34.25), as a function of wavenumber $k/(a_{\rm eq} H_{\rm eq})$. The solid (bluish) lines are the values $\bar{S}_n(k)$
averaged over the visibility function $g(\eta)$, while the dashed (greenish) lines are the instantaneous values $S_n(\eta_{\rm rec}, k)$
at recombination, where the Thomson optical depth is one. The purple line is the averaged early ISW source function
$\bar{S}_{\rm early}(k)$. The source functions are normalized to unit primordial curvature, $\zeta(k) = 1$ (in other words, the plotted
source functions are transfer functions). The computation is the hydrodynamic approximation (Boltzmann with $\ell_{\rm max} = 2$),
since this turns out to yield a better rapid recombination approximation to the CMB power spectrum than a full
Boltzmann treatment, Figure 34.7. The dipole source term is taken to be $3\Theta_1$ (the tight-coupling limit), not $v_b$, since
this yields a better rapid recombination approximation. The cosmological model is the standard flat ΛCDM model
described in §32.3.

Figure 34.5 shows the early ISW, monopole, dipole, and quadrupole source terms that go into the
instantaneous approximation (dashed lines) and the rapid recombination approximation (solid lines). The
instantaneous and rapid approximations $S_n(\eta_{\rm rec}, k)$ and $\bar{S}_n(k)$ to the Thomson scattering source functions agree
at small wavenumbers $k$, where the source terms are slowly varying over the visibility function. The rapid
approximation works also at larger wavenumbers, where the source functions $S_n(\eta, k)$ change significantly
over the course of recombination. Averaging over recombination tends to reduce the Thomson scattering
source functions compared to their instantaneous values at recombination.

Figure 34.6 CMB transfer functions computed from a Boltzmann treatment (solid lines) including photon and neutrino
multipoles up to $\ell_{\rm max} = 16$, compared to their values in the rapid recombination approximation (dotted lines) with
the source functions taken from the hydrodynamic approximation, for a selection of harmonics $\ell$, as marked. All lines
are the envelopes of the underlying rapidly oscillating transfer functions. The left and right panels are the same,
but with wavenumber $k$ plotted linearly on the left, logarithmically on the right. The transfer functions plotted here
include monopole, dipole, quadrupole, and early ISW contributions, but exclude the late ISW contribution. The dipole
contribution to the rapid recombination approximation is computed from the photon velocity $3\Theta_1$ (the tight-coupling
limit), not the baryon velocity $v_b$. The cosmological model is the standard flat ΛCDM model described in §32.3.

The baryon velocity decouples from the photon velocity during recombination, and grows large as baryons
fall into the dark matter potential wells, as illustrated in Figures 32.1 or 33.1. As a result, the rapid
approximation tends to overestimate the true dipole contribution if the baryon bulk velocity $v_b$ is used for the
dipole source term $S_1$. A simple empirical fix is to use the bulk photon velocity $v_\gamma = 3\Theta_1$ in place of the
baryon velocity to compute $\bar{S}_1$. This fix is adopted in Figures 34.5 and 34.6.

Figure 34.6 compares (envelopes of the rapidly oscillating) CMB transfer functions $T_\ell(\eta_0, k)$ computed
from a Boltzmann treatment to their values in the rapid recombination approximation, equation (34.25),
with source functions computed in the hydrodynamic approximation, for a selection of harmonics $\ell$. Whereas
the envelopes computed in the Boltzmann treatment are rather smooth at large wavenumber $k$ (and exponentially
declining, equation (34.22)), the envelopes computed in the rapid recombination approximation remain
somewhat oscillatory at large wavenumber $k$. However, what is important is that the approximation yields
approximately the correct overall amplitude of the transfer functions; the CMB power spectrum (34.35)
involves integrating over transfer functions, which washes out the residual oscillatory structure. The
hydrodynamic approximation (rather than a Boltzmann treatment) is used for the source functions because
it (hydro + rapid) turns out to yield a better approximation (than Boltzmann + rapid) to the CMB
power spectrum, as illustrated in the right panel of Figure 34.7.


34.2.5 CMB power spectrum in Fourier space 


The power spectrum $C_\ell(\eta, k)$ in Fourier space is defined to be the expectation value of the variance of
temperature multipoles $\Theta_\ell(\eta, \boldsymbol{k})$,

$$ (2\pi)^3 \delta_D(\boldsymbol{k}' + \boldsymbol{k}) \, \frac{\delta_{\ell'\ell}}{4\pi} \, C_\ell(\eta, k) \equiv \left\langle [\Theta_{\ell'}(\eta, \boldsymbol{k}') + \delta_{\ell' 0} \Psi(\eta, \boldsymbol{k}')] \, [\Theta_\ell(\eta, \boldsymbol{k}) + \delta_{\ell 0} \Psi(\eta, \boldsymbol{k})] \right\rangle . \tag{34.26} $$

The power spectrum $C_\ell(\eta, k)$ is real-valued. The momentum conserving delta-function $(2\pi)^3 \delta_D(\boldsymbol{k}' + \boldsymbol{k})$ is a
consequence of the assumed statistical homogeneity of space, while the angular-momentum conserving
delta-function $\delta_{\ell'\ell}$ is a consequence of the assumed statistical isotropy of space. By isotropy, the power spectrum
$C_\ell(\eta, k)$ is a function only of the magnitude $k$ of the wavevector $\boldsymbol{k}$. The monopole power $C_0(\eta, k)$ is defined
to be the variance of the redshifted monopole $\Theta_0 + \Psi$ because that is what appears in the solution (34.17)
of the radiative transfer equation.

In terms of the CMB transfer function (34.20) and the primordial power spectrum $P_\zeta(k)$ defined by
equation (30.132), the CMB power spectrum $C_\ell(\eta, k)$ is

$$ C_\ell(\eta, k) = 4\pi \, |T_\ell(\eta, k)|^2 \, P_\zeta(k) . \tag{34.27} $$


34.3 CMB in real space 


34.3.1 CMB harmonics in real space 


The solution (34.17) of the radiative transfer equation is in terms of photon multipoles $\Theta_\ell(\eta, \boldsymbol{k})$ in Fourier
space, but astronomers observe the CMB in real space. The real-space temperature fluctuation $\Theta(\eta, \boldsymbol{x}, \hat{n})$ at
time $\eta$ and comoving position $\boldsymbol{x}$ in observed direction $\hat{n}$ on the sky is related to the Fourier-space temperature
fluctuation by

$$ \Theta(\eta, \boldsymbol{x}, \hat{n}) = \int e^{i \boldsymbol{k} \cdot \boldsymbol{x}} \, \Theta(\eta, \boldsymbol{k}, \hat{n}) \, \frac{d^3k}{(2\pi)^3} . \tag{34.28} $$

Astronomers observe the temperature fluctuation $\Theta(\eta_0, \boldsymbol{x}_0, \hat{n})$ now, at time $\eta_0$, and here, at position $\boldsymbol{x}_0$.
Without loss of generality, our position can be taken to be at the origin, $\boldsymbol{x}_0 = 0$, in which case the phase
factor is unity, $e^{i \boldsymbol{k} \cdot \boldsymbol{x}_0} = 1$, and can be omitted,

$$ \Theta(\eta_0, \boldsymbol{x}_0, \hat{n}) = \int \Theta(\eta_0, \boldsymbol{k}, \hat{n}) \, \frac{d^3k}{(2\pi)^3} . \tag{34.29} $$

The spherical harmonic expansion of the observed real-space temperature fluctuation today is, with a
conventional choice of normalization of harmonics $\Theta_{\ell m}$,

$$ \Theta(\eta_0, \boldsymbol{x}_0, \hat{n}) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} \Theta_{\ell m}(\eta_0, \boldsymbol{x}_0) \, Y_{\ell m}(\hat{n}) . \tag{34.30} $$

The sum includes the monopole $\ell = 0$ harmonic because the mean temperature of the observable Universe
may differ from the “true” mean temperature of the Universe. From the perspective of statistics, such a
difference between the observed and true mean temperature can exist even though it is unobservable to an
astronomer confined to position $\boldsymbol{x}_0$. An astronomer in a cosmologically distant future when the horizon is
much larger than today would be able to measure the difference. The spherical harmonic expansion (33.47) of
the Fourier-space temperature fluctuation may be written, in view of the relation (33.103) between Legendre
polynomials and spherical harmonics,

$$ \Theta(\eta_0, \boldsymbol{k}, \hat{n}) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} (-i)^\ell \, 4\pi \, \Theta_\ell(\eta_0, \boldsymbol{k}) \, Y^*_{\ell m}(\hat{k}) \, Y_{\ell m}(\hat{n}) . \tag{34.31} $$

From equations (34.29)-(34.31) it follows that the real-space photon harmonics are

$$ \Theta_{\ell m}(\eta_0, \boldsymbol{x}_0) = 4\pi (-i)^\ell \int \Theta_\ell(\eta_0, \boldsymbol{k}) \, Y^*_{\ell m}(\hat{k}) \, \frac{d^3k}{(2\pi)^3} . \tag{34.32} $$


34.3.2 CMB power spectrum in real space 


The CMB power spectrum $C_\ell(\eta_0)$ on the sky today is defined to be the expectation value of the variance of
temperature multipoles $\Theta_{\ell m}(\eta_0, \boldsymbol{x}_0)$,

$$ \delta_{\ell'\ell} \, \delta_{m'm} \, C_\ell(\eta_0) \equiv \left\langle [\Theta_{\ell'm'}(\eta_0, \boldsymbol{x}_0) + \delta_{\ell' 0} \Psi(\eta_0, \boldsymbol{x}_0)]^* \, [\Theta_{\ell m}(\eta_0, \boldsymbol{x}_0) + \delta_{\ell 0} \Psi(\eta_0, \boldsymbol{x}_0)] \right\rangle . \tag{34.33} $$

The power spectrum $C_\ell(\eta_0)$ is real-valued. By homogeneity, the power spectrum $C_\ell(\eta_0)$ is independent of
observer position $\boldsymbol{x}_0$. The real-space monopole harmonic $\Theta_{00}(\eta_0, \boldsymbol{x}_0) + \Psi(\eta_0, \boldsymbol{x}_0)$ is the temperature fluctuation
gravitationally redshifted by the potential $\Psi(\eta_0, \boldsymbol{x}_0)$ at our position today. From the perspective of an
observer at fixed position $\boldsymbol{x}_0$, the redshifted monopole is observationally indistinguishable from a rescaling
of the mean temperature.

From the expression (34.32) for the real-space harmonics in terms of Fourier-space harmonics, together
with the power spectrum (34.26) of the Fourier-space harmonics, it follows that the power spectrum $C_\ell(\eta_0)$
of real-space harmonics of the CMB today is

$$ C_\ell(\eta_0) = \int_0^\infty C_\ell(\eta_0, k) \, \frac{4\pi k^2 dk}{(2\pi)^3} . \tag{34.34} $$

In terms of the CMB transfer function $T_\ell$ and primordial curvature power spectrum $P_\zeta$ or its dimensionless
equivalent $\Delta_\zeta^2$, equation (30.134), the power spectrum $C_\ell(\eta_0)$ is, from equation (34.27),

$$ C_\ell(\eta_0) = 4\pi \int_0^\infty |T_\ell(\eta_0, k)|^2 \, P_\zeta(k) \, \frac{4\pi k^2 dk}{(2\pi)^3} = 4\pi \int_0^\infty |T_\ell(\eta_0, k)|^2 \, \Delta_\zeta^2(k) \, \frac{dk}{k} . \tag{34.35} $$

If the primordial power spectrum $\Delta_\zeta^2$ is a power-law with tilt $n$, equation (30.137), then the CMB power
spectrum today is

$$ C_\ell(\eta_0) = 4\pi \int_0^\infty |T_\ell(\eta_0, k)|^2 \, \Delta_\zeta^2(k_p) \left( \frac{k}{k_p} \right)^{n-1} \frac{dk}{k} . \tag{34.36} $$

As discussed in §34.2.3, the CMB transfer functions $T_\ell(\eta_0, k)$ are small for $k(\eta_0 - \eta_{\rm rec}) \lesssim \ell$, peak near
$k(\eta_0 - \eta_{\rm rec}) \approx \ell$ (or more precisely, at wavenumbers slightly larger, equation (34.21)), and then oscillate
with an exponentially declining envelope, equation (34.22). Thus the power spectrum $C_\ell(\eta_0)$ (34.35) at
harmonic $\ell$ principally probes comoving scales $1/k$ that are $1/\ell$ times the comoving distance $\eta_0 - \eta_{\rm rec}$ to
recombination today,

$$ \frac{1}{k} \approx \frac{\eta_0 - \eta_{\rm rec}}{\ell} . \tag{34.37} $$

Physically, harmonic number $\ell$ probes angular scale $\theta \sim \pi/\ell$ on the sky, and the power spectrum at harmonic
number $\ell$ probes comoving scale $\pi/k \sim \theta \, (\eta_0 - \eta_{\rm rec})$ on the CMB sky.
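
The correspondence between harmonic number, wavenumber, and angular scale can be turned into a small calculator. The sketch below is only a rule of thumb based on $1/k \approx (\eta_0 - \eta_{\rm rec})/\ell$ and $\theta \sim \pi/\ell$; the function name `scales_probed` is hypothetical, and the comoving distance of 14000 Mpc used in the usage example is an assumed round number for illustration, not a value taken from the text.

```python
import math

def scales_probed(ell, distance_to_rec):
    """Rule-of-thumb scales probed by harmonic ell.

    distance_to_rec is the comoving distance eta_0 - eta_rec to
    recombination, in whatever length unit the caller prefers.
    """
    k = ell / distance_to_rec          # wavenumber where T_ell peaks, roughly
    theta = math.pi / ell              # angular scale on the sky, in radians
    return {
        "k": k,
        "one_over_k": 1.0 / k,
        "theta_deg": math.degrees(theta),
        "half_wavelength": math.pi / k,   # equals theta * distance_to_rec
    }

if __name__ == "__main__":
    # Assumed illustrative distance to recombination, in Mpc.
    for ell in (2, 20, 200, 2000):
        print(ell, scales_probed(ell, 14000.0))
```

With these definitions, $\ell = 180$ corresponds to an angular scale of one degree, whatever distance is assumed.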


34.3.3 Rapid recombination approximation to the CMB power spectrum 


Modern, publicly available codes such as CAMB compute an entire model CMB power spectrum $C_\ell(\eta_0)$ in
just a few seconds, which is amazingly fast. CAMB is tuned for speed, doing only as many calculations as are
needed to achieve a desired accuracy. CAMB is written in a fast language, parallelized Fortran 90. If you’d
like to write a code that competes with CAMB in speed, expect to invest substantial time developing it.
It’s more than just an exercise.

Meanwhile, the rapid recombination approximation, §34.2.4, offers a short-cut to computing the CMB 
power spectrum that at least captures qualitative features. The rapid recombination approximation effectively 
sidesteps step 5 of the numerical computation outlined in §30.8. 


Exercise 34.1. CMB power spectrum in the instantaneous and rapid recombination approximations.
Compute the CMB power spectrum $C_\ell(\eta_0)$ today in your choice of the instantaneous and rapid
recombination approximations, equations (34.24) or (34.25), with source functions calculated in your choice
of level of detail (simple, §30.7, hydrodynamic, §32.2, or full Boltzmann, §33.1). Discuss.

Figure 34.7 Model CMB power spectra computed in the rapid recombination approximation, equation (34.25), with
source functions computed in: (left) the simple approximation, §30.7, without (short dashed line) and with (long dashed
line) artificial damping, equations (30.59) and (30.60) with $\epsilon = 10^{-8}$; and (right) the hydrodynamic approximation
(§32.2, short dashed line), and in a full Boltzmann treatment (§33.1, long dashed line) including photon and neutrino
multipoles up to $\ell_{\rm max} = 16$. The solid (black) lines are a reference model power spectrum computed with CAMB.
The CAMB spectrum is similar to that shown in Figure 10.3, but without refinements from reionization and lensing.

Solution. See Figures 34.7 and 34.8. I used the standard flat ΛCDM cosmological parameters given in §32.3,
and the normalization of the power spectrum measured from Planck, equation (30.138). I used Mathematica
to solve the evolutionary equations in the simple, hydrodynamic, and Boltzmann approaches. But for the
integral (34.35), I abandoned fighting Mathematica, and resorted to a publicly available implementation of
Bessel functions (Amos, 1986), and a cubic spline integration implemented in Fortran. In Figure 34.8 (but
not Figure 34.7) I added the late ISW contribution. The late ISW transfer function is not oscillatory, so its
computation from integration of the derivative of the growth function over the line of sight, equations (34.17)
and (34.19), is numerically straightforward. Comments:

1. The hydrodynamic and Boltzmann computations get the phasing of peaks more or less right. The
phasing of peaks depends on the sound speed in the photon-baryon fluid, which depends on the
baryon-to-photon density ratio. The agreement with the hydrodynamic and Boltzmann computations supports
the standard model, where the baryonic density begins to become comparable to the photon density near
recombination, equation (32.46). The simple approximation gets the phasing slightly wrong because it
neglects baryons.

2. The overall angular location of the peaks is correctly reproduced. The overall angular location of peaks
depends on geometry, that is, on the apparent angular size of comoving distances at recombination
observed by astronomers on Earth today. The geometry depends on various cosmological parameters,
notably the curvature $\Omega_k$ and the Hubble parameter $H_0$ today.

3. The power spectrum is roughly constant and dominated by the monopole at the largest scales, $\ell \lesssim 40$.
This is the Sachs-Wolfe plateau, §34.5, a signature of a near-scale-invariant primordial power spectrum.
The weak minimum at $\ell \sim 20$ results mainly from a cancellation between the monopole $\Theta_0 + \Psi$ and
early ISW contributions, as might be expected from Figure 34.5. The late ISW effect contributes a small
enhancement in power in the first several harmonics.

4. The even peaks are stronger than the odd peaks in the hydrodynamic and Boltzmann computations.
The difference in strengths between even and odd peaks is caused by baryon loading, §32.10, in which the
extra gravity generated by baryons in the oscillating photon-baryon fluid enhances even (compression)
peaks and weakens odd (rarefaction) peaks. The simple approximation does not show the even-odd
variation because it treats baryons as having negligible density.

5. The power spectrum $C_\ell(\eta_0)$ declines approximately exponentially with harmonic number $\ell$. The decline
arises partly from dissipative processes around the time of recombination, §32.7 and §32.8, and in part
from the finite width of recombination, §34.2.4.

Figure 34.8 Monopole + early ISW (0), dipole (1), quadrupole (2), and late ISW contributions to the CMB power
spectrum in the rapid recombination approximation with source functions computed in the hydrodynamic
approximation (top model in the right panel of Figure 34.7). The monopole and early ISW are combined into a single curve,
labelled 0, since they are highly correlated. To a good approximation, the total power spectrum is an incoherent sum
of the monopole and dipole power spectra, the quadrupole contribution being quite small. There are sub-dominant
cross-correlations between the various contributions, which are not plotted separately here, but which are included in
the total (black line). The late ISW contribution is computed from integration of the derivative of the growth function
over the line of sight, equations (34.17) and (34.19).


Exercise 34.2. CMB power spectra from CAMB. Compute model CMB power spectra from a publicly
available code such as CAMB (google it). Vary the cosmological parameters. Compare to published
measurements from Planck or other sources (google it). Formulate a question, and attempt to answer it. For
example, what does the observed power spectrum say about:

1. non-baryonic cold dark matter;
2. baryons;
3. photons;
4. neutrinos;
5. dark energy;
6. curvature;
7. the origin of fluctuations?


34.4 Observing CMB power 


The power spectrum $C_\ell(\eta_0)$, equation (34.34), gives an expectation value for the variance (34.33) of CMB
temperature fluctuations on the sky, which can be compared to observation. Isotropy predicts that $\Theta_{\ell m}(\eta_0, \boldsymbol{x}_0)$
with different $\ell$ or $m$ should be uncorrelated, a prediction that can be tested by observation.

Inflation, which predicts that fluctuations are generated by quantum fluctuations of the scalar inflaton field
that supposedly drove inflation, generically predicts a Gaussian distribution of fluctuations in the primordial
curvature $\zeta$. This in turn implies a Gaussian distribution of temperature fluctuations $\Theta$ as long as the
fluctuations remain in the linear regime. The Gaussian distribution of temperature fluctuations $\Theta = \delta T/T$
is characterized entirely by its variance, the power spectrum $C_\ell$.

For each harmonic number $\ell$, there are $2\ell + 1$ harmonics $\Theta_{\ell m}$ with the same $\ell$ but different $m$. Isotropy
predicts that the expected variance is the same, $C_\ell$, for each $m$. Thus one way to estimate the variance $C_\ell$
is to take

$$ \hat{C}_\ell = \frac{1}{2\ell + 1} \sum_{m=-\ell}^{\ell} |\Theta_{\ell m}|^2 . \tag{34.38} $$

The finite number $2\ell + 1$ of modes at each $\ell$ places a fundamental fractional uncertainty of $\sim 1/\sqrt{2\ell + 1}$ on
the accuracy with which $C_\ell$ can be determined observationally. This fundamental limit, which arises from
the finite size of the observable Universe, is called cosmic variance.

In practice there are numerous issues that complicate the measurement of the CMB power spectrum $C_\ell$,
including incomplete sky coverage, contamination by Earth glow, microwave foregrounds arising from galactic
and extragalactic synchrotron radiation, dust, and free-free emission, and observational and detector noise
and systematics of one sort or another.
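
Cosmic variance can be illustrated with a small Monte Carlo experiment on an idealized full sky with no noise or foregrounds. The sketch below draws Gaussian harmonics $\Theta_{\ell m}$ with a chosen true variance $C_\ell$, applies the estimator $\frac{1}{2\ell+1} \sum_m |\Theta_{\ell m}|^2$, and measures its scatter over many simulated skies. The function names are hypothetical. For Gaussian fluctuations the scatter works out to $\sqrt{2/(2\ell+1)}$, consistent with the order-of-magnitude estimate $\sim 1/\sqrt{2\ell+1}$.

```python
import math
import random
import statistics

def draw_harmonics(l, cl, rng):
    """Draw the 2l+1 harmonics (m = -l..l) of a real Gaussian field.

    Each harmonic has variance <|Theta_lm|^2> = cl, and the reality of
    the field imposes Theta_{l,-m} = (-1)^m conj(Theta_{lm}).
    """
    sigma = math.sqrt(cl / 2.0)
    positive = [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                for _ in range(l)]                         # m = 1..l
    zero = complex(rng.gauss(0.0, math.sqrt(cl)), 0.0)     # m = 0 is real
    negative = [(-1) ** m * positive[m - 1].conjugate()
                for m in range(l, 0, -1)]                  # m = -l..-1
    return negative + [zero] + positive

def estimate_cl(theta_lm):
    """Average of |Theta_lm|^2 over the available harmonics."""
    return sum(abs(t) ** 2 for t in theta_lm) / len(theta_lm)

def monte_carlo(l, cl, realizations, seed=0):
    """Mean and fractional scatter of the estimator over many skies."""
    rng = random.Random(seed)
    estimates = [estimate_cl(draw_harmonics(l, cl, rng))
                 for _ in range(realizations)]
    mean = statistics.mean(estimates)
    return mean, statistics.pstdev(estimates) / mean

if __name__ == "__main__":
    for l in (2, 10, 100):
        mean, scatter = monte_carlo(l, 1.0, 2000)
        print(l, mean, scatter, math.sqrt(2.0 / (2 * l + 1)))
```

A real observer has only one sky, so the scatter measured here over many realizations is an irreducible uncertainty on the single available estimate.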


34.5 Large-scale CMB fluctuations (Sachs-Wolfe effect) 


The behaviour of the CMB power spectrum at the largest angular scales was first predicted by Sachs and 
Wolfe (1967), and is therefore called the Sachs-Wolfe effect, though why it should be called an effect is 
mysterious. The Sachs-Wolfe (SW) effect is distinct from the Integrated Sachs-Wolfe (ISW) effect. The ISW 
effect, ignored in this section, was considered in §34.2.2. 

At scales much larger than the sound horizon at recombination, $k \eta_{s,{\rm rec}} \ll 1$, the redshifted monopole
fluctuation $\Theta_0(\eta_{\rm rec}, k) + \Psi(\eta_{\rm rec}, k)$ at recombination is much larger than the dipole $\Theta_1(\eta_{\rm rec}, k)$ or quadrupole
$\Theta_2(\eta_{\rm rec}, k)$, so only the monopole contributes materially to the temperature multipoles $\Theta_\ell(\eta_0, k)$ today. The
redshifted monopole contribution to the temperature multipoles $\Theta_\ell(\eta_0, k)$ today is, from equation (34.17),

$$ \Theta_\ell(\eta_0, k) + \delta_{\ell 0} \Psi(\eta_0, k) = [\Theta_0(\eta_{\rm rec}, k) + \Psi(\eta_{\rm rec}, k)] \, j_\ell[k(\eta_0 - \eta_{\rm rec})] . \tag{34.39} $$

At superhorizon scales $k \eta_{\rm rec} \ll 1$, the radiation monopole at the time $\eta_{\rm rec}$ of recombination is given by the
superhorizon solution $\Theta_0 - \Phi = \zeta_\gamma$, equation (30.63b), so

$$ \Theta_0(\eta_{\rm rec}, k) + \Psi(\eta_{\rm rec}, k) = \Psi_{\rm super}(\eta_{\rm rec}, k) + \Phi_{\rm super}(\eta_{\rm rec}, k) + \zeta_\gamma(k) . \tag{34.40} $$

The CMB transfer function $T_\ell(\eta, k)$ is conventionally normalized to the primordial curvature fluctuation $\zeta(k)$,
equation (34.20). For adiabatic fluctuations $\zeta$ is the same for all species; more generally, $\zeta$ could be different for
different species. For definiteness, take the simple two-component matter plus radiation model of Chapter 30,
where the superhorizon potential in the late matter-dominated regime is $\Phi({\rm late}) = -\tfrac{3}{5} \zeta_c$, equation (30.68),
for both adiabatic and isocurvature initial conditions. In the approximation that recombination is in the
matter-dominated regime (which is not quite true), and the scalar potentials are equal (which is again
not quite true), so $\Psi(\eta_{\rm rec}) + \Phi(\eta_{\rm rec}) \approx 2 \Phi_{\rm super}({\rm late}) = -\tfrac{6}{5} \zeta_c$, the CMB transfer function for the radiation
monopole at superhorizon scales, normalized to $\zeta_c$, is

$$ T_0(\eta_{\rm rec}, k) \equiv \frac{\Psi_{\rm super}(\eta_{\rm rec}, k) + \Phi_{\rm super}(\eta_{\rm rec}, k) + \zeta_\gamma(k)}{\zeta_c(k)} \approx -\frac{6}{5} + \frac{\zeta_\gamma(k)}{\zeta_c(k)} = \begin{cases} -\tfrac{1}{5} & \mbox{adiabatic} , \\ -\tfrac{6}{5} & \mbox{isocurvature} . \end{cases} \tag{34.41} $$

The monopole transfer function $T_0(\eta_{\rm rec}, k)$ at recombination is thus approximately constant at superhorizon
scales, although the value of the constant depends on the initial conditions.

At superhorizon scales, the CMB transfer function $T_\ell(\eta_0, k)$ in the $\ell$’th harmonic today is, from
equation (34.39),

$$ T_\ell(\eta_0, k) = T_0(\eta_{\rm rec}, k) \, j_\ell[k(\eta_0 - \eta_{\rm rec})] . \tag{34.42} $$

The resulting CMB angular power spectrum at superhorizon scales $k \eta_{\rm rec} \ll 1$ is

$$ C_\ell(\eta_0) = 4\pi \, T_0(\eta_{\rm rec}, k)^2 \int_0^\infty j_\ell[k(\eta_0 - \eta_{\rm rec})]^2 \, \Delta_\zeta^2(k) \, \frac{dk}{k} , \tag{34.43} $$

where $T_0(\eta_{\rm rec}, k)$, being approximately constant, equation (34.41), has been taken outside the integral. If the
primordial curvature power spectrum $\Delta_\zeta^2(k)$ is a power law with tilt $n$, equation (30.137), then the integral
over the squared Bessel function can be done analytically, equation (34.56b), yielding

$$ C_\ell(\eta_0) = 4\pi \, T_0(\eta_{\rm rec}, k)^2 \, \Delta_\zeta^2[1/(\eta_0 - \eta_{\rm rec})] \, U_{\ell\ell}(n - 1) . \tag{34.44} $$

For the particular case of a scale-invariant primordial power spectrum, $n = 1$, the CMB power spectrum $C_\ell$
at large scales today is given by

$$ \frac{\ell(\ell + 1) C_\ell(\eta_0)}{2\pi} = T_0(\eta_{\rm rec}, k)^2 \, \Delta_\zeta^2[1/(\eta_0 - \eta_{\rm rec})] \quad \mbox{if } n = 1 . \tag{34.45} $$

Thus the characteristic feature of a scale-invariant primordial power spectrum, $n = 1$, is that $\ell(\ell + 1) C_\ell$
should be approximately constant at the largest angular scales, $\ell \lesssim \eta_0/\eta_{\rm rec}$. The normalization factor
$1/(2\pi)$ converts $\ell(\ell + 1) C_\ell$ to the power of large scale fluctuations in the potential at recombination. This is the reason
that CMB folk routinely plot $\ell(\ell + 1) C_\ell/(2\pi)$ rather than $C_\ell$.
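
The plateau in $\ell(\ell + 1) C_\ell$ for $n = 1$ rests on the standard integral $\int_0^\infty j_\ell(y)^2 \, dy/y = 1/[2\ell(\ell+1)]$, which can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the Bessel routine is intended only for small $\ell$ (roughly $\ell \leq 5$), and the integral is cut off at a finite $y_{\rm max}$ with the tail approximated by the asymptotic average $j_\ell^2 \approx 1/(2y^2)$.

```python
import math

def sph_jl(l, y):
    """Spherical Bessel j_l(y) for small l (intended for l <= ~5)."""
    if y < 1.0:
        # Power series:
        # j_l(y) = y^l/(2l+1)!! * sum_k (-y^2/2)^k / (k! (2l+3)...(2l+2k+1))
        prefactor = y ** l
        for i in range(1, l + 1):
            prefactor /= 2 * i + 1
        term = total = 1.0
        for k in range(1, 12):
            term *= -(y * y / 2.0) / (k * (2 * l + 2 * k + 1))
            total += term
        return prefactor * total
    # Upward recurrence from j_0 and j_1, adequate here for y >= 1.
    j_prev = math.sin(y) / y
    if l == 0:
        return j_prev
    j_curr = math.sin(y) / y**2 - math.cos(y) / y
    for n in range(1, l):
        j_prev, j_curr = j_curr, (2 * n + 1) / y * j_curr - j_prev
    return j_curr

def sachs_wolfe_integral(l, y_min=1e-3, y_max=200.0, steps=40000):
    """Trapezoidal estimate of the integral of j_l(y)^2 dy/y, done in ln y."""
    u0, u1 = math.log(y_min), math.log(y_max)
    h = (u1 - u0) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * sph_jl(l, math.exp(u0 + i * h)) ** 2
    # Beyond y_max, j_l^2 averages 1/(2 y^2), giving a tail 1/(4 y_max^2).
    return total * h + 1.0 / (4.0 * y_max**2)

if __name__ == "__main__":
    for l in (2, 3, 4):
        print(l, sachs_wolfe_integral(l), 1.0 / (2 * l * (l + 1)))
```

Multiplying the integral by $4\pi$ and by $\ell(\ell+1)/(2\pi)$ gives exactly one, which is why $\ell(\ell + 1) C_\ell/(2\pi)$ equals $T_0^2 \Delta_\zeta^2$ independent of $\ell$ when $\Delta_\zeta^2$ is constant.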


34.6 Radiative transfer of neutrinos 


Neutrinos decouple not at recombination, but rather after electron-positron annihilation at a redshift $1 + z \sim 10^9$.
From that point neutrinos streamed freely. The horizon distance $\eta_\nu$ at neutrino decoupling relative to
that at matter-radiation equality was $\eta_\nu/\eta_{\rm eq} \sim 10^{-5}$. As with radiation, inflation predicts that initially the
neutrino distribution was isotropic at superhorizon scales, with only a monopole mode present. But once
a mode entered the horizon, without collisions to isotropize their distribution, freely streaming neutrinos
could develop appreciable higher multipole moments, Figure 33.2. Prior to recombination, the neutrino
quadrupole provided the dominant source for the difference $\Psi - \Phi$ between the scalar potentials, Figure 33.4.
In Exercise 33.5 you discovered that the neutrino quadrupole causes a finite difference $\Psi - \Phi$ even in the
superhorizon initial conditions, equation (33.97).

Observationally accessible scales in the CMB or in the clustering of matter are large compared to the
horizon distance $\eta_\nu$ at neutrino decoupling. At such large scales, $k \eta_\nu \ll 1$, only the neutrino monopole $\mathcal{N}_0$
was present at neutrino decoupling. The neutrino analogue to the solution (34.17) of the radiative transfer
equation is then

$$ \mathcal{N}_\ell(\eta, k) + \delta_{\ell 0} \Psi(\eta, k) = \underbrace{\int_0^\eta [\dot\Psi(\eta', k) + \dot\Phi(\eta', k)] \, j_\ell[k(\eta' - \eta)] \, d\eta'}_{\rm ISW} + \underbrace{[\mathcal{N}_0(0, k) + \Psi(0, k)] \, j_\ell(-k\eta)}_{\rm monopole} , \tag{34.46} $$

which contains only Integrated Sachs-Wolfe and monopole terms. In equation (34.46), the time $\eta_\nu$ of neutrino
decoupling has been replaced by zero, and the optical depth factor $e^{-\tau}$ omitted, since the neutrino decoupling
scale is so much smaller than cosmological scales.

Equation (34.46) holds at any time $\eta$ after neutrino decoupling, as long as the neutrinos remain relativistic.
Neutrino oscillation data suggest that at least 2 of the 3 neutrino types are massive, with masses at least
0.01 eV and 0.05 eV (see §42.4.15). Such neutrinos would have become non-relativistic at a redshift of $1 + z =$
60 and 300 respectively. However, all 3 neutrino types were relativistic prior to and at recombination, when
the physics of dark matter and the photon-baryon fluid was imposing its imprint on the CMB.


34.6.1 Truncating the neutrino Boltzmann hierarchy 


The integral solution (34.46) provides one way to compute neutrino multipoles of arbitrary order. The solution 
is equivalent to solving the entire collisionless Boltzmann hierarchy of differential equations for neutrinos. 
However, it is more common for computer codes to solve for neutrino multipoles using the Boltzmann 
hierarchy truncated in a suitable fashion. The strategy of setting multipoles above some maximum harmonic 
to zero does not work well for neutrinos, because free-streaming allows neutrinos to develop higher order
multipoles comparable to the monopole and dipole. An alternative strategy for truncating the neutrino 
hierarchy, described immediately following, was proposed by Ma and Bertschinger (1995). 

Spherical Bessel functions are related by 


$$ j_\ell(y) - \frac{2\ell + 3}{y} \, j_{\ell+1}(y) + j_{\ell+2}(y) = 0 . \tag{34.47} $$


This motivates considering the combination $(N_\ell + \Psi\,\delta_{\ell 0}) + (2\ell+3)\,N_{\ell+1}/y + N_{\ell+2}$, with $y = k\eta$, of neutrino
multipoles, which has the property that the monopole term from the second line of equation (34.46) vanishes.
The ISW term on the first line of equation (34.46) gives a non-vanishing contribution to the combination,
which is, with the identity $1/y = y'/[y(y'-y)] - 1/(y'-y)$,

$$N_\ell + \delta_{\ell 0}\Psi + \frac{2\ell+3}{y}\, N_{\ell+1} + N_{\ell+2} = (2\ell+3) \int_0^y \frac{\partial[\Psi(y') + \Phi(y')]}{\partial y'}\, \frac{y'}{y}\, \frac{j_{\ell+1}(y'-y)}{y'-y}\, dy' . \tag{34.48}$$

The integrand on the right hand side of equation (34.48) is everywhere finite, and for $y \gg y'$ is of order
$y'/y^2$ times the integrand of the ISW integral in equation (34.46). In the actual case, $\Psi + \Phi$ varies rapidly
at horizon-crossing, $y' \sim 1$, but subsequently varies slowly, Figure 33.2. In this case the integral on the right
hand side of equation (34.48) is small compared to $N_{\ell+2}$ for $y \gg 1$. The integral is also small for $y < \ell+1$,
since $j_{\ell+1}(y'-y)/(y'-y) \approx (y'-y)^\ell/(2\ell+3)!!$ for $0 < y' < y < \ell+1$. The approximation that the integral
is small is better for larger harmonic number $\ell$.

If the integral on the right hand side of equation (34.48) is neglected, which becomes an increasingly good
approximation at higher $\ell$, then

$$N_{\ell+2} \approx -(N_\ell + \delta_{\ell 0}\Psi) - \frac{2\ell+3}{y}\, N_{\ell+1} . \tag{34.49}$$


Ma and Bertschinger (1995) proposed truncating the neutrino Boltzmann hierarchy by using the approximation (34.49)
at some suitably high harmonic number $\ell$. The approximation is worst around epochs where
$\Psi + \Phi$ varies rapidly, such as around horizon-crossing, $k\eta \sim 1$.
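The difference between the two truncation strategies can be illustrated with a toy free-streaming hierarchy. The sketch below is written under stated assumptions, not as a production scheme: potentials are constant (so the ISW integral vanishes), and the hierarchy is expressed directly for functions $F_\ell = j_\ell(y)$, which obey $dF_\ell/dy = [\ell F_{\ell-1} - (\ell+1) F_{\ell+1}]/(2\ell+1)$. That sign convention may differ from that of the multipoles $N_\ell$ in the text by the sign of odd multipoles. The function names, the truncation order `L = 10`, and the step size are illustrative choices.

```python
import math

def spherical_jn(ell, y, tol=1e-17, max_terms=500):
    """j_ell(y) from its ascending power series (moderate |y| only)."""
    term = 1.0
    for n in range(1, ell + 1):
        term *= y / (2 * n + 1)
    total = term
    for k in range(max_terms):
        term *= -0.5 * y * y / ((k + 1) * (2 * ell + 2 * k + 3))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def rhs(y, F, closure):
    """dF_ell/dy for ell = 0..L, with F_{L+1} supplied by the closure."""
    L = len(F) - 1
    if closure == "recurrence":
        # Bessel recurrence: F_{L+1} = (2L+1)/y F_L - F_{L-1}.
        above = (2 * L + 1) / y * F[L] - F[L - 1]
    else:
        # Naive truncation: multipoles above L set to zero.
        above = 0.0
    dF = []
    for ell in range(L + 1):
        lower = F[ell - 1] if ell > 0 else 0.0
        upper = F[ell + 1] if ell < L else above
        dF.append((ell * lower - (ell + 1) * upper) / (2 * ell + 1))
    return dF

def worst_monopole_error(closure, L=10, y0=0.5, y1=30.0, h=0.01):
    """Integrate with classical RK4 and return max |F_0 - sin(y)/y|."""
    F = [spherical_jn(ell, y0) for ell in range(L + 1)]
    y, worst = y0, 0.0
    for _ in range(int(round((y1 - y0) / h))):
        k1 = rhs(y, F, closure)
        k2 = rhs(y + h / 2, [f + h / 2 * k for f, k in zip(F, k1)], closure)
        k3 = rhs(y + h / 2, [f + h / 2 * k for f, k in zip(F, k2)], closure)
        k4 = rhs(y + h, [f + h * k for f, k in zip(F, k3)], closure)
        F = [f + h / 6 * (a + 2 * b + 2 * c + d)
             for f, a, b, c, d in zip(F, k1, k2, k3, k4)]
        y += h
        worst = max(worst, abs(F[0] - math.sin(y) / y))
    return worst

if __name__ == "__main__":
    print("zero truncation :", worst_monopole_error("zero"))
    print("Bessel closure  :", worst_monopole_error("recurrence"))
```

In this idealized case the recurrence closure is exact, so its error reflects only the integrator, whereas the zero truncation lets power reflect off the top of the hierarchy and contaminate the low multipoles once $y$ exceeds roughly the truncation order.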


34.6.2 Approximate neutrino quadrupole 
The neutrino quadrupole $N_2$ is of special interest because it is a principal source for the difference $\Psi - \Phi$ in
scalar potentials. For the quadrupole, the approximation (34.49) with $\ell = 0$ is

$$N_2 \approx -(N_0 + \Psi) - \frac{3 N_1}{k\eta} . \tag{34.50}$$

The approximation (34.50) is not adequate for precision modelling, but it provides the basis for the approximation
of neutrinos as an imperfect fluid, equation (32.11). It is a better approximation than simply
setting the neutrino quadrupole to zero, $N_2 = 0$. The approximation (34.50) leads to a second order differential
equation for the neutrino monopole, equation (32.91), which allows the behaviour of neutrinos to be
explored qualitatively, Exercise 32.7.
Exercise 34.3. Cosmic Neutrino Background.

1. Is there a Cosmic Neutrino Background? Think about whether neutrinos are relativistic or non-relativistic 
today. 

2. Suppose that one neutrino is relativistic. Calculate the power spectrum of the Cosmic Neutrino Back- 
ground for that neutrino in the approximation that the ISW contribution is negligible. 

3. What is the effect of the ISW contribution resulting from the change in the potential when neutrinos 
entered the horizon in the radiation-dominated regime? 

Solution. 

1. Neutrinos with the masses of (at least) 0.01 eV and 0.05 eV suggested by neutrino oscillation data
(see §42.4.15) would have become non-relativistic at redshifts of $1 + z \approx 60$ and 300 respectively,
whereupon they would start to cluster like dark matter and baryons, rather than continuing to stream
like cosmic background photons in more or less straight lines into astronomers’ telescopes. There remains
the possibility that one of the neutrino types may be light enough, $m_\nu \lesssim 10^{-4}$ eV, to be relativistic today.
Such a relativistic neutrino would produce a background today that is an imprint of fluctuations in the
Universe at the time of neutrino decoupling.

2. For a light, relativistic neutrino, the multipole moments of the cosmic background today are given by
equation (34.46) with $\eta = \eta_0$. Without the ISW term, only the monopole term remains,

$$N_\ell(\eta_0, k) + \delta_{\ell 0}\Psi(\eta_0, k) = [N_0(0, k) + \Psi(0, k)]\, j_\ell(k\eta_0) . \tag{34.51}$$

The initial value is the superhorizon result

$$N_0(0, k) + \Psi(0, k) = \Psi(0) + \Phi(0) + \zeta_\nu . \tag{34.52}$$

The neutrino power spectrum is proportional to the photon Sachs-Wolfe power spectrum (34.43), with
constant of proportionality

$$\frac{C_\ell^{\nu}}{C_\ell^{\rm SW}} = \left( \frac{\Psi(0) + \Phi(0) + \zeta_\nu}{\Psi_{\rm super}(\eta_{\rm rec}) + \Phi_{\rm super}(\eta_{\rm rec}) + \zeta_\gamma} \right)^2 . \tag{34.53}$$


In the approximation that recombination is in the matter-dominated regime (which is not quite true), 
and the scalar potentials are equal (which is not quite true thanks to neutrinos), the potentials at 
recombination are approximately the late time potentials given by equations (30.68), so 


v 2 
cy” x (= pete) (34.54) 
OPAN E ie + Gy 


Inflation generically predicts adiabatic fluctuations with the $\zeta$'s of all species the same, in which case


cP 5 2 
aa N a 34.55 
ow a) cone 
3. An ISW effect results from the change in the potential at horizon-crossing for modes that entered the
horizon during the radiation-dominated era. The potential $\Psi(y) + \Phi(y)$ is a universal function of $y = k\eta$
during horizon-crossing in the radiation-dominated era, independent of $k$. The ISW integral yields a
result that looks like the spherical Bessel function of the monopole contribution on the second line of
equation (34.46), but with a different amplitude and phase. The net result is a power spectrum that
again looks like the Sachs-Wolfe power spectrum, but with a (somewhat) different amplitude than the
large-scale power spectrum, whose modes entered the horizon in the matter-dominated regime.


34.7 Appendix: Integrals over spherical Bessel functions 


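Two useful integrals over spherical Bessel functions are

$$U_\ell(z) \equiv \int_0^\infty j_\ell(y)\, y^z\, \frac{dy}{y} = \frac{2^{z-2} \sqrt{\pi}\; \Gamma[\tfrac{1}{2}(\ell+z)]}{\Gamma[\tfrac{1}{2}(\ell+3-z)]} , \tag{34.56a}$$

$$U_{\ell\ell'}(z) \equiv \int_0^\infty j_\ell(y)\, j_{\ell'}(y)\, y^z\, \frac{dy}{y} = \frac{2^{z-3} \pi\, \Gamma(2-z)\, \Gamma[\tfrac{1}{2}(\ell+\ell'+z)]}{\Gamma[\tfrac{1}{2}(\ell+\ell'+4-z)]\, \Gamma[\tfrac{1}{2}(\ell-\ell'+3-z)]\, \Gamma[\tfrac{1}{2}(\ell'-\ell+3-z)]} , \tag{34.56b}$$

where $\Gamma(z)$ is the Gamma function. The integrals satisfy the recurrence relations

$$U_\ell(z) = (\ell-2+z)\, U_{\ell-1}(z-1) = \frac{\ell-2+z}{\ell+1-z}\, U_{\ell-2}(z) , \tag{34.57a}$$

$$U_{\ell\ell'}(z) = \frac{\ell+\ell'-2+z}{\ell+\ell'+2-z}\, U_{\ell-1,\ell'-1}(z) = \frac{(\ell+\ell'-2+z)(\ell'-\ell+3-z)}{2(2-z)}\, U_{\ell-1,\ell'}(z-1) = \frac{(\ell+\ell'-2+z)(\ell-\ell'-3+z)}{(\ell+\ell'+2-z)(\ell'-\ell-1+z)}\, U_{\ell-2,\ell'}(z) . \tag{34.57b}$$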
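The Gamma-function closed forms lend themselves to a quick consistency check. The following Python sketch, using only the standard library, evaluates the closed forms as written in its docstrings (with the measure $y^z\,dy/y$) and checks two special cases that can be done by hand, $\int_0^\infty j_0\,dy = \pi/2$ and $\int_0^\infty j_\ell^2\,dy/y = 1/[2\ell(\ell+1)]$. The function names `U1` and `U2` are illustrative; no attempt is made to test convergence of the integrals themselves.

```python
import math

def U1(ell, z):
    """Closed form of int_0^inf j_ell(y) y^z dy/y:
    2^(z-2) sqrt(pi) Gamma[(ell+z)/2] / Gamma[(ell+3-z)/2]."""
    return (2.0 ** (z - 2) * math.sqrt(math.pi)
            * math.gamma(0.5 * (ell + z)) / math.gamma(0.5 * (ell + 3 - z)))

def U2(ell, ellp, z):
    """Closed form of int_0^inf j_ell(y) j_ellp(y) y^z dy/y."""
    num = (2.0 ** (z - 3) * math.pi * math.gamma(2 - z)
           * math.gamma(0.5 * (ell + ellp + z)))
    den = (math.gamma(0.5 * (ell + ellp + 4 - z))
           * math.gamma(0.5 * (ell - ellp + 3 - z))
           * math.gamma(0.5 * (ellp - ell + 3 - z)))
    return num / den

if __name__ == "__main__":
    print(U1(0, 1.0), math.pi / 2)      # integral of sin(y)/y
    print(U2(3, 3, 0.0), 1.0 / 24.0)    # 1 / [2 ell (ell + 1)] at ell = 3
    ell, ellp, z = 4, 3, 0.5
    # Two-step recurrence in ell at fixed z.
    print(U1(ell, z), (ell - 2 + z) / (ell + 1 - z) * U1(ell - 2, z))
    # Simultaneous step down in both indices.
    print(U2(ell, ellp, z),
          (ell + ellp - 2 + z) / (ell + ellp + 2 - z) * U2(ell - 1, ellp - 1, z))
```

Each printed pair should agree to rounding error.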
Cosmological perturbations including polarization


Well before recombination, frequent collisions drive photons into thermodynamic equilibrium. In thermody- 
namic equilibrium, the photon distribution is unpolarized. But, as will be seen in §35.10, photons scattering 
off electrons become linearly polarized. The CMB bears the imprint of polarization generated near the surface 
of last scattering. 

Polarization produces distinct E-mode (electric parity) and B-mode (magnetic parity) fluctuations, §35.6.2. 
The B-mode fluctuation can be generated only by vector or tensor, not scalar, gravitational potential fluc- 
tuations. The B-mode polarization has opposite parity to, and can thereby be observationally distinguished 
from, the much stronger unpolarized and polarized scalar fluctuations. Thus the B-mode polarization pro- 
vides a clean window to gravitational waves generated during inflation in the very early Universe. A detection 
of B-mode polarization was initially claimed by the BICEP2 collaboration (Ade et al., 2014), but subsequent 
cross-comparison between BICEP2 and Planck data suggests that the detected polarization may have been 
a galactic foreground from dust aligned by the galactic magnetic field (Ade et al., 2015). If a cosmological 
signal of B-mode polarization is detected in the future, it would present a remarkable observation of physics 
at near-Planck energies far exceeding those accessible in earthly particle accelerators. 


35.1 Photon polarization 


Photons have spin one. They have two distinct spin eigenstates, or polarizations, transverse to the photon
direction of motion. A general spin eigenstate of a photon is a complex linear combination of the two spin
states. Any pair of transverse spin states can be chosen as a basis. If the photon direction of motion is along
the 3-direction (z-direction), then the two basis spin states can for example be taken to be linear polarizations
$\boldsymbol{\gamma}_1$ and $\boldsymbol{\gamma}_2$ along the 1- and 2-directions (x- and y-directions) transverse to the 3-direction. An elegant choice
of basis spin states are right- and left-circular polarizations $\boldsymbol{\gamma}_+ = (\boldsymbol{\gamma}_1 + i\boldsymbol{\gamma}_2)/\sqrt{2}$ and $\boldsymbol{\gamma}_- = (\boldsymbol{\gamma}_1 - i\boldsymbol{\gamma}_2)/\sqrt{2}$,
equations (39.1), in which the spin is respectively aligned (+) and anti-aligned (−) with the photon direction
of motion $\boldsymbol{\gamma}_3$. The condition of right- or left-circular polarization, aligned or anti-aligned with the direction
of motion, is Lorentz invariant, unchanged by any Lorentz transformation. The general spin eigenstate of
a photon is described by a complex polarization vector $\boldsymbol{a}$, a complex linear combination of right- and
left-handed eigenstates,

$$\boldsymbol{a} = a^a \boldsymbol{\gamma}_a = a^+ \boldsymbol{\gamma}_+ + a^- \boldsymbol{\gamma}_- . \tag{35.1}$$


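The polarization vector $\boldsymbol{a}$ is transverse, that is, it is orthogonal both to the time axis $\boldsymbol{\gamma}_0$ and to the direction
$\boldsymbol{\gamma}_3$ of the photon’s motion,

$$\boldsymbol{a} \cdot \boldsymbol{\gamma}_0 = \boldsymbol{a} \cdot \boldsymbol{\gamma}_3 = 0 . \tag{35.2}$$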


According to the rules of quantum mechanics, the squared amplitude is the probability of the photon,
which for a single photon is one,

$$|a^+|^2 + |a^-|^2 = 1 . \tag{35.3}$$

The squared individual amplitudes $|a^+|^2$ and $|a^-|^2$ of the polarization vector (35.1) represent the probabilities
of observing the photon to have polarization $\boldsymbol{\gamma}_+$ or $\boldsymbol{\gamma}_-$. For example, if a photon with polarization $\boldsymbol{a}$ is sent
through a right-circularly polarized filter, then the photon will be transmitted with probability $|a^+|^2$, and
the transmitted photon will then be 100% right-circularly polarized. The total probability of the spin states
of the photon is one, equation (35.3). There is a Lorentz-invariant operation of conjugation that leaves
orthonormal vectors unchanged, equation (39.112), but flips the spin indices $+ \leftrightarrow -$, equation (39.113). The
conjugate $\bar{\boldsymbol{a}}$ of the polarization vector is

$$\bar{\boldsymbol{a}} = a^{+*} \bar{\boldsymbol{\gamma}}_+ + a^{-*} \bar{\boldsymbol{\gamma}}_- = a^{+*} \boldsymbol{\gamma}_- + a^{-*} \boldsymbol{\gamma}_+ . \tag{35.4}$$

The normalization condition (35.3) can then be written

$$\bar{\boldsymbol{a}} \cdot \boldsymbol{a} = 1 . \tag{35.5}$$

More generally, if a photon has polarization $\boldsymbol{a}$, then the probability $P_{a'}$ of observing it to have polarization
$\boldsymbol{a}'$ is, by the rules of quantum mechanics,

$$P_{a'} = |\bar{\boldsymbol{a}}' \cdot \boldsymbol{a}|^2 . \tag{35.6}$$


A photon in a pure $\boldsymbol{\gamma}_+$ eigenstate (i.e. with polarization vector $e^{-i\phi}\boldsymbol{\gamma}_+$, where $e^{-i\phi}$ is some arbitrary phase
factor) is right-circularly polarized, while a photon in a pure $\boldsymbol{\gamma}_-$ eigenstate is left-circularly polarized. A photon
that is a superposition of equal magnitudes of right- and left-circular polarizations is said to be linearly
polarized. For example a photon with polarization vector $\boldsymbol{a}_1 = e^{-i\phi}(\boldsymbol{\gamma}_+ + \boldsymbol{\gamma}_-)/\sqrt{2} = e^{-i\phi}\boldsymbol{\gamma}_1$ is linearly polarized
in the 1-direction (x-direction), while a photon with polarization vector $\boldsymbol{a}_2 = e^{-i\phi}(\boldsymbol{\gamma}_+ - \boldsymbol{\gamma}_-)/(\sqrt{2}\, i) = e^{-i\phi}\boldsymbol{\gamma}_2$
is linearly polarized in the 2-direction (y-direction). More generally, a photon with polarization vector

$$\boldsymbol{a}_\chi = e^{-i\phi}\, \frac{e^{-i\chi} \boldsymbol{\gamma}_+ + e^{i\chi} \boldsymbol{\gamma}_-}{\sqrt{2}} \tag{35.7}$$

is linearly polarized along a direction rotated right-handedly by angle $\chi$ from the 1-axis. Polarization angles
$\chi = 0$ and $\pi/2$ correspond to photons linearly polarized along respectively the 1- and 2-directions. A polarization
angle of $\chi = \pi$ flips the sign of $\boldsymbol{a}_\chi$, equivalent to changing its phase $\phi$, so the polarization angle $\chi$ is
determined only modulo $\pi$.
The most general polarization vector of a photon is elliptically polarized, a superposition of unequal
non-zero magnitudes of right and left polarizations,

$$\boldsymbol{a}_{\chi\epsilon} = e^{-i\phi} \left( e^{-i\chi} \cos\epsilon\; \boldsymbol{\gamma}_+ + e^{i\chi} \sin\epsilon\; \boldsymbol{\gamma}_- \right) . \tag{35.8}$$

The elliptic angle $\epsilon$ varies from $\epsilon = 0$ for pure right-circular polarization, to $\epsilon = \pi/4$ for linear polarization,
to $\epsilon = \pi/2$ for pure left-circular polarization. The polarization angle $\chi$ is the angle by which the polarization
ellipse is rotated right-handedly from the 1-axis.

As is usual in quantum mechanics, the phase $e^{-i\phi}$ of a polarization vector $\boldsymbol{a}$ is by itself unobservable; only
probabilities (35.6) are observable.
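The probability rule (35.6) can be made concrete with a few lines of Python. The sketch below assumes that a polarization vector is stored as its pair of components $(a^+, a^-)$ in the circular basis, so that the conjugate dot product reduces to the ordinary Hermitian inner product; the function names are illustrative.

```python
import cmath
import math

def elliptical(chi, eps, phi=0.0):
    """Components (a+, a-) of an elliptically polarized photon:
    e^{-i phi} (e^{-i chi} cos(eps), e^{+i chi} sin(eps))."""
    phase = cmath.exp(-1j * phi)
    return (phase * cmath.exp(-1j * chi) * math.cos(eps),
            phase * cmath.exp(1j * chi) * math.sin(eps))

def linear(chi, phi=0.0):
    """Linear polarization at angle chi from the 1-axis (eps = pi/4)."""
    return elliptical(chi, math.pi / 4, phi)

def probability(a_filter, a):
    """Probability |conj(a_filter) . a|^2 that a photon with polarization a
    is found to have polarization a_filter."""
    amp = a_filter[0].conjugate() * a[0] + a_filter[1].conjugate() * a[1]
    return abs(amp) ** 2

RIGHT = elliptical(0.0, 0.0)          # pure right-circular
LEFT = elliptical(0.0, math.pi / 2)   # pure left-circular

if __name__ == "__main__":
    chi = 0.3
    # A linearly polarized photon passes a circular filter half the time.
    print(probability(RIGHT, linear(chi)))
    # Two linear polarizers at relative angle chi: cos^2(chi).
    print(probability(linear(0.0), linear(chi)), math.cos(chi) ** 2)
    # The overall phase phi drops out of every probability.
    print(probability(linear(0.0), linear(chi, phi=1.234)))
```

The second line of output reproduces the familiar $\cos^2\chi$ law for crossed linear polarizers, and the third shows that the overall phase is unobservable.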


Concept question 35.1. Relation of the polarization vector to the electromagnetic potential. 
How is the polarization vector a related to the electromagnetic potential A? Answer. As discussed in 
§27.6, the gauge freedom of electromagnetism means that only 3 of the 4 components of the electromagnetic 
potential A are gauge-invariant, equations (27.37), and only the 2 vector (i.e. transverse) components A] of 
the electromagnetic potential describe propagating waves, equation (27.40). Plane-wave solutions propagating 
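in the $\boldsymbol{\gamma}_3$-direction (z-direction) are functions $\boldsymbol{A}(t - z)$ with $\boldsymbol{A}$ transverse. The associated electric and
magnetic fields are, equations (27.38),

$$\boldsymbol{E} = -\dot{\boldsymbol{A}} , \quad \boldsymbol{B} = \boldsymbol{\gamma}_3 \times \boldsymbol{E} . \tag{35.9}$$

The electric and magnetic fields of a propagating wave are transverse and orthogonal to each other. In
quantum field theory, §??, the electromagnetic potential $\boldsymbol{A}$ is fundamentally complex. Monochromatic waves
of positive angular frequency $\omega$ propagating forwards in time in the z-direction in Minkowski space are
described by a complex potential

$$\boldsymbol{A} = \boldsymbol{A}_0\, e^{-i\omega(t-z)} , \tag{35.10}$$

where $\boldsymbol{A}_0$ is a constant complex transverse vector. The mean-squared potential is

$$\langle \bar{\boldsymbol{A}} \cdot \boldsymbol{A} \rangle = \bar{\boldsymbol{A}}_0 \cdot \boldsymbol{A}_0 = A^2 . \tag{35.11}$$

The polarization vector $\boldsymbol{a}$, which has unit magnitude, equation (35.3), equals the constant potential $\boldsymbol{A}_0$
scaled to unit magnitude, modulo a possible phase factor,

$$\boldsymbol{a} = \frac{\boldsymbol{A}_0}{A} \quad \text{mod phase} . \tag{35.12}$$

In classical electromagnetism it is possible and conventional to work with real quantities only, since the
phase of the electromagnetic potential $\boldsymbol{A}$ is by itself unobservable. That is, although Maxwell’s equations
admit complex wave solutions such as (35.10), classical electromagnetism does not require them. In classical
electromagnetism, a real wave is the real part of a complex wave (35.10), multiplied by $\sqrt{2}$ to get the
mean-squared amplitude right. Thus in classical electromagnetism real linearly polarized waves oscillating
in respectively the 1- and 2-directions are

$$\boldsymbol{A}_{\substack{1 \\ 2}} = \sqrt{2}\, A\, {\rm Re}\!\left( \boldsymbol{\gamma}_{\substack{1 \\ 2}}\, e^{-i\omega(t-z)} \right) = \sqrt{2}\, A\, \boldsymbol{\gamma}_{\substack{1 \\ 2}} \cos\omega(t-z) , \tag{35.13}$$

while real right- and left-circularly polarized waves are

$$\boldsymbol{A}_\pm = \sqrt{2}\, A\, {\rm Re}\!\left( \boldsymbol{\gamma}_\pm\, e^{-i\omega(t-z)} \right) = A\, {\rm Re}\!\left( (\boldsymbol{\gamma}_1 \pm i\boldsymbol{\gamma}_2)\, e^{-i\omega(t-z)} \right) = A \left( \boldsymbol{\gamma}_1 \cos\omega(t-z) \pm \boldsymbol{\gamma}_2 \sin\omega(t-z) \right) . \tag{35.14}$$

Note that the “complex conjugate” of $\boldsymbol{\gamma}_m$ in this context strictly refers to the Lorentz-invariant conjugate
$\bar{\boldsymbol{\gamma}}_m$, equation (39.112), with respect to which $\boldsymbol{\gamma}_1$ and $\boldsymbol{\gamma}_2$ are both real. For each of the linearly polarized
waves (35.13), or circularly polarized waves (35.14), the mean-squared potential is

$$\langle \boldsymbol{A}_{\substack{1 \\ 2}}^2 \rangle = 2 A^2 \langle \cos^2\!\omega t \rangle = A^2 , \quad \langle \boldsymbol{A}_\pm^2 \rangle = A^2 . \tag{35.15}$$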
2 


Evidently, when dealing with polarization, it is simpler to work with complex (quantum mechanical) waves 
than with real (classical) waves. 


35.2 Photon density matrix 


It is necessary to distinguish between photons in mixed states and mixtures of photons in different states. For 
example, a system consisting of photons all in a linearly polarized state (35.7) is not the same as a mixture 
of purely right-handed and purely left-handed photons. The systems can be distinguished experimentally by 
passing the photons through polarizers. 

To deal with these distinctions, a statistical ensemble of photons in various polarization states must be 
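described by a density matrix. Suppose that the system consists of photons in pure polarization states $\boldsymbol{a}$
with real occupation numbers $f(\boldsymbol{a})$. Then the density matrix $\boldsymbol{f}$ may be defined by the tensor

$$\boldsymbol{f} = \sum_{\text{photons } a} f(\boldsymbol{a})\; \boldsymbol{a} \otimes \bar{\boldsymbol{a}} . \tag{35.16}$$

In any basis $\boldsymbol{\gamma}_a$, the density matrix is

$$\boldsymbol{f} = \sum_{ab} f^{a\bar{b}}\; \boldsymbol{\gamma}_a \otimes \boldsymbol{\gamma}_{\bar{b}} , \tag{35.17}$$

with components

$$f^{a\bar{b}} = \sum_{\text{photons } a} f(\boldsymbol{a})\, a^a a^{b*} . \tag{35.18}$$

A conjugated index $\bar{b}$ signifies the index of the conjugated vector, $\boldsymbol{\gamma}_{\bar{b}} = \bar{\boldsymbol{\gamma}}_b$. For orthonormal vectors $\boldsymbol{\gamma}_1$ and
$\boldsymbol{\gamma}_2$, conjugated vectors are themselves, so $\bar{1} = 1$ and $\bar{2} = 2$; while for chiral vectors $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$, conjugation
flips spin, so $\bar{+} = -$ and $\bar{-} = +$. The density matrix is Hermitian,

$$(f^{a\bar{b}})^* = f^{b\bar{a}} . \tag{35.19}$$

Its trace is

$$\sum_a f^{a\bar{a}} = \sum_{\text{photons } a} f(\boldsymbol{a})\; \bar{\boldsymbol{a}} \cdot \boldsymbol{a} = \sum_{\text{photons } a} f(\boldsymbol{a}) , \tag{35.20}$$

which counts the total number of photons.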
If the system of photons is measured along the polarization direction a, then the occupation number of 
photons with that polarization will be found to be, in accordance with equation (35.6), 


f(a) = fr aga; . (35.21) 


The conjugate f of the density matrix f is 


f= So flaaea= > f yarn, (35.22) 
photons a ab 
whose components are 
f= X fae (35.23) 
photons a 


35.2.1 Physical interpretation of the photon density matrix 


Since the complex 2 × 2 photon density matrix $f^{ab}$ is Hermitian, equation (35.19), it is diagonalizable with
2 real eigenvalues. The form (35.16) of the density matrix ensures that the matrix is positive semi-definite, that
is, its eigenvalues are both non-negative. If only one eigenvalue is positive, and the other is zero, then the
density matrix represents a pure state. The most general pure state consists of photons all in the same (in
general elliptically polarized) state. The most general impure state is equivalent to a mixture of photons in
two orthogonal (in general elliptically polarized) states. In thermodynamic equilibrium, the two eigenvalues
are equal, and the density matrix describes a mixture of equal numbers of photons in any pair of orthogonal
states.

With respect to a circularly polarized (Newman-Penrose) basis $\boldsymbol{\gamma}_\pm$, the 2 × 2 density matrix $\boldsymbol{f}$ is

$$f^{ab} = \begin{pmatrix} f^{+-} & f^{++} \\ f^{--} & f^{-+} \end{pmatrix} . \tag{35.24}$$

The components $f^{ab}$ comprise two real scalar (spin 0) components $f^{+-}$ and $f^{-+}$, and a complex tensor
(spin 2) component $f^{++} = (f^{--})^*$. The trace of the density matrix (35.24) counts the total number of
photons, equation (35.20). The unpolarized scalar occupation number $f$ defined earlier, equation (31.28),
equals one when there is one photon in either of the two polarization states, so the trace equals twice the
unpolarized occupation number $f$,

$$\sum_a f^{a\bar{a}} = 2f = \sum_{\text{photons } a} f(\boldsymbol{a}) . \tag{35.25}$$
Conjugation flips chiral indices $+ \leftrightarrow -$, so the components of the conjugate density matrix are


fe ( D a ) (35.26) 


Note that conjugation here signifies the Lorentz-invariant operation described in §39.7.4, which is related to,
but not the same as, complex conjugation. In particular, the components $f^{+-}$ and $f^{-+}$ are conjugates of
each other, but they are nevertheless real.

The spin-0 component $f^{+-}$ (the coefficient of $\boldsymbol{\gamma}_+ \otimes \bar{\boldsymbol{\gamma}}_+$) measures the intensity of right-circularly polarized
light, while the other spin-0 component $f^{-+}$ (the coefficient of $\boldsymbol{\gamma}_- \otimes \bar{\boldsymbol{\gamma}}_-$) measures the intensity of left-circularly
polarized light. The sum $2f = f^{+-} + f^{-+}$ measures the total intensity of light in both polarizations,
while the difference $f^{+-} - f^{-+}$ measures the net circularly polarized intensity, the excess of right- over
left-circularly polarized intensity.

The complex spin-2 component $f^{++}$ measures the degree of linear polarization of the light. A photon
linearly polarized in the direction $\chi$, equation (35.7), contributes a density matrix

$$\boldsymbol{a}_\chi \otimes \bar{\boldsymbol{a}}_\chi = \tfrac{1}{2} \left( \boldsymbol{\gamma}_+ \otimes \boldsymbol{\gamma}_- + \boldsymbol{\gamma}_- \otimes \boldsymbol{\gamma}_+ \right) + \tfrac{1}{2} \left( e^{-2i\chi}\, \boldsymbol{\gamma}_+ \otimes \boldsymbol{\gamma}_+ + e^{2i\chi}\, \boldsymbol{\gamma}_- \otimes \boldsymbol{\gamma}_- \right) , \tag{35.27}$$

whose components are

$$f^{ab} = \frac{1}{2} \begin{pmatrix} 1 & e^{-2i\chi} \\ e^{2i\chi} & 1 \end{pmatrix} . \tag{35.28}$$

The trace of the density matrix is one, which is as it should be for a single photon. Twice the amplitude of
$f^{++}$ gives the degree of linear polarization, which here is one (100% linearly polarized), while the phase
of $f^{++}$, here $e^{-2i\chi}$, measures the angle $\chi$ by which the direction of polarization is rotated right-handedly from the 1-axis
(x-axis).

In the cosmological case under consideration, Thomson scattering generates linear but not circular polarization.
In this case the photon density matrix is

$$f^{ab} = \begin{pmatrix} f & f^{++} \\ f^{--} & f \end{pmatrix} , \tag{35.29}$$

with $f$ the unpolarized occupation number. The equality of the diagonal elements of the density matrix,
$f^{+-} = f^{-+} = f$, expresses the absence of circular polarization.


Concept question 35.2. Elliptically polarized light. Can a beam of elliptically polarized light be 
distinguished from a sum of beams of linearly polarized and circularly polarized light? Answer. Yes. A 
beam containing photons all in the same elliptically polarized state is in a pure state, which is not equivalent 
to any sum of beams of different polarizations. In a beam in a pure state, 100% of the photons will pass 
through a matched filter, whereas in a beam in a mixed state some photons will be passed through and some 
will be absorbed by the matched filter. 
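The distinction between a pure state and a mixture can be checked directly on small density matrices. The following Python sketch assumes photons are given as pairs of an occupation number and circular-basis components $(a^+, a^-)$, and stores the matrix with rows and columns ordered so that the diagonal holds the right- and left-circular intensities; the helper names are illustrative.

```python
import math

def density_matrix(photons):
    """2x2 matrix sum_n n * a_i * conj(a_j) for photons given as
    (occupation number, (a_plus, a_minus))."""
    f = [[0j, 0j], [0j, 0j]]
    for n, a in photons:
        for i in range(2):
            for j in range(2):
                f[i][j] += n * a[i] * a[j].conjugate()
    return f

def eigenvalues(f):
    """Real eigenvalues of a 2x2 Hermitian matrix, smallest first."""
    tr = (f[0][0] + f[1][1]).real
    det = (f[0][0] * f[1][1] - f[0][1] * f[1][0]).real
    disc = math.sqrt(max(0.25 * tr * tr - det, 0.0))
    return (0.5 * tr - disc, 0.5 * tr + disc)

def occupation_along(f, a):
    """Occupation number found on measuring along polarization a."""
    return sum(a[i].conjugate() * f[i][j] * a[j]
               for i in range(2) for j in range(2)).real

S = 1.0 / math.sqrt(2.0)
RIGHT, LEFT = (1 + 0j, 0j), (0j, 1 + 0j)
LINEAR_1 = (S + 0j, S + 0j)        # (gamma_+ + gamma_-)/sqrt(2)
LINEAR_2 = (-1j * S, 1j * S)       # (gamma_+ - gamma_-)/(sqrt(2) i)

if __name__ == "__main__":
    pure = density_matrix([(2.0, LINEAR_1)])              # two photons, same state
    mixed = density_matrix([(1.0, RIGHT), (1.0, LEFT)])   # one of each helicity
    print(eigenvalues(pure), eigenvalues(mixed))
    # A linear polarizer along the 2-axis blocks the pure beam entirely
    # but passes half of the mixture.
    print(occupation_along(pure, LINEAR_2), occupation_along(mixed, LINEAR_2))
```

Both ensembles contain two photons, yet the pure beam has eigenvalues 0 and 2 while the mixture has two equal eigenvalues, and a crossed linear polarizer tells them apart.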
35.2.2 Relation to Stokes parameters


The 4 components of the polarization density matrix $f^{ab}$ are related to the 4 conventional real Stokes
parameters $I$, $V$, $Q$, and $U$ by

$$2f = f^{+-} + f^{-+} = I , \tag{35.30a}$$
$$f^{+-} - f^{-+} = V , \tag{35.30b}$$
$$f^{++} + f^{--} = Q , \tag{35.30c}$$
$$f^{++} - f^{--} = iU . \tag{35.30d}$$

The Stokes parameters here are normalized so that the total intensity $I$ measures the total occupation
number $2f$, the trace of the density matrix. Stokes parameters can be normalized in other ways, whatever
may be convenient. Some of the ways that astronomers normalize intensity are described in the paragraph
containing equation (1.80).
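As a small illustration, the intensity, circular, and $Q$ Stokes parameters can be read off a single-photon density matrix. The Python sketch below assumes the matrix layout with the right- and left-circular intensities on the diagonal and the spin-2 components off the diagonal; $U$ is deliberately left out because its sign depends on the convention adopted for the off-diagonal difference. The function names are illustrative.

```python
import cmath
import math

def single_photon_matrix(a):
    """Density matrix a_i conj(a_j) of one photon with components (a+, a-)."""
    return [[a[i] * a[j].conjugate() for j in range(2)] for i in range(2)]

def stokes_ivq(f):
    """Return (I, V, Q, degree of linear polarization) from a 2x2 matrix
    whose diagonal holds right/left-circular intensities."""
    I = (f[0][0] + f[1][1]).real      # total intensity, the trace
    V = (f[0][0] - f[1][1]).real      # right minus left circular intensity
    Q = (f[0][1] + f[1][0]).real      # sum of the two spin-2 components
    lin = 2.0 * abs(f[0][1]) / I      # twice the spin-2 amplitude over I
    return I, V, Q, lin

def linear(chi):
    """Linearly polarized photon at angle chi from the 1-axis."""
    s = 1.0 / math.sqrt(2.0)
    return (s * cmath.exp(-1j * chi), s * cmath.exp(1j * chi))

if __name__ == "__main__":
    chi = 0.3
    print(stokes_ivq(single_photon_matrix(linear(chi))), math.cos(2 * chi))
    print(stokes_ivq(single_photon_matrix((1 + 0j, 0j))))   # right-circular
```

For the linearly polarized photon the sketch returns $I = 1$, $V = 0$, $Q = \cos 2\chi$ and a degree of linear polarization of one; for the circularly polarized photon it returns $V = 1$ with no linear polarization.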


35.3 Temperature fluctuation for polarized photons 


Previously, the perturbation to the unpolarized scalar occupation number $f$ was expressed in terms of
the temperature fluctuation, $\Theta = \delta T/T$, equation (33.38). The temperature fluctuation $\boldsymbol{\Theta} = \Theta^{ab}\, \boldsymbol{\gamma}_a \otimes \boldsymbol{\gamma}_b$
including polarization can be defined similarly in terms of the density matrix $f^{ab}$, equation (35.18), which
is the generalization of the occupation number to include polarization,

$$\delta f^{ab} = \frac{\partial f}{\partial \ln T}\, \Theta^{ab} . \tag{35.31}$$

The trace of $\Theta^{ab}$ is twice the scalar temperature fluctuation, $\sum_a \Theta^{a\bar{a}} = 2\Theta$. The trace-free part of $\Theta^{ab}$
describes the polarized temperature fluctuation.

It is conventional in cosmology, and elsewhere in physics, to signify the components of the photon density
matrix using a spin index $s$, positioned to the left of the symbol to distinguish it from harmonic indices $\ell m$,

$${}_{+0}\Theta \equiv \Theta^{+-} , \quad {}_{-0}\Theta \equiv \Theta^{-+} , \quad {}_{+2}\Theta \equiv \Theta^{++} , \quad {}_{-2}\Theta \equiv \Theta^{--} . \tag{35.32}$$

In the cosmological situation being considered, Thomson scattering generates linear but not circular polarization,
with the consequence that the two spin 0 components ${}_{\pm 0}\Theta$ are equal, and equal to the unpolarized
temperature fluctuation $\Theta$,

$${}_{+0}\Theta = {}_{-0}\Theta = \Theta . \tag{35.33}$$

The spin index $s$ signifies how the polarized temperature fluctuation varies, and is to be distinguished from
harmonic indices $\ell m$. A temperature fluctuation ${}_s\Theta_{\ell m}$, equation (35.37), is the coefficient of an eigenmode
that varies as $D_{\ell m s} \propto e^{-is\chi}$ under a right-handed rotation by angle $\chi$ about the photon’s direction $\hat{\boldsymbol{p}}$
of motion, and as $D_{\ell m s} \propto e^{-im\phi}$ under a right-handed rotation by angle $\phi$ about the direction $\hat{\boldsymbol{k}}$ of the
wavevector of the fluctuation.
35.4 Summary of equations including polarization


This section summarizes the coupled Boltzmann and Einstein equations needed to compute linear cosmo- 
logical fluctuations including photon polarization. 

Polarization involves not only scalar ($m = 0$) but also vector ($m = \pm 1$) and tensor ($m = \pm 2$) fluctuations
${}_s\Theta_{\ell m}$. The hierarchies of Boltzmann and Einstein equations for different $m$ are decoupled from each other, so
scalar, vector, and tensor equations may be calculated separately. Symmetry between positive and negative
$m$ means that in practice equations need be solved only for non-negative $m$ = 0, 1, and 2. Vector ($m = 1$) modes
are commonly treated as being negligible, for the reasons given at the end of this section. Thomson scattering
couples unpolarized $\Theta_{\ell m}$ and electric polarized $E_{\ell m}$ photon multipoles, equations (35.68c) and (35.68d). The
polarized photon Boltzmann equations (35.45b) and (35.45c) couple the electric $E_{\ell m}$ and magnetic $B_{\ell m}$
parts of the polarized multipoles.

The Boltzmann equations for nonbaryonic dark matter and for baryons are equations (35.49) and (35.48), 
generalizing the scalar matter equations (33.1) and (33.2). 

The Boltzmann equations for polarized photons are given by equations (35.45), with gravitational redshift
source terms $G_{\ell m}$ (not to be confused with the Einstein tensor) given by equations (35.41), and Thomson-scattering
collision terms $C[\Theta_{\ell m}]$, $C[E_{\ell m}]$, and $C[B_{\ell m}]$ given by equations (35.68). These generalize the
scalar Boltzmann equations (33.81) for unpolarized photons.

The Boltzmann equations for neutrinos are equations (35.47), generalizing the scalar neutrino equa- 
tions (33.91). 

The Boltzmann hierarchies for photons and neutrinos may be truncated as described in §35.10.1. 

Scalar, vector, and tensor Einstein equations are equations (33.7), (35.52) and (35.53). 

Vector and tensor gravitational potentials $W_\pm$ and $h_{\pm\pm}$ are in general complex (with $W_- = W_+^*$ and
$h_{--} = h_{++}^*$). Linear vector and tensor fluctuations (of all species) are proportional to the initial amplitudes
$W_+(0)$ and $h_{++}(0)$. Therefore in numerical calculations the initial amplitudes $W_+(0)$ and $h_{++}(0)$ can be taken
to be real, any phase factor being absorbed into a normalization factor. The phase factor cancels in power
spectra, equation (36.25). If the initial amplitudes $W_+(0)$ and $h_{++}(0)$ are real, then the coupled Boltzmann
and Einstein equations ensure that $W_+$ and $h_{++}$ and the photon multipoles $\Theta_{\ell m}$, $E_{\ell m}$, and $B_{\ell m}$ remain real,
as do matter and neutrino multipoles. Since the polarized photon multipoles $E_{\ell m}$ and $B_{\ell m}$ are real, it can
be convenient numerically to combine them into the complex polarized multipoles ${}_2\Theta_{\ell m} = E_{\ell m} + iB_{\ell m}$, and
to solve a complex polarized Boltzmann equation whose left hand side is the complex expression (35.39).
Thomson scattering couples the unpolarized fluctuation $\Theta_{\ell m}$ only to the electric part, that is, the real part,
of the polarized fluctuation ${}_2\Theta_{\ell m} = E_{\ell m} + iB_{\ell m}$.

Collisions (before neutrino decoupling in the case of neutrinos, and before recombination in the case
of photons) tend to drive initial vector ($|m| = 1$) and tensor ($|m| = 2$) multipoles of all particle species
to zero, §35.11. Vector gravitational fluctuations $W_\pm$ tend to redshift to zero, equation (29.51), so vector
fluctuations of all species are expected to be negligible. On the other hand, tensor gravitational fluctuations
$h_{\pm\pm}$ (gravitational waves) generated during inflation survive to low redshift, equation (29.53), and drive tensor
fluctuations in collisionless relativistic species, first neutrinos, and then photons near and after recombination.
Exercise 35.3. Boltzmann code including polarization. Upgrade the code you wrote in Exercise 33.1
to implement polarization. Read the summary section 35.4 above for guidance. 


35.5 Boltzmann equations for polarized photons 


Whereas the unpolarized occupation number $f$ is a scalar, the polarized occupation number $f^{ab}$ is a tensor.
The directed derivative $\partial_m f$ on the left hand side of the Boltzmann equation (33.8) should therefore be
replaced by the covariant derivative $D_m f^{ab}$,

$$D_m f^{ab} = \partial_m f^{ab} + \Gamma^a_{cm} f^{cb} + \Gamma^b_{cm} f^{ac} . \tag{35.34}$$

However, the polarized (trace-free) part of $f^{ab}$ is of linear order, and the tetrad-frame connections $\Gamma_{abm}$ with
$a$, $b$ both spatial are all of linear order, equation (29.23) (in any gauge), so the connection terms on the right
hand side of equation (35.34) are of quadratic order and can be neglected. Consequently no additional terms
depending on connections arise on the left hand side of the Boltzmann equation for the polarized photon
distribution.

The Boltzmann equation for the unpolarized photon distribution was given previously by equation (33.44)
in conformal Newtonian gauge. The gravitational $G$ term in this equation arises, equation (33.21), from the
redshifting of photons in the unperturbed photon distribution $f$. Since the unperturbed photon distribution
is unpolarized, the gravitational redshift terms contribute only to the unpolarized Boltzmann equation, not
to the polarized Boltzmann equation. The unpolarized (spin-0) and polarized (spin-2) photon Boltzmann
equations are thus

$${}_0\dot{\Theta} - ik\mu\; {}_0\Theta - G = C[{}_0\Theta] , \tag{35.35a}$$

$${}_2\dot{\Theta} - ik\mu\; {}_2\Theta = C[{}_2\Theta] . \tag{35.35b}$$

The collision terms $C[{}_s\Theta]$ that arise from non-relativistic electron-photon (Thomson) scattering are calculated
in §35.10. In conformal Newtonian gauge, and including not only scalar ($m = 0$) but also vector ($|m| = 1$)
and tensor ($|m| = 2$) potentials from Exercise 33.3, equation (33.26), the gravitational redshift term $G$ in
the unpolarized Boltzmann equation (35.35a) is


G(n,k, P) = È + ikpW + iku p- W + pp’ har . (35.36) 


35.6 Spherical harmonics of the polarized photon distribution 


The spin-$s$ component ${}_s\Theta$ of the temperature fluctuation is naturally expanded in spin-$s$ spherical harmonics
${}_sY_{\ell m}$, §35.12 (Seljak and Zaldarriaga, 1997; Zaldarriaga and Seljak, 1997; Hu and White, 1997). With the
normalization conventional in CMB studies, the harmonic expansion of the polarized temperature fluctuation
${}_s\Theta$ is, consistent with the expansion (33.47) of the scalar ($m = 0$) fluctuation $\Theta$, with $\boldsymbol{k}$ taken along the
3-direction (z-direction),


min(£,2) 


Ok B= >> JO (yAn + 1) sO em(n, k) -Yim (PX) 


£=|s| m=— min(£,2) 


=X) YS (D(H +1) Oem (0, k) Dems (4,8, X) 5 (35.37) 


£=|s| m=— min(£,2) 


where ${}_sY_{\ell m}(\hat{\boldsymbol{p}}, \chi)$ are the spin-weighted spherical harmonics defined by equation (35.84), and $D_{\ell m s}(\phi, \theta, \chi)$ are
the Wigner rotation matrices discussed in §35.12.2. The angles $\theta$ and $\phi$ are the polar coordinates of the photon
direction $\hat{\boldsymbol{p}}$. Modes ${}_s\Theta_{\ell m}$ with $|m| = 0, 1, 2$ correspond respectively to scalar, vector, and tensor fluctuations.
The index on the scalar ($m = 0$) fluctuation is often omitted for brevity, ${}_s\Theta_{\ell 0} = {}_s\Theta_\ell$. The orthogonality
relation (35.143) implies that the spin harmonics ${}_s\Theta_{\ell m}$ are angular integrals of the temperature fluctuation
${}_s\Theta$ over momentum directions $\hat{\boldsymbol{p}}$, generalizing equation (33.48),


dop 


T (35.38) 


tm (nk) = i= J „Oln k, P, 0) Din l0, 0) 


The expansion (35.37) differs from the convention of Hu and White (1997) in that (a) the expansion is
with respect to ${}_{-s}Y^*_{\ell m}$, as opposed to ${}_sY_{\ell m}$, (b) there is an extra factor of $(-i)^{m-s}$, (c) the spin harmonics
${}_{-s}Y^*_{\ell m}(\hat{\boldsymbol{p}}, \chi)$ include a factor of $e^{-is\chi}$. The point of expanding with respect to ${}_{-s}Y^*_{\ell m}$ is that ${}_s\Theta_{\ell m}$ is then the
coefficient of the spin-weight $s$ and $m$ (rather than $s$ and $-m$) term under rotations about respectively the $\hat{\boldsymbol{p}}$
and $\hat{\boldsymbol{k}}$ directions, consistent with the convention in this book that the spin-weight of an object can be read off
from its covariant indices. The factor of $(-i)^{m-s}$ is introduced to cancel the factor of $(-)^{m-s}$ between $D_{\ell m s}$
and its complex conjugate, equation (35.129), ensuring reality conditions (35.46) on the harmonic coefficients
that match those on the Newman-Penrose components of the gravitational potentials. The factor of $e^{-is\chi}$ in
${}_{-s}Y^*_{\ell m}$ or $D_{\ell m s}$ makes explicit the spin factor that Hu and White (1997) absorb into basis vectors $\boldsymbol{\gamma}_a \otimes \boldsymbol{\gamma}_b$ of
the polarization matrix.


35.6.1 Boltzmann equations for spherical harmonics of the polarized photon 
distribution 


The action of $\mu = \cos\theta = \hat{\boldsymbol{k}} \cdot \hat{\boldsymbol{p}}$ on the spin harmonics follows from the recursion formula (35.145) for the
rotation matrices $D_{\ell m n}$. The resulting expression for the terms ${}_s\dot{\Theta} - ik\mu\; {}_s\Theta$ of the Boltzmann equation (35.35),
common to all spins $s$ and all harmonics $\ell m$, is


Kems ism Kée+1,ms 


e(e@+1)° 2041 


(,0 — iku ©) Otim] , (35.39) 


lm 
where the coefficients $\kappa_{\ell m n}$ are given by equation (35.146). The harmonic expansion of the gravitational
term $G$, equation (35.36), in the unpolarized Boltzmann equation is


GURRE) =y Ne i) T™ (20 + 1)Gem (N, k)Demo($, 9) , (35.40) 


£=0 m=—£ 


with non-vanishing harmonics (do not confuse $G_{\ell m}$ here with the Einstein tensor)


Goo = Ë , (35.4la 
k 
Gio=-3¥, (35.41b 
ee (35.41c 
Ve t 53 =) i 
2. 
NE Ie a, (35.41d 
5/3 
where $W_\pm = \frac{1}{\sqrt{2}} (W_x \pm i W_y)$ are the spin-weight $\pm 1$ components of the vector perturbation $W_a$,
equation (27.22), and $h_{\pm\pm} = h_{xx} \pm i h_{xy}$ are the spin-weight $\pm 2$ components of the tensor perturbation $h_{ab}$,
equation (27.23).


35.6.2 Electric and magnetic parts of the polarized photon distribution 


The Wigner rotation matrices $D_{\ell m s}$ transform under a variety of discrete transformations. Of particular
relevance here is one that flips the spin index $s$, which is accomplished by a parity transformation (35.130).
Parity eigenstates of the rotation matrices are


(1 ma P)Dems(, 0, x) = Dems(d, 0, x) T Dems( tem 0, —x) 


= Dens(; 0, xX) £ (—)’Dem,—s(¢, 0, x) : (35.42) 

The harmonics +,Q¢, of the spin +s fluctuation thus split into an “electric” part sEem of parity (—)’ and a 
“magnetic” part ¿Bem of opposite parity (—)**?, 

+sOem = sem +i sBom : (35.43) 


The names electric and magnetic come from the fact that the parity is the same as that of electric and 
magnetic multipole radiation; E and B here are unrelated to the electric and magnetic fields of the underlying 
electromagnetic radiation. There being no ambiguity, the spin index $s$ is dropped on ${}_sE$ and ${}_sB$ for the spin $\pm 2$ fluctuation,

$$ {}_{\pm 2}\Theta_{\ell m} = E_{\ell m} \pm i B_{\ell m} , \tag{35.44} $$

that is, $E_{\ell m} \equiv {}_2E_{\ell m}$ and $B_{\ell m} \equiv {}_2B_{\ell m}$. The resolution of the polarized fluctuation into parity eigenstates is motivated by the fact that the gravitational redshift term $G$ and Thomson scattering collision terms $C[{}_s\Theta]$ are invariant under a parity transformation, so parity is conserved by the evolution of the polarized photon distribution. As a consequence, the parity components of the temperature fluctuation satisfy the reality conditions (35.46).

Resolved into parity eigenstates, the Boltzmann equations (35.35) for the unpolarized and polarized tem- 
perature fluctuations are 


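$$ (\dot\Theta - i k \mu \Theta)_{\ell m} = \dot\Theta_{\ell m} + k \left( \frac{K_{\ell m0}}{2\ell+1} \Theta_{\ell-1,m} - \frac{K_{\ell+1,m0}}{2\ell+1} \Theta_{\ell+1,m} \right) = G_{\ell m} + C[\Theta_{\ell m}] , \tag{35.45a} $$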
Kem2 2m Ke+1,m2 
E — E Eim tk Etim Bem L Etim | = ClEem| , .45b 
(B= thy), = Bim +k (SE Beam — EO Bim E Eyam) =C [Em] » (85.450) 
— Kem2 2m Ke+1,m2 
B —ikuB = Bon + k | ——Br_-imd Eom =~ Bizim | =C|Bem| , A 
( iku P lm + (e 0-1 FD e apq 1 PL ) C[Bem| (35.45c) 


with coefficients $K_{\ell mn}$ given by equation (35.146). The azimuthal index $m$ runs over scalar ($m = 0$), vector ($m = \pm 1$), and tensor ($m = \pm 2$) modes. Do not confuse the azimuthal index $m$ with spin $s$: the unpolarized temperature fluctuation $\Theta_{\ell m} \equiv {}_0\Theta_{\ell m}$ is spin 0, while the polarized temperature fluctuations $E_{\ell m} \equiv {}_2E_{\ell m}$ and $B_{\ell m} \equiv {}_2B_{\ell m}$ are spin 2. The harmonic number $\ell$ must be greater than or equal to both $|m|$ and $|s|$, so $\ell$ runs from $|m|$ to $\infty$ for $\Theta_{\ell m}$, and from 2 to $\infty$ for $E_{\ell m}$ and $B_{\ell m}$. When combined with the Einstein equations, §35.9, the Boltzmann equations (35.45) imply the reality conditions (35.46), which among other things imply that scalar B-modes vanish identically, $B_{\ell 0} = 0$.


Concept question 35.4. E and B modes versus Stokes parameters. Since ${}_2\Theta = E + iB$, aren’t $E$ and $B$ (up to a factor) the same as the Stokes parameters $Q$ and $U$ in ${}_2\Theta \propto f^{++} \propto Q + iU$, equation (35.30)? Answer. No. In ${}_{\pm s}\Theta = {}_sE \pm i \, {}_sB$ it is necessary to distinguish the two spins $s = 2$ and $s = -2$. The two sets of opposite spin $s = \pm 2$ are expansions in eigenfunctions $D_{\ell ms}$ of opposite spin $s$. In other words, ${}_2E$ is not the same as ${}_{-2}E$ because the eigenfunctions $D_{\ell m2}$ and $D_{\ell m,-2}$ are not the same, even though the coefficients ${}_sE_{\ell m}$ are the same for $s = \pm 2$.


35.6.3 Reality conditions on the polarized photon distribution 


The initial photon distribution well before recombination is in thermodynamic equilibrium and therefore unpolarized. The Einstein scalar (33.7), vector (35.52), and tensor (35.53) equations show that the scalar $\Psi$ and $\Phi$, vector $W_\pm$, and tensor $h_{\pm\pm}$ gravitational potentials are sourced by unpolarized ($s = 0$) temperature multipoles $\Theta_{\ell m}$ with respectively $|m| = 0$, 1, and 2 (and $|m| \le \ell \le 2$). The unpolarized temperature multipoles $\Theta_{\ell m}$ with $|m| = 0$, 1, and 2 are in turn sourced by gravitational redshift terms $G_{\ell m}$, equations (35.41). Modes with different $m$ ($= -2, -1, 0, 1, 2$) are decoupled: gravitational modes of given $m$ can generate only temperature fluctuations of the same $m$, and vice versa.

Thomson scattering generates spin 2 electric quadrupole polarization $E_{2m}$ from unpolarized quadrupole multipoles $\Theta_{2m}$, equation (35.68d). The polarized Boltzmann hierarchy (35.45) then feeds $\ell \ge 3$ electric $E_{\ell m}$ and, for $m \ne 0$, magnetic $B_{\ell m}$ multipoles with the same $m$.

The scalar potentials $\Psi$ and $\Phi$ are real, while the Newman-Penrose components $W_\pm$ and $h_{\pm\pm}$ of the vector and tensor potentials are complex, satisfying $W_+^* = W_-$ and $h_{++}^* = h_{--}$. The Einstein and Boltzmann equations then imply the reality conditions

$$ \Theta_{\ell m}^* = \Theta_{\ell,-m} , \quad E_{\ell m}^* = E_{\ell,-m} , \quad B_{\ell m}^* = -B_{\ell,-m} . \tag{35.46} $$

In particular, all scalar ($m = 0$) fluctuations are real. The scalar magnetic fluctuation vanishes, $B_{\ell 0} = 0$. The multipoles $\Theta_{\ell m}$, $E_{\ell m}$, and $B_{\ell m}$ are complex for $m \ne 0$.

As remarked in §35.4, without loss of generality the initial gravitational potentials $W_\pm(0)$ and $h_{\pm\pm}(0)$ can be taken to be real by absorbing a complex phase factor into their normalization (the phase factor for negative $m$ is the complex conjugate of the phase factor for positive $m$; and the phase factor is different for different $m$ and/or wavevector $k$). All linear fluctuations are proportional to the same phase factor. The phase factor cancels in power spectra, equation (36.25). If the initial gravitational potentials are real, then the Einstein and Boltzmann equations preserve that reality, so that all multipoles, including the gravitational potentials, the photon multipoles $\Theta_{\ell m}$, $E_{\ell m}$, and $B_{\ell m}$, and matter and neutrino multipoles, are real.


Concept question 35.5. Fluctuations with $|m| \ge 3$? Are there fluctuations with $|m| \ge 3$? Answer. No, because there are no gravitational potentials with $|m| \ge 3$. Well before recombination in the case of photons, or well before electron-positron annihilation in the case of neutrinos, collisions drive the distribution into thermodynamic equilibrium, characterized only by its first two moments, the monopole and dipole, or equivalently the density and bulk velocity. The monopole ($\ell = 0$) admits $m = 0$, while the dipole ($\ell = 1$) admits $m = 0$ or $m = \pm 1$. Later, free streaming allows higher multipoles ($\ell \ge 2$) to develop, but symmetry about the wavevector direction $\hat{k}$ ensures that the azimuthal mode $m$ remains unchanged. Gravity supports scalar ($m = 0$), vector ($m = \pm 1$), and tensor ($m = \pm 2$) modes, and these source photon or neutrino multipoles of the same $m$, equations (35.41). Thomson scattering sources polarized fluctuations, but leaves the azimuthal mode $m$ unchanged.


35.7 Neutrino Boltzmann equations 


Vector and tensor Einstein equations (35.52) and (35.53) are sourced by neutrinos as well as photons. 
Relativistic neutrinos satisfy a set of Boltzmann equations similar to the unpolarized photon Boltzmann 
equations (35.45a) but without scattering terms, 


$$ (\dot{\mathcal{N}} - i k \mu \mathcal{N})_{\ell m} = \dot{\mathcal{N}}_{\ell m} + k \left( \frac{K_{\ell m0}}{2\ell+1} \mathcal{N}_{\ell-1,m} - \frac{K_{\ell+1,m0}}{2\ell+1} \mathcal{N}_{\ell+1,m} \right) = G_{\ell m} , \tag{35.47} $$

where $\mu = \hat{k} \cdot \hat{p}$ is the cosine of the angle between the wavevector $k$ and the neutrino momentum $p$. Here $G_{\ell m}$ are the harmonics (35.41) of the gravitational term $G$ in the Boltzmann equation, the same as for photons.

Equations (35.47) include not only scalar ($m = 0$) but also vector ($m = \pm 1$) and tensor ($m = \pm 2$) equations. The scalar equations are the same as before, equations (33.91).
Concept question 35.6. Are neutrinos polarized? Relativistic neutrinos are purely left-handed, spin antialigned with their direction of motion. If neutrinos are pure left-polarized, should they not be treated using a polarized density matrix? Answer. A pure circularly polarized distribution is in a pure state, not a mixed state, and is described by the spin-weight $s = 0$ (not $s = \pm 2$) component $f^{+-}$ of the polarization density matrix, §35.2.1. Gravity (in the present case, the gravitational redshift term $G$) is invariant under a parity transformation, and affects left- and right-handed spin states the same. The collisionless neutrino Boltzmann equation is a spin 0 equation.


35.8 Matter Boltzmann equations 


Matter Boltzmann equations contain vector (m = +1) as well as scalar (m = 0) parts. Matter sources con- 
tribute to the vector Einstein equations (35.52). The scalar equations are the same as before, equations (33.1) 
and (33.2). The Boltzmann equations for nonbaryonic cold dark matter including scalar and vector parts are 


bo — kve =36 (m=0), (35.48a) 
Vom + ‘ Vem =0 (m=0,+1). (35.48b) 
The Boltzmann equations for baryonic matter including scalar and vector parts are 
dbp —k vp =36 (m=O), (35.49a) 
Vb,m + vom = 7 (Vbm — 301m) (m =0, +1). (35.49b) 


35.9 Vector and tensor Einstein equations 


The photon and neutrino energy-momenta $T^{kl}$ depend only on the unpolarized photon and neutrino distributions $\Theta$ and $\mathcal{N}$. The scalar components of the photon energy-momenta were given previously by equations (33.53). The vector components of the photon energy-momenta are given in terms of unpolarized multipole moments $\Theta_{\ell m}$ by, from equation (33.51) with integrals over $\Theta$ being converted to harmonics $\Theta_{\ell m}$ using equations (35.38),


TOF = -To = 4p 01,44 , (35.50a) 


T?F = T34 = —i— p@ 2,41 , (35.50b) 


while the tensor components are 


TFF = Tae = = pOo4o. (35.51) 
Massless neutrinos satisfy a similar set of equations.
The scalar Einstein equations were given previously by equations (33.7). The vector Einstein equations
are, from equations (29.50),


—k?° W4 = —167Ga? (DeVe,+ + PoVb,+ + 4p,O1 41 + ApyN4,41) 5 (35.52a) 
ð å 64 5 z 
Kan +2 2) Ws = we (py 02,41 + pv No,+1) ’ (35.52b) 


where $v_\pm \equiv \tfrac{1}{\sqrt{2}} (v_x \pm i v_y)$ are the spin-weight $\pm 1$ components of the bulk velocity of a species. Only one of the two equations (35.52) is needed; the other is satisfied automatically as long as total energy-momentum is conserved. The tensor Einstein equations are, from equation (29.52),


_ 32/2 
v3 


(Zee +e)h 


nGa? (pO2,42 + PyNo,42) - (35.53) 


35.10 Polarized Thomson scattering 


The invariant mean amplitude squared $\langle |\mathcal{M}|^2 \rangle$ for electron-photon scattering by non-relativistic electrons with random spins, in which the initial photon polarization state is $a$ (not to be confused with cosmic scale factor $a$) and the final polarization state $a'$, is, from equation (??), generalizing the unpolarized expression (33.55),

$$ \langle |\mathcal{M}|^2 \rangle = (8\pi\alpha)^2 \, |\bar{a}' \cdot a|^2 , \tag{35.54} $$


where $\alpha = e^2/(\hbar c)$ is the fine-structure constant. The mean amplitude squared (35.54) is averaged over initial electron spins but not over initial photon spins, since here the initial photon polarization $a$ is being specified. The adjective “mean” refers to the averaging over initial electron spins. The differential cross-section $d\sigma_T/do'$, equation (??), for polarized Thomson scattering is related to the mean amplitude squared $\langle |\mathcal{M}|^2 \rangle$ by, in units $c = \hbar = 1$,

$$ \frac{d\sigma_T}{do'} = \frac{\langle |\mathcal{M}|^2 \rangle}{(8\pi m_e)^2} = \frac{\alpha^2}{m_e^2} \, |\bar{a}' \cdot a|^2 = \frac{3}{8\pi} \sigma_T \, |\bar{a}' \cdot a|^2 , \tag{35.55} $$


where $r_e = e^2/(m_e c^2)$ is the classical electron radius, and $\sigma_T = (8\pi/3) r_e^2$ is the total Thomson cross-section.

The collision integral $C[\Theta]$ for unpolarized Thomson scattering was given previously by equation (33.74). The same equation holds for polarized scattering, except that the scalar temperature fluctuation $\Theta$, equation (33.47), is replaced by the polarized temperature fluctuations ${}_s\Theta$, equation (35.37), and the Thomson scattering amplitude squared $\langle |\mathcal{M}|^2 \rangle$ becomes a matrix that couples the different spins $s$.
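As a numerical illustration of the definitions of the classical electron radius and the total Thomson cross-section, the following sketch evaluates them in SI units and checks that the differential cross-section, averaged over the two incident linear polarizations and summed over final polarizations, integrates to $\sigma_T$. The constants are CODATA 2018 values; the relation $r_e = \alpha\hbar/(m_e c)$ is the SI equivalent of $r_e = e^2/(m_e c^2)$, and the function names are hypothetical.

```python
import math

ALPHA = 7.2973525693e-3   # fine-structure constant (CODATA 2018)
HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_E = 9.1093837015e-31    # electron mass, kg
C_LIGHT = 299792458.0     # speed of light, m/s


def classical_electron_radius():
    """r_e = alpha * hbar / (m_e c), in metres."""
    return ALPHA * HBAR / (M_E * C_LIGHT)


def thomson_cross_section():
    """sigma_T = (8 pi / 3) r_e^2, in square metres."""
    return 8.0 * math.pi / 3.0 * classical_electron_radius() ** 2


def unpolarized_total(n=2000):
    """Midpoint-rule integral over solid angle of the polarization-averaged
    cross-section (3/8pi) sigma_T * (cos^2 psi + 1)/2."""
    sigma = thomson_cross_section()
    dpsi = math.pi / n
    total = 0.0
    for i in range(n):
        psi = (i + 0.5) * dpsi
        dsig = 3.0 / (8.0 * math.pi) * sigma * 0.5 * (math.cos(psi) ** 2 + 1.0)
        total += dsig * 2.0 * math.pi * math.sin(psi) * dpsi
    return total
```

The factor $(\cos^2\psi + 1)/2$ is the average of $|a' \cdot a|^2$ over an incident polarization in the scattering plane (giving $\cos^2\psi$) and one orthogonal to it (giving 1).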
Figure 35.1 Polarized light incident in the $z$ direction (wiggly blue line) on an electron causes the electron to oscillate in the direction $a$ of polarization (red arrow). The oscillating electron emits scattered light at the same frequency (wiggly blue line). For incident light with polarization vector $a_x$ in the scattering plane (left), the polarization vector $a_x'$ of the scattered light is rotated by the scattering angle $\psi$, and reduced in amplitude by a factor $\cos\psi$, so that $a_x' \cdot a_x = \cos\psi$. On the other hand, for incident light with polarization vector $a_y$ orthogonal to the scattering plane (right), the polarization vector $a_y'$ of the scattered light is the same as that of the incident light, so that $a_y' \cdot a_y = 1$.

The polarized Thomson scattering matrix $\langle |\mathcal{M}|^2 \rangle$ is not diagonal in a spin (circularly polarized) basis $\gamma_\pm$ (see equation (35.60)), but it is diagonal with respect to a linearly polarized basis $\gamma_x$, $\gamma_y$ in a frame where the momentum $p$ of the incoming photon is along the $z$-direction, and the momentum $p'$ of the scattered photon is in the $x$-$z$ plane, as illustrated in Figure 35.1. In this special frame, the polarized Thomson scattering matrix $\langle |\mathcal{M}|^2 \rangle$ is, equation (??),
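$$ \langle |\mathcal{M}|^2 \rangle^{ab}{}_{a'b'} = (8\pi\alpha)^2 \begin{pmatrix} \cos^2\psi & 0 & 0 & 0 \\ 0 & \cos\psi & 0 & 0 \\ 0 & 0 & \cos\psi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{35.56} $$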


where $\psi$ is the scattering angle. The ordering of rows $ab$ and columns $a'b'$ here is $xx$, $xy$, $yx$, $yy$. Note that up to this point the second of the two polarization indices $ab$ has been written as a conjugate $\bar{b}$ (as a reminder that the diagonal components $+\bar{+} = +-$ and $-\bar{-} = -+$ transform as spin 0, while the off-diagonal components $+\bar{-} = ++$ and $-\bar{+} = --$ transform as spin 2), but here the conjugate symbol on the $x$, $y$ indices can be dropped because the conjugates of orthonormal vectors are themselves, $\bar\gamma_x = \gamma_x$ and $\bar\gamma_y = \gamma_y$.

The collision integral (33.74) generalizes to 


abf a Nea ab a a abra albira 

Clo" ®,x)] = n / IM?) aw |(@ = P’) va = 9°" (P, X) +0" @,x)| 5 —- (35.57) 
The baryon bulk velocity $v_b$ term in the integrand comes from a difference in the unperturbed, unpolarized photon distribution, the first terms on the right hand side of equation (33.65), so the integral over the baryon velocity term yields the same result as in the unpolarized case. The term $\Theta^{ab}(\hat{p})$ is independent of $\hat{p}'$, and can be taken outside the integral. The collision integral (35.57) thus reduces by the same manipulations (33.75)–(33.78) as in the unpolarized case to


3 


ab/ a . a a abf a E ab a'b' fa 
Cle” (p, x)] = Hava 0 _@ CRESE [|a’ - al?] fn O° (BX) 


do! dy’ 
- ° X} , (35.58) 


An In 
generalizing the unpolarized collision integral (33.78). The Kronecker delta $\delta^{ab}$ is to be interpreted as equal to 1 for the unpolarized collision term $C[\Theta]$, and zero otherwise. In the special frame aligned with the scattering plane, the integrand on the right hand side of the collision integral (35.58) is, from equation (35.56),


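$$ \frac{3}{2} \left[ |\bar{a}' \cdot a|^2 \right]^{ab}{}_{a'b'} \Theta^{a'b'} = \frac{3}{2} \begin{pmatrix} \cos^2\psi & 0 & 0 & 0 \\ 0 & \cos\psi & 0 & 0 \\ 0 & 0 & \cos\psi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \Theta^{xx} \\ \Theta^{xy} \\ \Theta^{yx} \\ \Theta^{yy} \end{pmatrix} . \tag{35.59} $$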


Since $\Theta^{ab}$ is Hermitian, $\Theta^{xx}$ and $\Theta^{yy}$ are real, but $\Theta^{xy}$ may be complex, with $\Theta^{xy*} = \Theta^{yx}$. Well before recombination, frequent collisions drive the photons into thermodynamic equilibrium, so the photon distribution is initially unpolarized, with $\Theta^{xx} = \Theta^{yy}$ and $\Theta^{xy} = 0$. Equation (35.59) shows that if light incident in a given direction is initially unpolarized ($\Theta^{ab}$ isotropic, proportional to the unit matrix), then the scattered light will be polarized ($\Theta^{ab}$ anisotropic). But if $\Theta^{xy}$ is initially real, it remains real after a scattering event. Since the imaginary part of $\Theta^{xy}$ is associated with circular polarization, Thomson scattering generates linear polarization, but not circular polarization. The reality of $\Theta^{xy}$ means that $\Theta^{yx} = \Theta^{xy}$, so $\Theta^{yx}$ is redundant, and may be dropped.
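The way a single scattering polarizes unpolarized light can be made concrete with a few lines of code. The sketch below applies the diagonal matrix of the special frame, $\tfrac{3}{2}\,\mathrm{diag}(\cos^2\psi, \cos\psi, 1)$ acting on $(\Theta^{xx}, \Theta^{xy}, \Theta^{yy})$, to light arriving from a single direction. The function names, and the linear polarization fraction used as a diagnostic, are illustrative assumptions rather than notation from the text.

```python
import math


def scatter(theta_xx, theta_xy, theta_yy, psi):
    """Scattered (xx, xy, yy) components for scattering angle psi,
    using (3/2) diag(cos^2 psi, cos psi, 1) in the scattering-plane frame."""
    c = math.cos(psi)
    return 1.5 * c * c * theta_xx, 1.5 * c * theta_xy, 1.5 * theta_yy


def linear_polarization_fraction(theta_xx, theta_yy):
    """(yy - xx)/(yy + xx): 0 for unpolarized light, 1 for light fully
    polarized orthogonal to the scattering plane."""
    return (theta_yy - theta_xx) / (theta_yy + theta_xx)
```

For unpolarized incident light ($\Theta^{xx} = \Theta^{yy}$, $\Theta^{xy} = 0$) the fraction is $\sin^2\psi/(1 + \cos^2\psi)$: zero for forward scattering and unity at right angles. A real $\Theta^{xy}$ is multiplied by the real factor $\tfrac{3}{2}\cos\psi$ and so stays real, which is the statement that no circular polarization is generated.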

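In Newman-Penrose components, the absence of circularly polarized light implies that $\Theta^{+-} = \Theta^{-+} = \Theta$, where $\Theta$ is the unpolarized temperature fluctuation. In Newman-Penrose components, equation (35.59) becomes

$$ \frac{3}{2} \left[ |\bar{a}' \cdot a|^2 \right]^{ab}{}_{a'b'} \Theta^{a'b'}(\hat{p}') = \frac{3}{4} \begin{pmatrix} 1 + \cos^2\psi & -\tfrac{1}{2}\sin^2\psi & -\tfrac{1}{2}\sin^2\psi \\ -\sin^2\psi & \tfrac{1}{2}(1 + \cos\psi)^2 & \tfrac{1}{2}(1 - \cos\psi)^2 \\ -\sin^2\psi & \tfrac{1}{2}(1 - \cos\psi)^2 & \tfrac{1}{2}(1 + \cos\psi)^2 \end{pmatrix} \begin{pmatrix} \Theta \\ \Theta^{++} \\ \Theta^{--} \end{pmatrix} = \begin{pmatrix} d_{000} + \tfrac{1}{2} d_{200} & -\sqrt{\tfrac{3}{8}} \, d_{202} & -\sqrt{\tfrac{3}{8}} \, d_{20,-2} \\ -\sqrt{\tfrac{3}{2}} \, d_{220} & \tfrac{3}{2} d_{222} & \tfrac{3}{2} d_{22,-2} \\ -\sqrt{\tfrac{3}{2}} \, d_{2,-2,0} & \tfrac{3}{2} d_{2,-2,2} & \tfrac{3}{2} d_{2,-2,-2} \end{pmatrix} \begin{pmatrix} \Theta \\ \Theta^{++} \\ \Theta^{--} \end{pmatrix} , \tag{35.60} $$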
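where the functions $d_{\ell mn}(\psi)$ are the polar part of the Wigner rotation matrix, equation (35.125). The pairs $ab$ and $a'b'$ of indices in equations (35.60) run over $+-$, $++$, and $--$. Equation (35.60) can be written

$$ \frac{3}{2} \left[ |\bar{a}' \cdot a|^2 \right]_s{}^{s'} \, {}_{s'}\Theta = \Theta \, \delta_{s0} + \sum_{s'} (-i)^{s-s'} c_{ss'} \, d_{2ss'}(\psi) \, {}_{s'}\Theta , \tag{35.61} $$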


Figure 35.2 Angles between photon momentum $\hat{p}$, scattered photon momentum $\hat{p}'$, and wavevector $k$.

with $s$ running over $0, 2, -2$ and $s'$ summed over $0, 2, -2$. The first term on the right hand side of equation (35.61) is the unpolarized contribution, while the remainder is the polarized contribution. The coefficients $c_{ss'}$ encapsulate the polarization structure of Thomson scattering,


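$$ c_{ss'} \equiv \begin{pmatrix} \tfrac{1}{2} & \sqrt{\tfrac{3}{8}} & \sqrt{\tfrac{3}{8}} \\ \sqrt{\tfrac{3}{2}} & \tfrac{3}{2} & \tfrac{3}{2} \\ \sqrt{\tfrac{3}{2}} & \tfrac{3}{2} & \tfrac{3}{2} \end{pmatrix} . \tag{35.62} $$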


The coefficients depend only on the absolute value of the spins, $c_{ss'} = c_{|s||s'|}$.
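The relation between the coefficients $c_{ss'}$ and the trigonometric form of the scattering matrix can be verified directly. The sketch below assumes the values $c_{00} = \tfrac12$, $c_{02} = \sqrt{3/8}$, $c_{20} = \sqrt{3/2}$, $c_{22} = \tfrac32$, the standard closed forms for the $\ell = 2$ polar rotation matrices ($d_{200} = \tfrac12(3\cos^2\psi - 1)$, $d_{2,0,\pm 2} = \sqrt{3/8}\sin^2\psi$, $d_{2,\pm 2,\pm 2} = \tfrac14(1 + \cos\psi)^2$, $d_{2,\pm 2,\mp 2} = \tfrac14(1 - \cos\psi)^2$), and a relative sign of $-1$ between entries whose spins differ by 2. These assumptions and all names are illustrative.

```python
import math

SPINS = [0, 2, -2]
C = [[0.5, math.sqrt(3 / 8), math.sqrt(3 / 8)],
     [math.sqrt(3 / 2), 1.5, 1.5],
     [math.sqrt(3 / 2), 1.5, 1.5]]


def d2(s, sp, psi):
    """Assumed closed forms of d_{2 s s'}(psi) for s, s' in {0, 2, -2}."""
    c, sn = math.cos(psi), math.sin(psi)
    if s == 0 and sp == 0:
        return (3 * c * c - 1) / 2
    if s == 0 or sp == 0:
        return math.sqrt(3 / 8) * sn * sn
    return (1 + c) ** 2 / 4 if s == sp else (1 - c) ** 2 / 4


def matrix_from_coefficients(psi):
    """delta_{s0} delta_{s'0} + sign * c_{ss'} d_{2ss'}(psi)."""
    rows = []
    for i, s in enumerate(SPINS):
        row = []
        for j, sp in enumerate(SPINS):
            sign = -1 if abs(s - sp) == 2 else 1
            entry = sign * C[i][j] * d2(s, sp, psi)
            if s == 0 and sp == 0:
                entry += 1.0  # unpolarized term, d_000 = 1
            row.append(entry)
        rows.append(row)
    return rows


def matrix_direct(psi):
    """(3/4) times the trigonometric Newman-Penrose scattering matrix."""
    c, s2 = math.cos(psi), math.sin(psi) ** 2
    return [[0.75 * (1 + c * c), -0.375 * s2, -0.375 * s2],
            [-0.75 * s2, 0.375 * (1 + c) ** 2, 0.375 * (1 - c) ** 2],
            [-0.75 * s2, 0.375 * (1 - c) ** 2, 0.375 * (1 + c) ** 2]]
```

The two constructions agree entry by entry for any scattering angle, which is a useful consistency check on the ratio $2 c_{s2}/c_{s0} = \sqrt{6}$ that controls the combination $\Theta_{2m} + \sqrt{6} E_{2m}$ appearing in the collision terms.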
The addition theorem (35.152) allows the rotation matrix $d_{2ss'}(\psi)$ from the $\hat{p}'$ frame into the $\hat{p}$ frame to be written as a product of rotation matrices from the $\hat{p}'$ frame into the $\hat{k}$ frame into the $\hat{p}$ frame,


2 
[la’- al]? OP, x’) = OB’) 620 +Y (i) css OB, X') YO Doms ($,9,X) Dams (00, X) - 


m=—2 


No] 


(35.63) 
Figure 35.2 illustrates the various angles involved in transforming from the scattering frame to a frame where the wavevector $k$ is along the $z$-axis.

When ${}_{s'}\Theta(\hat{p}', \chi')$ in equation (35.63) is expanded in rotation matrices, equation (35.37), the orthogonality of the rotation matrices, equation (35.143), makes the integration over directions $\hat{p}'$ and $\chi'$ straightforward, yielding

7 TA 2 
j; [|a’ - al?) sog, xX = @po0 5s0 + 5 (=)? Doms (H, 0, X) X Cas’ Oom - (35.64) 


m=—2 s’ 


Equation (35.64) shows that Thomson scattering changes $s$ (generates polarization), but preserves the scalar-vector-tensor index $m$. The sum over $c_{ss'}$ in equation (35.64) is

$$ \sum_{s'} c_{ss'} \, {}_{s'}\Theta_{2m} = c_{s0} \Theta_{2m} + 2 c_{s2} E_{2m} = c_{s0} \left( \Theta_{2m} + \sqrt{6} \, E_{2m} \right) . \tag{35.65} $$


The collision integral (35.58) is then 


2 
C[sO(, x)] = |+| |Ê- vb dso — sO(B) + 00580 + cso XO (=i)? ™™® Dams ($, 0, X) (O2m + V6 Ezm) 


m=—2 
(35.66) 
Expanded in harmonics, the collision integral is 
ee) £ 
CLO] = J J (E+ 1)C[sO tm] Dems($,9,X) 5 (35.67) 
€=|s| m=—£ 
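with collision terms for the individual harmonics being

$$ C[\Theta_{00}] = 0 , \tag{35.68a} $$
$$ C[\Theta_{1m}] = -|\dot\tau| \left[ \Theta_{1m} - \tfrac{1}{3} v_{b,m} \right] , \tag{35.68b} $$
$$ C[\Theta_{2m}] = -|\dot\tau| \left[ \Theta_{2m} - \tfrac{1}{10} \left( \Theta_{2m} + \sqrt{6} \, E_{2m} \right) \right] , \tag{35.68c} $$
$$ C[E_{2m}] = -|\dot\tau| \left[ E_{2m} - \tfrac{\sqrt{6}}{10} \left( \Theta_{2m} + \sqrt{6} \, E_{2m} \right) \right] , \tag{35.68d} $$
$$ C[B_{2m}] = -|\dot\tau| \, B_{2m} . \tag{35.68e} $$

Scalar, vector, and tensor modes correspond to those with respectively $m = 0$, $\pm 1$, and $\pm 2$.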


Exercise 35.7. Photon diffusion including polarization. A diffusion approximation for the photon quadrupole fluctuation $\Theta_2$ is obtained by neglecting time derivatives, $\dot\Theta_2 = 0$, and higher order multipoles, $\Theta_3 = 0$, in the Boltzmann equation for $\Theta_2$. Without polarization, this led to the quadrupole (32.67) in the unpolarized Boltzmann equation (33.81c). Derive the diffusion approximation for the photon quadrupole $\Theta_2$ taking into account polarization.

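Solution. With polarization, the Boltzmann equations for the quadrupole scalar ($\ell m = 20$) unpolarized and polarized fluctuations $\Theta_2$ and $E_2$ are coupled to each other by Thomson-scattering collision terms, equations (35.68c) and (35.68d). The Boltzmann equations are (the $m = 0$ subscript on $\Theta_{\ell m}$ and $E_{\ell m}$ is dropped in accordance with the standard convention)

$$ \dot\Theta_2 + \frac{k}{5} \left( 2 \Theta_1 - 3 \Theta_3 \right) = -|\dot\tau| \left[ \Theta_2 - \tfrac{1}{10} \left( \Theta_2 + \sqrt{6} \, E_2 \right) \right] , \tag{35.69a} $$
$$ \dot{E}_2 - \frac{k}{\sqrt{5}} E_3 = -|\dot\tau| \left[ E_2 - \tfrac{\sqrt{6}}{10} \left( \Theta_2 + \sqrt{6} \, E_2 \right) \right] . \tag{35.69b} $$

The diffusion approximation amounts to setting time derivatives to zero, $\dot\Theta_2 = \dot{E}_2 = 0$, and higher order multipoles to zero, $\Theta_3 = E_3 = 0$, which reduces equations (35.69) to

$$ \frac{2k}{5} \Theta_1 = -|\dot\tau| \left[ \Theta_2 - \tfrac{1}{10} \left( \Theta_2 + \sqrt{6} \, E_2 \right) \right] , \tag{35.70a} $$
$$ 0 = -|\dot\tau| \left[ E_2 - \tfrac{\sqrt{6}}{10} \left( \Theta_2 + \sqrt{6} \, E_2 \right) \right] . \tag{35.70b} $$

The equation (35.70b) for $E_2$ implies that

$$ E_2 = \frac{\sqrt{6}}{4} \Theta_2 . \tag{35.71} $$

Inserting this into the equation (35.70a) for $\Theta_2$ gives $\Theta_2 + \sqrt{6} \, E_2 = \tfrac{5}{2} \Theta_2$, so the bracket on the right hand side equals $\tfrac{3}{4} \Theta_2$, which implies

$$ \Theta_2 = -\frac{8k}{15 |\dot\tau|} \Theta_1 . \tag{35.72} $$

This looks like the earlier unpolarized estimate (32.67), except that the earlier factor $\tfrac{4}{9}$ is replaced by the factor $\tfrac{8}{15}$. The revised diffusion coefficient correspondingly changes the numerical factor in the photon-baryon momentum conservation equations (32.74)–(32.76).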
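The diffusion-limit algebra of this exercise is a two-by-two linear solve, which the following sketch carries out numerically. It assumes the collision terms couple the quadrupoles through $\tfrac{1}{10}(\Theta_2 + \sqrt{6} E_2)$ and $\tfrac{\sqrt{6}}{10}(\Theta_2 + \sqrt{6} E_2)$, with the streaming term $\tfrac{2k}{5}\Theta_1$ acting as the source. The function name and argument names are hypothetical.

```python
import math


def diffusion_quadrupoles(k, tau_dot, theta1):
    """Solve the steady-state quadrupole equations
         0.9*theta2 - (sqrt6/10)*e2 = -(2k/5)*theta1/|tau_dot|
        -(sqrt6/10)*theta2 + 0.4*e2 = 0
    for (theta2, e2) by Cramer's rule."""
    s6 = math.sqrt(6.0)
    a11, a12 = 1.0 - 1.0 / 10.0, -s6 / 10.0
    a21, a22 = -s6 / 10.0, 1.0 - 6.0 / 10.0
    rhs1 = -(2.0 * k / 5.0) * theta1 / abs(tau_dot)
    rhs2 = 0.0
    det = a11 * a22 - a12 * a21
    theta2 = (rhs1 * a22 - a12 * rhs2) / det
    e2 = (a11 * rhs2 - a21 * rhs1) / det
    return theta2, e2


def unpolarized_quadrupole(k, tau_dot, theta1):
    """Same limit with polarization switched off: 0.9*theta2 = rhs."""
    return -(2.0 * k / 5.0) * theta1 / abs(tau_dot) / 0.9
```

With polarization the solve returns $\Theta_2 = -\tfrac{8}{15}(k/|\dot\tau|)\Theta_1$ and $E_2 = \tfrac{\sqrt{6}}{4}\Theta_2$, while the unpolarized helper returns $-\tfrac{4}{9}(k/|\dot\tau|)\Theta_1$, so polarization raises the quadrupole by a factor $\tfrac{6}{5}$.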


35.10.1 Truncating the polarized Boltzmann hierarchy 


As in the unpolarized case, §33.10.1, photons are tightly coupled to baryons by scattering well before recom- 
bination, and stream freely well after recombination. 

Prior to recombination, when $|\dot\tau|$ is large, keeping only the dominant ${}_s\Theta_{\ell-1,m}$ term in the Boltzmann hierarchy (35.45) implies the tight-coupling approximation, generalizing the unpolarized equation (33.83),

$$ {}_s\Theta_{\ell m} \approx -\frac{k}{|\dot\tau|} \frac{K_{\ell ms}}{2\ell+1} \, {}_s\Theta_{\ell-1,m} \quad (\ell \ge 3) , \tag{35.73} $$

which holds for both unpolarized ($s = 0$) and polarized ($s = 2$) multipoles.

Conversely, multipoles ,0,,,, in the free-streaming limit are obtained, similarly to the unpolarized case 
§34.6.1, from solution of the polarized radiative transfer equations (36.14). The radiative transfer equa- 
tions (36.14) involve unpolarized and polarized spin spherical Bessel functions jemm and €¢2m + i Beam = 
je2m2 = je22m. The recurrence (35.162) implies that the unpolarized and polarized spin spherical Bessel 
functions satisfy, generalizing equation (34.47), 


Ké+1,m0_ - 2+1. Kem 


O° a 
Zima iitth y ” tml (¢>m > 0) (35.74a) 
Ke+1,m2 . 1 im ; Kem2 . 
m = (20+1 m — — Je-1.22m (£> 3). 35.74b 
04.3 Jett22m = ( ) F T+ 5 Je22m — g pJe-1,22m (£2 3) ( ) 


Corresponding linear combinations of multipoles in the radiative transfer equations (36.14) yield an integral similar to that on the right hand side of the neutrino equation (34.48); the integral is small in the free-streaming limit. The result is the free-streaming approximation for unpolarized and polarized photon multipoles (note $y \to -k\eta$), generalizing equation (33.84),


Ke41,m0 2€+1 Kemo 

Op hin Cedi k= Bp 4 alk) , 35.75 

T+ m] +104 (n, k) m (n, k) pep (n, k) (35.75a) 
Ke+1,m2 1 im Kem2 

MAUR Opa m(n, k) = (U41) | — | sk — Opi m(n, k) . 35.75b 
EEE Orga m(nk) © = (204 D | Et pe Oet k) = FE OrionM k). (88.750) 


Normally the Boltzmann equations would be truncated at a suitably large harmonic number $\ell$, but if the equations are truncated at small $\ell$ (for example, $\ell = 1$ for unpolarized scalar fluctuations, $m = s = 0$, yields the hydrodynamic approximation, §32.2), then unpolarized multipoles $\Theta_{\ell m}$ with $\ell = |m|$ and $m = 0$ or $\pm 1$
in equations (35.75a) should be replaced by O99 + Ooo + VW and O1,41 > O1,41 + sWe. 

Approximations similar to the unpolarized free-streaming approximation (35.75a) hold also for neutrino multipoles $\mathcal{N}_{\ell m}$, generalizing the scalar ($m = 0$) free-streaming approximation (33.92).


35.11 Initial conditions for vector and tensor fluctuations 


Collisions tend to isotropize particle distributions, leaving only the monopole moment $\ell m = 00$ finite. In the particular case of the dipole, $\ell = 1$, the Boltzmann equation (30.11b) contains a redshift term proportional to $(1 - 3w) \dot{a}/a$ that drives the velocity to decay as $v \propto a^{3w-1}$. The redshift term drives the velocity of massive species, $w = 0$, to decay as $v \propto a^{-1}$. The redshift term vanishes for relativistic species, $w = \tfrac{1}{3}$, but drag from collisions with massive species still causes the velocity of relativistic species to decay. Thanks to collisions, the vector and tensor fluctuations of all particle species were initially close to zero. Although neutrinos are presently collisionless, they were collisional prior to neutrino decoupling, and were isotropized at that time.
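The decay law quoted above follows from integrating the redshift term alone. As a minimal sketch, assume a source-free dipole equation $\dot{v} + (1 - 3w)(\dot{a}/a) v = 0$, which in terms of $\ln a$ reads $dv/d\ln a = -(1 - 3w) v$. The code below integrates this with a fourth-order Runge-Kutta step and can be compared against $v \propto a^{3w-1}$. The source-free form and the function name are assumptions made for illustration.

```python
import math


def redshifted_velocity(v0, a0, a1, w, steps=1000):
    """Integrate dv/dln(a) = -(1 - 3w) v from scale factor a0 to a1 (RK4)."""
    h = (math.log(a1) - math.log(a0)) / steps
    rate = -(1.0 - 3.0 * w)
    v = v0
    for _ in range(steps):
        k1 = rate * v
        k2 = rate * (v + 0.5 * h * k1)
        k3 = rate * (v + 0.5 * h * k2)
        k4 = rate * (v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v
```

For a massive species ($w = 0$) a tenfold expansion reduces the velocity tenfold, while for a relativistic species ($w = \tfrac13$) the redshift term leaves the velocity unchanged, so any decay must come from collisional drag.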

In the absence of a vector source, the vector Einstein equation (29.50a) forces the vector potential $W_a$ to vanish,

$$ W_a = 0 . \tag{35.76} $$


With no vector gravitational potential, there is no potential to drive vector multipoles of particle species 
away from their initial zero values. Thus all vector components of all species should remain essentially zero. 
This conclusion applies only to scales where fluctuations are linear: at nonlinear scales, stream-crossing and 
collapse generate non-zero vector components (rotations) (Hahn, Angulo, and Abel, 2015). See Exercise 35.9 
for more. 


In contrast to the vector potential, the tensor gravitational potential $h_{\pm\pm}$ in the absence of sources has a mode that remains constant outside the horizon, equation (29.53). This tensor gravitational potential drives tensor multipoles of collisionless species such as neutrinos, and also photons after recombination. Exercise 35.10 explores the initial evolution of tensor multipoles of neutrinos.
Exercise 35.8. Generic behaviour of scalar, vector, and tensor fluctuations of neutrinos. This exercise generalizes Exercise 32.7 to the case of vector ($m = \pm 1$) and tensor ($m = \pm 2$) fluctuations of massless neutrinos. Start with the two lowest non-vanishing Boltzmann equations (35.47), those for $\mathcal{N}_{\ell m}$ with $\ell = |m|$ and $|m| + 1$, and eliminate the multipole with $\ell = |m| + 2$ using the free-streaming approximation (35.75a). Conclude that, generalizing equation (32.91),


ad 22d 
(4 ios r?) (No — ©) = —h2(U + 8) , (35.77a) 
oe 420 5 k2 
(Z + 7 ôn + k ) Ni, +1 -z W+ ’ (35.77b) 
2 620 V2 V2 k2 
4+ = +1) (~ SAh ) =— hos. 35.77 
(a non” a aa 5/3 es) 


Equations (35.77) are forced, damped wave equations with effective sound speed equal to the speed of light. 
Generically, neutrinos are decaying waves in which: 

1. Scalar: No — ® oscillates about —(W + ®); 

2. Vector: Ni,+1 oscillates about —;W4; 

3. Tensor: M2,+2 — Lh oscillates about — ah 
These conclusions hold for any relativistic, collisionless particles, so apply also to photons after recombination.


Exercise 35.9. Initial evolution of vector fluctuations of neutrinos. Show that neutrinos do not 
naturally develop vector fluctuations. 

Solution. Vector potentials $W_\pm$ are different from scalar or tensor potentials. Scalar and tensor potentials $\Psi$, $\Phi$ and $h_{\pm\pm}$ can and generically do have non-zero constant initial values well outside the horizon, $k\eta \ll 1$. Scalar potentials can have non-zero initial values because they are sourced by non-zero initial scalar overdensities $\Theta_0$ and $\mathcal{N}_0$, equations (33.98). Tensor potentials can have non-zero initial values even if there are zero initial tensor sources $\Theta_{2,\pm 2}$ and $\mathcal{N}_{2,\pm 2}$, Exercise 35.10. But vector potentials $W_\pm$ are constrained by the Einstein equation (35.52a), which in standard cosmology precludes the development of a non-zero vector potential from an initially vanishing vector source. In the radiation-dominated regime following neutrino decoupling, Thomson scattering tends to isotropize radiation, so neutrinos are expected to be the dominant vector source on the right hand side of the Einstein equation (35.52a). With only neutrinos sourcing $W_\pm$ in the Einstein equation (35.52a), the approximate neutrino Boltzmann equation (35.77b) becomes


8f 
+1 = -HE Mian ; (35.78) 


in which the final expression holds in the radiation-dominated regime, where $a \propto \eta$. The $k^2$ term is negligible well outside the horizon, $k\eta \ll 1$. Equation (35.78) then has solutions that are power laws $\mathcal{N}_{1,\pm 1} \propto \eta^q$, but for positive neutrino fraction, $f_\nu > 0$, there are no solutions for which the index $q$ has a non-negative real part. So there are no solutions in which $\mathcal{N}_{1,\pm 1}$ is initially zero or finite (as opposed to divergent).
Exercise 35.10. Initial evolution of tensor fluctuations of neutrinos. Derive how tensor ($m = \pm 2$) neutrino multipoles evolve initially in response to gravitational waves from the early Universe, that is, to a tensor gravitational potential $h_{\pm\pm}$. This is a generalization of Exercise 33.5, which addressed the initial evolution of scalar fluctuations of neutrinos.

Solution. The Boltzmann equations for neutrinos are equations (35.47). Prior to neutrino decoupling, collisions drive all tensor multipoles to zero, $\mathcal{N}_{\ell,\pm 2} = 0$. After decoupling, neutrinos stream freely, and the gravitational tensor potential $h_{\pm\pm}$ drives the lowest order tensor multipole, $\ell = 2$, away from zero. Lower order multipoles then drive the higher multipoles, so that the equations reduce to the form $\dot{\mathcal{N}}_{\ell,\pm 2} \propto \mathcal{N}_{\ell-1,\pm 2}$. The Boltzmann hierarchy (35.47) reduces to, with $y = k\eta$,


ANo42 _ V2 dh (35.79a) 
dy 5V3 dy . 

d a E 

No, +2 _ ED Al (€>3). (35.79b) 


dy 22+1 


Well outside the horizon, $y \ll 1$, the gravitational potential $h_{\pm\pm}$ is constant, equation (29.53), while all neutrino multipoles, including the lowest multipole $\mathcal{N}_{2,\pm 2}$, are zero. With the initial condition $\mathcal{N}_{2,\pm 2}(0) = 0$, equation (35.79a) solves to


v2 


aN jh 
Note eye 


(y) — h++(0)] . (35.80) 


The initial ($y \ll 1$) evolution of the gravitational tensor potential depends on the equation of state $w$ of the background energy-momentum, equation (29.57), and is


y? 


h n Jan 


(35.81) 


with $n$ given in terms of $w$ by equation (29.58). Therefore the $\ell = 2$ neutrino moment evolves initially as $\mathcal{N}_{2,\pm 2} \propto y^2$, from equation (35.80),


h++(0) 
10V6(1+n) ` 


N2 +2 x -N y? $ NO) = (35.82) 


The Boltzmann equations (35.79b) then imply that the initial ($y \ll 1$) behaviour of the neutrino tensor multipoles in general is


(€—2)(€+2)! 215! 


Neto = 4 ULF pil y) 


WOO (>22). (35.83) 
35.12 Appendix: Spin-weighted spherical harmonics


Spherical harmonics $Y_{\ell m}(\theta, \phi)$ are simultaneous eigenfunctions of the squared total angular momentum operator $L^2$ and its component $L_z$ along some direction $z$. They arise as eigenfunctions of the wave operator when separated in spherical coordinates.

Spin contributes to angular momentum. When wave equations for fields of non-zero spin are separated in a spherically symmetric space, the resulting angular eigenfunctions are the spin-weighted spherical harmonics, denoted ${}_sY_{\ell m}(\theta, \phi, \chi)$. The spin-weighted spherical harmonics ${}_sY_{\ell m}(\theta, \phi, \chi)$ are defined in terms of the Wigner rotation matrix $D_{\ell mn}(\phi, \theta, \chi)$ discussed in §35.12.1 by

$$ {}_sY_{\ell m}(\theta, \phi, \chi) \equiv \sqrt{\frac{2\ell+1}{4\pi}} \, D^*_{\ell m,-s}(\phi, \theta, \chi) . \tag{35.84} $$


The usual spherical harmonics equal the spin-weighted harmonics with zero spin, $Y_{\ell m} = {}_0Y_{\ell m}$. The reason for complex conjugation and the sign flip of the spin index $s$ on the right hand side of equation (35.84) is that conventionally ${}_sY_{\ell m} \propto e^{im\phi - is\chi}$ whereas $D_{\ell ms} \propto e^{-im\phi - is\chi}$. The convention for the Wigner matrix, which treats the angles $\phi$ and $\chi$ symmetrically, is more natural than the convention for the spin-weighted spherical harmonics. In this book the temperature fluctuations ${}_s\Theta_{\ell m}$ are coefficients of an expansion in Wigner functions $D_{\ell ms}$, equation (35.37), rather than in spin-weighted spherical harmonics.

In the cosmological literature, the spin factor $e^{-is\chi}$ in the spin harmonics is often omitted, being absorbed in the case of photons into the behaviour of the polarization density matrix. The spin harmonics with spin factor suppressed are abbreviated

$$ {}_sY_{\ell m}(\theta, \phi) \equiv {}_sY_{\ell m}(\theta, \phi, 0) . \tag{35.85} $$


35.12.1 Wigner rotation matrix 


The full 3-dimensional rotation group is the orthogonal group O(3), or, when extended to objects of half-integral spin, its covering group SU(2). The eigenfunctions of O(3) or SU(2) are the elements $D_{\ell m'm}$ of the Wigner rotation matrix.

The Wigner rotation matrix $D_{\ell m'm}(\chi', \psi, \chi)$ is defined to be the matrix element between harmonics $Y_{\ell m}(n)$ in one frame and harmonics $Y_{\ell m'}(n')$ in a frame rotated by Euler angles $\chi'$, $\psi$, $\chi$,


Yön (1) Yem(n) do = ied So (Demm (X, v, XY vm (2) “Yorn (1) do = 50 Dimm X, PX) (35.86) 


Bere Dimm X85) = | Yim DX, bY eml) do = | Yim Y Demm (C0) (0) do= | Yén 


m” 


(35.87) 
The quantum numbers $\ell$, $m'$, and $m$ must be either all integral or all half-integral, and $\ell$ must be at least as large as both $|m'|$ and $|m|$,

$$ \ell \ge |m'|, |m| . \tag{35.88} $$


Equivalently, the spherical harmonics in the unrotated and rotated frames are related by 


Yom (n) = D(X, V, X) Yem (n -> Demir (X, V, X) Yem (n) , (35.89a) 


m'=— 


Yem(n 73 Dirntm (Xs X)Yemi (1) (35.89b) 
m'=—£ 
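Notice that spherical harmonics rotate into linear combinations of harmonics of the same harmonic number $\ell$, which is true because rotation leaves the total angular momentum $L^2$ unchanged. The Euler angles $\chi'$, $\psi$, $\chi$ in equation (35.122) correspond to a right-handed rotation of the unit vector $n'$ by angle $\chi'$ about the $z$-axis, followed by a right-handed rotation by angle $\psi$ about the $y$-axis, followed by a right-handed rotation by angle $\chi$ about the $z$-axis,

$$ \begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix} = \begin{pmatrix} \cos\chi & \sin\chi & 0 \\ -\sin\chi & \cos\chi & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\psi & 0 & -\sin\psi \\ 0 & 1 & 0 \\ \sin\psi & 0 & \cos\psi \end{pmatrix} \begin{pmatrix} \cos\chi' & \sin\chi' & 0 \\ -\sin\chi' & \cos\chi' & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} n'_x \\ n'_y \\ n'_z \end{pmatrix} . \tag{35.90} $$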
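The composition of the three Euler rotations can be written out explicitly. The sketch below builds the product of the $z$, $y$, $z$ rotation matrices in the order and sign convention of the matrices displayed above, using plain nested lists. The helper names are hypothetical.

```python
import math


def rot_z(angle):
    """Rotation about the z-axis, in the sign convention of the text."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(angle):
    """Rotation about the y-axis, in the sign convention of the text."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_matrix(chi_prime, psi, chi):
    """R = Rz(chi) Ry(psi) Rz(chi'), mapping components n' to n."""
    return matmul(rot_z(chi), matmul(rot_y(psi), rot_z(chi_prime)))


def apply(matrix, vector):
    """Matrix times 3-vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]
```

The inverse rotation is obtained by reversing the order and the signs of the angles, `euler_matrix(-chi, -psi, -chi_prime)`, which mirrors the statement that the inverse of the Wigner rotation matrix is its Hermitian conjugate.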


The generator of an infinitesimal rotation about an axis $n$ is $-i \, n \cdot L$, and the operator corresponding to a finite rotation by angle $\chi$ about direction $n$ is $\exp(-i \chi \, n \cdot L)$. Thus the operator $D(\chi', \psi, \chi)$ that generates a rotation by the 3 Euler angles is


Dae ra (35.91) 
The spherical harmonic components of the rotation operator are correspondingly (no sum over $m'$, $m$)

$$ D_{\ell m'm}(\chi', \psi, \chi) = e^{-im'\chi'} \, d_{\ell m'm}(\psi) \, e^{-im\chi} . \tag{35.92} $$

The matrix $d_{\ell m'm}(\psi)$ is the polar part of the full rotation matrix $D_{\ell m'm}(\chi', \psi, \chi)$. The polar rotation matrix $d_{\ell m'm}(\psi)$ is a real matrix, orthogonal with respect to $m'm$, with matrix inverse

$$ d^{-1}_{\ell m'm}(\psi) = d_{\ell m'm}(-\psi) = d_{\ell mm'}(\psi) = d_{\ell,-m',-m}(\psi) = (-)^{m'-m} d_{\ell m'm}(\psi) . \tag{35.93} $$


A parity transformation $\psi \to \pi - \psi$ flips the sign of one of the indices $m'$ or $m$ and multiplies by
$(-)^{\ell-m}$ or $(-)^{\ell+m'}$,
$$ d_{\ell m'm}(\pi - \psi) = (-)^{\ell-m} d_{\ell,-m',m}(\psi) = (-)^{\ell+m'} d_{\ell m',-m}(\psi) . $$ (35.94)
The matrix inverse of the Wigner rotation matrix is its Hermitian conjugate, its complex conjugate transpose,
$$ D_{\ell m'm}(\chi',\psi,\chi)^{-1} = D_{\ell m'm}(-\chi,-\psi,-\chi') = D^*_{\ell mm'}(\chi',\psi,\chi) . $$ (35.95)
Complex conjugation flips the signs of $m'$ and $m$, and multiplies by $(-)^{m'-m}$,
$$ D^*_{\ell m'm}(\chi',\psi,\chi) = (-)^{m'-m} D_{\ell,-m',-m}(\chi',\psi,\chi) . $$ (35.96)
A parity transformation $\psi \to \pi - \psi$, $\chi' \to \chi' + \pi$, $\chi \to -\chi$ flips the sign of $m$ and multiplies by $(-)^\ell$,
$$ D_{\ell m'm}(\chi'+\pi, \pi-\psi, -\chi) = (-)^\ell D_{\ell m',-m}(\chi',\psi,\chi) . $$ (35.97)


Particular examples of equation (35.122), illustrating how the signs work out, are 


£ 


Yim(8,9) = XÒ Dem'm(x!,0,x)¥em' (8,0 +x’ +X), (35.98a) 
m'=—£ 
£ 
Yim(9,9) = XÒ Dimm, V, —0)Yem (0 +4,9") - (35.98b) 
m'=—£ 


Since Ye,,(0,0) = y (2L + 1)/(4r) mo, the spherical harmonics Ym (0, ) themselves can be expressed in 
terms of Wigner rotation matrices, 


[2@+1 [2+1 
Yem (0, o) = ~g Pem (0, —9, —¢) = qq imol?s 0, 0) ’ (35.99) 


consistent with equation (35.84). 

The explicit form of the Wigner rotation matrix elements Dem'm(X', Y, x) is derived most elegantly from 
the Newman-Penrose components L,, L+ of the total angular momentum operator L, which are (Newman 
and Penrose, 1962; Goldberg et al., 1967; Geroch, Held, and Penrose, 1973) 


8 toa (a2 iD +i S| 
Ox’ ~ V2 f 


ri~ rts 
Ow sin Ox! sin Y Ox 
A similar set of equations holds for the total angular momentum operator L’ in the rotated (primed) frame, 


L, (35.100) 


with y' e x, 


(35.101) 


ð _ etx (- ð csp dO. 1 À 


L =—-i—, V 
“ay! == J2 Ow T Ox! TA strap Ox 


The Newman-Penrose components $L_\pm$ are Hermitian conjugates with respect to integration over Euler angles,


$$ L_+^\dagger = L_- , $$ (35.102)
meaning that for any differentiable functions $f(\chi', \psi, \chi)$ and $g(\chi', \psi, \chi)$,


27 p27 pT 27 p27 pr 
f | | 1o sinvavavar= | f f (E-P g sinwavayay , (35.103) 
0 0 0 0 0 0 


which follows from an integration by parts, the surface term vanishing when the integration is taken over 
the full ranges of the Euler angles. The Newman-Penrose components of the angular momentum operator 
form a Lie algebra, with commutators 


$$ [L_+, L_-] = L_z , \quad [L_z, L_\pm] = \pm L_\pm . $$ (35.104)


It follows from the commutation rules (35.137) that the angular momentum operators $L_\pm$ raise and lower
by one unit the z-component $L_z$ of the angular momentum, and similarly the angular momentum operators
$L'_\pm$ raise and lower by one unit the z-component $L'_z$ of the angular momentum,


Em)(LF m+ 1) 


Dem! ,—(m41) (xp, X) oy (35.105a) 


ed 
Le Dyn al Wp, x) = rE 


Em (LF m +1) 


Lo 
L’ Dem, -m(X', Wp, x) = y! 


The squared total angular momentum operator is 


De m't1, -mX Y, X) . (35.105b) 


| = Seed ames om LL} +I 
Gi PECHI D a i : (35.106) 


The explicit form of the squared total angular momentum operator is 


2 1 ð 1 


o 
D= sin + L? — 2 cos YLL; +L?) . 35.107 
sin Y OW Yop sin? Y ( 4 ) ( ) 
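The Wigner rotation matrix elements $D_{\ell m'm}(\chi',\psi,\chi)$ are simultaneous eigenfunctions of the total squared
angular momentum operator $L^2$ and of the operators $L'_z = -i\,\partial/\partial\chi'$ and $L_z = -i\,\partial/\partial\chi$, with eigenvalues
respectively $\ell(\ell+1)$, $-m'$, and $-m$,
$$ L^2 D_{\ell m'm}(\chi',\psi,\chi) = \ell(\ell+1) \, D_{\ell m'm}(\chi',\psi,\chi) , $$ (35.108a)
$$ L'_z D_{\ell m'm}(\chi',\psi,\chi) = -m' \, D_{\ell m'm}(\chi',\psi,\chi) , $$ (35.108b)
$$ L_z D_{\ell m'm}(\chi',\psi,\chi) = -m \, D_{\ell m'm}(\chi',\psi,\chi) . $$ (35.108c)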
The polar part demnm(w) satisfies 
2 1 ðn ð 1 12 / 2 
L? dimm (Y) = sinyy— 4 (m? — 2m'm cosy +m?) demm (h) = UE + Ldemm(w) - 


sin ý Ow ðY sin? yb (35.109) 


The Wigner rotation matrices are orthogonal with respect to integration over Euler angles,


$$ \int_0^{2\pi}\!\int_0^{2\pi}\!\int_0^{\pi} D^*_{\ell'm'n'}(\chi',\psi,\chi) \, D_{\ell mn}(\chi',\psi,\chi) \, \sin\psi \, d\psi \, d\chi' \, d\chi = \frac{8\pi^2}{2\ell+1} \, \delta_{\ell'\ell} \delta_{m'm} \delta_{n'n} . $$ (35.110)
The functions $D_{\ell mn}(\chi',\psi,\chi)$ and $d_{\ell mn}(\psi)$ satisfy many recurrence relations. A set of 4 building-block
recurrences connecting $D_{\ell mn}$ to $D_{\ell\pm\frac12, m\pm\frac12, n\pm\frac12}$ is


D 


2.2 Dinan = (35.111) 


1 
2 


1 
W+1 (va Ve = pm)(£ E qn)De— i m+g,n+s ak V(e +1 +pm)(£ +I I) De4 4. m+8.n+4) , 


with p = +1 and q = +1. Equation (35.144) remains true with D replaced by d everywhere. Numerically the 
most useful recurrence relation, stable for increasing £, is, a consequence of equation (35.144), 


mn 
Kée+1,mn De+i,mn a (2¢ + 1) cos NG a 5| Denn Kéemn De-iymn ; (35.112) 
with 
(2 — m?)( 02 — n2 


starting from Demn(X', Y, X) = eX demn(w)e—*”™ with m or n equal to £, and 


(—) "deem = deme = a cos’*™ (5) sin’ ™ (5) (35.114) 


Another useful recurrence is 


a 
bia en Deyimn = (22+ 1) bov 4 an Demn + (l+ 1)kemn Ded ees (35.115) 


Again, equations (35.145) and (35.148) remain true with D replaced by d everywhere. The rotation matrices 
Demn for m = n = 0 reduce to Legendre polynomials, 


Deoo(x', V, X) = deoo (Y) = Pe(cos y) , (35.116) 


and those for n = 0 are proportional to associated Legendre polynomials, 


DimolX, p: X) = demolp)e X = C PP (cosp)e™™X (35.117) 


For general m'm, the rotation matrices Dem’m are proportional to Jacobi polynomials, 
Demm( X, p, X) = demm (pe xt x) (35.118) 


= (¢ 7 m)!(£ T m)! (m=m',m+m') m+m’ Y + m—m’ Y —i(mx+m'x’) 
= i m EE mil Pim (cos w) cos > } sin z) : 


The analysis of polarization in §35.10 involves resolving a rotation from $p'$ to $p$ into the product of a pair
of rotations with respect to a frame in which the z-axis lies along $k$. A rotation by angle $\psi$ in the $p'$-$p$ plane
is equivalent to a rotation by Euler angles $-\chi'$, $-\theta'$, $-\phi'$ from the $p'$ frame into the $k$ frame, followed by
a rotation by Euler angles $\phi$, $\theta$, $\chi$ from the $k$ frame into the $p$ frame. The various angles are illustrated in
Figure 35.2. The equivalence implies the addition theorem


g 


demm (Y) = 5 Denm (9, 0, x) Dem (=-x’, —6, -¢') 


n=—e 
£ 
= X. Denm($,9;xX)Dinm' (0'0, X') - (35.119) 
n=—£l 


35.12.2 Wigner rotation matrix 


The full 3-dimensional rotation group is the orthogonal group O(3), or, when extended to objects of
half-integral spin, its covering group SU(2). The eigenfunctions of O(3) or SU(2) are the elements
$D_{\ell m'm}$ of the Wigner rotation matrix.

The Wigner rotation matrix $D_{\ell m'm}(\chi', \psi, \chi)$ is defined to be the matrix element between harmonics
$Y_{\ell m}(\boldsymbol{n})$ in one frame and harmonics $Y_{\ell' m'}(\boldsymbol{n}')$ in a frame rotated by Euler angles $\chi'$, $\psi$, $\chi$,


$$ \delta_{\ell'\ell} \, D_{\ell m'm}(\chi',\psi,\chi) = \int Y^*_{\ell'm'}(\boldsymbol{n}') \, Y_{\ell m}(\boldsymbol{n}) \, do . $$ (35.120)


The quantum numbers $\ell$, $m'$, and $m$ must be either all integral or all half-integral, and $\ell$ must equal or
exceed both $|m'|$ and $|m|$,


$$ \ell \geq |m'|, |m| . $$ (35.121)


Equivalently, the spherical harmonics in the unrotated and rotated frames are related by 


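$$ Y_{\ell m'}(\boldsymbol{n}') = \sum_{m=-\ell}^{\ell} D^*_{\ell m'm}(\chi',\psi,\chi) \, Y_{\ell m}(\boldsymbol{n}) , \quad Y_{\ell m}(\boldsymbol{n}) = \sum_{m'=-\ell}^{\ell} D_{\ell m'm}(\chi',\psi,\chi) \, Y_{\ell m'}(\boldsymbol{n}') . $$ (35.122)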

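Notice that spherical harmonics rotate into linear combinations of harmonics of the same harmonic number
$\ell$, which is true because rotation leaves the total angular momentum $L^2$ unchanged. The Euler angles $\chi'$,
$\psi$, $\chi$ in equation (35.122) correspond to a right-handed rotation of the unit vector $\boldsymbol{n}$ by angle $\chi$ about the
z-axis, followed by a right-handed rotation by angle $\psi$ about the y-axis, followed by a right-handed rotation
by angle $\chi'$ about the z'-axis,


$$ \begin{pmatrix} n'_x \\ n'_y \\ n'_z \end{pmatrix} = \begin{pmatrix} \cos\chi' & -\sin\chi' & 0 \\ \sin\chi' & \cos\chi' & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\psi & 0 & -\sin\psi \\ 0 & 1 & 0 \\ \sin\psi & 0 & \cos\psi \end{pmatrix} \begin{pmatrix} \cos\chi & -\sin\chi & 0 \\ \sin\chi & \cos\chi & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix} . $$ (35.123)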
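The composition in equation (35.123) can be checked numerically. The following is a minimal sketch in plain Python, assuming the three factors of (35.123) are the z, y, z rotation matrices with the signs as printed there. The helper names (`rot_z`, `rot_y`, `euler_matrix`, `unit_vector`) are illustrative and do not come from the text.

```python
import math

def rot_z(a):
    """Outer factors of (35.123): rotation by angle a about the z-axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(a):
    """Middle factor of (35.123), with the sign placement as printed there."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def euler_matrix(chi_prime, psi, chi):
    """Full matrix of (35.123): chi is applied first, then psi, then chi'."""
    return matmul(rot_z(chi_prime), matmul(rot_y(psi), rot_z(chi)))

def unit_vector(theta, phi):
    """Unit vector with polar angle theta and azimuthal angle phi."""
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

# Example: choosing chi = -phi rotates n into the x-z plane.
n = unit_vector(1.1, 0.4)
n_prime = matvec(euler_matrix(0.0, 0.0, -0.4), n)
```

With `chi = -phi` and the other two angles zero, the rotated vector has zero y-component, which is the first step behind the sign pattern of the example (35.131b).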


The generator of an infinitesimal rotation about an axis $\boldsymbol{n}$ is $-i \boldsymbol{n} \cdot \boldsymbol{L}$, and the operator corresponding to a
finite rotation by angle $\chi$ about direction $\boldsymbol{n}$ is $\exp(-i \chi \, \boldsymbol{n} \cdot \boldsymbol{L})$. Thus the operator $D(\chi', \psi, \chi)$ that generates
a rotation by the 3 Euler angles is


DO nipo =a ee (35.124) 
The spherical harmonic components of the rotation operator are correspondingly (no sum over $m'$, $m$)
$$ D_{\ell m'm}(\chi',\psi,\chi) = e^{-im'\chi'} \, d_{\ell m'm}(\psi) \, e^{-im\chi} . $$ (35.125)


The matrix $d_{\ell m'm}(\psi)$ is the polar part of the full rotation matrix $D_{\ell m'm}(\chi', \psi, \chi)$. The polar rotation matrix
$d_{\ell m'm}(\psi)$ is a real matrix, orthogonal with respect to $m'm$, with matrix inverse


$$ d_{\ell m'm}(\psi)^{-1} = d_{\ell m'm}(-\psi) = d_{\ell mm'}(\psi) = d_{\ell,-m',-m}(\psi) = (-)^{m'-m} d_{\ell m'm}(\psi) . $$ (35.126)
A parity transformation $\psi \to \pi - \psi$ flips the sign of one of the indices $m'$ or $m$ and multiplies by
$(-)^{\ell-m}$ or $(-)^{\ell+m'}$,
$$ d_{\ell m'm}(\pi - \psi) = (-)^{\ell-m} d_{\ell,-m',m}(\psi) = (-)^{\ell+m'} d_{\ell m',-m}(\psi) . $$ (35.127)
The matrix inverse of the Wigner rotation matrix is its Hermitian conjugate, its complex conjugate transpose,
$$ D_{\ell m'm}(\chi',\psi,\chi)^{-1} = D_{\ell m'm}(-\chi,-\psi,-\chi') = D^*_{\ell mm'}(\chi',\psi,\chi) . $$ (35.128)
Complex conjugation flips the signs of $m'$ and $m$, and multiplies by $(-)^{m'-m}$,
$$ D^*_{\ell m'm}(\chi',\psi,\chi) = (-)^{m'-m} D_{\ell,-m',-m}(\chi',\psi,\chi) . $$ (35.129)
A parity transformation $\psi \to \pi - \psi$, $\chi' \to \chi' + \pi$, $\chi \to -\chi$ flips the sign of $m$ and multiplies by $(-)^\ell$,
$$ D_{\ell m'm}(\chi'+\pi, \pi-\psi, -\chi) = (-)^\ell D_{\ell m',-m}(\chi',\psi,\chi) . $$ (35.130)


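Particular examples of equation (35.122), illustrating how the signs work out, are


$$ Y_{\ell m}(\theta,\phi) = \sum_{m'=-\ell}^{\ell} D_{\ell m'm}(\chi',0,\chi) \, Y_{\ell m'}(\theta, \phi+\chi'+\chi) , $$ (35.131a)
$$ Y_{\ell m}(\theta,\phi) = \sum_{m'=-\ell}^{\ell} D_{\ell m'm}(\phi',\psi,-\phi) \, Y_{\ell m'}(\theta+\psi, \phi') . $$ (35.131b)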


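Since $Y_{\ell m}(0,0) = \sqrt{(2\ell+1)/(4\pi)} \, \delta_{m0}$, the spherical harmonics $Y_{\ell m}(\theta, \phi)$ themselves can be expressed in
terms of Wigner rotation matrices,


$$ Y_{\ell m}(\theta,\phi) = \sqrt{\frac{2\ell+1}{4\pi}} \, D_{\ell 0m}(0,-\theta,-\phi) = \sqrt{\frac{2\ell+1}{4\pi}} \, D^*_{\ell m0}(\phi,\theta,0) , $$ (35.132)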


consistent with equation (35.84). 

The explicit form of the Wigner rotation matrix elements $D_{\ell m'm}(\chi', \psi, \chi)$ is derived most elegantly from
the Newman-Penrose components $L_z$, $L_\pm$ of the total angular momentum operator $\boldsymbol{L}$, which are (Newman
and Penrose, 1962; Goldberg et al., 1967; Geroch, Held, and Penrose, 1973)


al __ tix («2 _, 1 4 | cosy >.) 
Oy’ j V2 ` 


(35.133) 


Ow | “sing Ox! ET Ox 
A similar set of equations holds for the total angular momentum operator $\boldsymbol{L}'$ in the rotated (primed) frame,
with $\chi' \leftrightarrow \chi$,


g was L a 8 4 cos a) F 1 ð 
~ Ox! ? = V2 Ow = sinw Oy’ sinyðxj ` 


The Newman-Penrose components $L_\pm$ are Hermitian conjugates with respect to integration over Euler angles,


(35.134) 


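$$ L_+^\dagger = L_- , $$ (35.135)


meaning that for any differentiable functions $f(\chi', \psi, \chi)$ and $g(\chi', \psi, \chi)$,


$$ \int_0^{2\pi}\!\int_0^{2\pi}\!\int_0^{\pi} f^* \, (L_+ g) \, \sin\psi \, d\psi \, d\chi' \, d\chi = \int_0^{2\pi}\!\int_0^{2\pi}\!\int_0^{\pi} (L_- f)^* \, g \, \sin\psi \, d\psi \, d\chi' \, d\chi , $$ (35.136)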


which follows from an integration by parts, the surface term vanishing when the integration is taken over 
the full ranges of the Euler angles. The Newman-Penrose components of the angular momentum operator 
form a Lie algebra, with commutators 


$$ [L_+, L_-] = L_z , \quad [L_z, L_\pm] = \pm L_\pm . $$ (35.137)


It follows from the commutation rules (35.137) that the angular momentum operators $L_\pm$ raise and lower
by one unit the z-component $L_z$ of the angular momentum, and similarly the angular momentum operators
$L'_\pm$ raise and lower by one unit the z-component $L'_z$ of the angular momentum,


t m)(€=m+1) 


Dem’ ,—(m41) (xp, x) , (35.138a) 


L 
L4 Dem oA, Yp, x) = j 


tm’) (l= m +1) 


£4 
LE Dem, -m(X', Y, x) = ji 


The squared total angular momentum operator is 


De mti,-m(X', 0, X) è (35.138b) 


$$ L^2 = L_+ L_- + L_- L_+ + L_z^2 = L'_+ L'_- + L'_- L'_+ + L'^2_z . $$ (35.139)


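The explicit form of the squared total angular momentum operator is


$$ L^2 = -\frac{1}{\sin\psi} \frac{\partial}{\partial\psi} \sin\psi \frac{\partial}{\partial\psi} + \frac{1}{\sin^2\psi} \left( L'^2_z - 2\cos\psi \, L'_z L_z + L_z^2 \right) . $$ (35.140)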

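The Wigner rotation matrix elements $D_{\ell m'm}(\chi',\psi,\chi)$ are simultaneous eigenfunctions of the total squared
angular momentum operator $L^2$ and of the operators $L'_z = -i\,\partial/\partial\chi'$ and $L_z = -i\,\partial/\partial\chi$, with eigenvalues
respectively $\ell(\ell+1)$, $-m'$, and $-m$,

$$ L^2 D_{\ell m'm}(\chi',\psi,\chi) = \ell(\ell+1) \, D_{\ell m'm}(\chi',\psi,\chi) , $$ (35.141a)
$$ L'_z D_{\ell m'm}(\chi',\psi,\chi) = -m' \, D_{\ell m'm}(\chi',\psi,\chi) , $$ (35.141b)
$$ L_z D_{\ell m'm}(\chi',\psi,\chi) = -m \, D_{\ell m'm}(\chi',\psi,\chi) . $$ (35.141c)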
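The polar part $d_{\ell m'm}(\psi)$ satisfies


$$ L^2 d_{\ell m'm}(\psi) = \left[ -\frac{1}{\sin\psi} \frac{\partial}{\partial\psi} \sin\psi \frac{\partial}{\partial\psi} + \frac{1}{\sin^2\psi} \left( m'^2 - 2m'm\cos\psi + m^2 \right) \right] d_{\ell m'm}(\psi) = \ell(\ell+1) \, d_{\ell m'm}(\psi) . $$ (35.142)
The Wigner rotation matrices are orthogonal with respect to integration over Euler angles,
$$ \int_0^{2\pi}\!\int_0^{2\pi}\!\int_0^{\pi} D^*_{\ell'm'n'}(\chi',\psi,\chi) \, D_{\ell mn}(\chi',\psi,\chi) \, \sin\psi \, d\psi \, d\chi' \, d\chi = \frac{8\pi^2}{2\ell+1} \, \delta_{\ell'\ell} \delta_{m'm} \delta_{n'n} . $$ (35.143)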


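The functions $D_{\ell mn}(\chi',\psi,\chi)$ and $d_{\ell mn}(\psi)$ satisfy many recurrence relations. A set of 4 building-block
recurrences connecting $D_{\ell mn}$ to $D_{\ell\pm\frac12, m\pm\frac12, n\pm\frac12}$ is


$$ D_{\frac12,\frac{p}{2},\frac{q}{2}} \, D_{\ell mn} = \frac{1}{2\ell+1} \left( pq \sqrt{(\ell - pm)(\ell - qn)} \; D_{\ell-\frac12, m+\frac{p}{2}, n+\frac{q}{2}} + \sqrt{(\ell+1+pm)(\ell+1+qn)} \; D_{\ell+\frac12, m+\frac{p}{2}, n+\frac{q}{2}} \right) , $$ (35.144)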


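with $p = \pm 1$ and $q = \pm 1$. Equation (35.144) remains true with $D$ replaced by $d$ everywhere. Numerically the
most useful recurrence relation, stable for increasing $\ell$, is, a consequence of equation (35.144),


$$ \kappa_{\ell+1,mn} \, D_{\ell+1,mn} = (2\ell+1) \left[ \cos\psi - \frac{mn}{\ell(\ell+1)} \right] D_{\ell mn} - \kappa_{\ell mn} \, D_{\ell-1,mn} , $$ (35.145)


with


$$ \kappa_{\ell mn} = \frac{\sqrt{(\ell^2 - m^2)(\ell^2 - n^2)}}{\ell} , $$ (35.146)


starting from $D_{\ell mn}(\chi',\psi,\chi) = e^{-im\chi'} d_{\ell mn}(\psi) e^{-in\chi}$ with $m$ or $n$ equal to $\ell$, and


$$ (-)^{\ell-m} d_{\ell\ell m} = d_{\ell m\ell} = \sqrt{\frac{(2\ell)!}{(\ell+m)!\,(\ell-m)!}} \, \cos^{\ell+m}\!\left(\frac{\psi}{2}\right) \sin^{\ell-m}\!\left(\frac{\psi}{2}\right) . $$ (35.147)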
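The upward recurrence (35.145) with the starting values (35.147) can be sketched in a few lines. This is an illustrative implementation, not the author's code. It assumes that $\kappa_{\ell mn} = \sqrt{(\ell^2-m^2)(\ell^2-n^2)}/\ell$ in equation (35.146), that $d_{\ell m\ell}$ takes the form of (35.147), and that the starting values with an index equal to $-\ell$ follow from the symmetries (35.126). The function names `kappa`, `d_start`, and `wigner_d` are hypothetical.

```python
import math

def kappa(l, m, n):
    """kappa_{l m n}, assumed to be sqrt((l^2 - m^2)(l^2 - n^2)) / l."""
    return math.sqrt((l * l - m * m) * (l * l - n * n)) / l

def d_start(l, m, n, psi):
    """Starting value d_{l m n}(psi) when |m| or |n| equals l, from (35.147)."""
    c, s = math.cos(psi / 2), math.sin(psi / 2)

    def top(k):
        # d_{l k l}(psi) as in (35.147)
        norm = math.sqrt(math.factorial(round(2 * l))
                         / (math.factorial(round(l + k)) * math.factorial(round(l - k))))
        return norm * c ** round(l + k) * s ** round(l - k)

    if n == l:
        return top(m)
    if m == l:
        return (-1) ** round(l - n) * top(n)
    if n == -l:
        # d_{l,m,-l} = d_{l,l,-m}, using the symmetries (35.126)
        return (-1) ** round(l + m) * top(-m)
    return top(-n)  # m == -l: d_{l,-l,n} = d_{l,-n,l}

def wigner_d(l, m, n, psi):
    """d_{l m n}(psi) by upward recurrence (35.145) from l0 = max(|m|, |n|)."""
    l0 = max(abs(m), abs(n))
    if l < l0 or (l - l0) != int(l - l0):
        raise ValueError("l must equal max(|m|, |n|) plus a non-negative integer")
    prev, cur = 0.0, d_start(l0, m, n, psi)
    k = l0
    for _ in range(round(l - l0)):
        mn_term = m * n / (k * (k + 1)) if k > 0 else 0.0
        # kappa_{l0 m n} vanishes, so the d_{l0 - 1} term drops out on the first step
        back = kappa(k, m, n) * prev if k > l0 else 0.0
        nxt = ((2 * k + 1) * (math.cos(psi) - mn_term) * cur - back) / kappa(k + 1, m, n)
        prev, cur = cur, nxt
        k += 1
    return cur
```

For example, `wigner_d(2, 0, 0, psi)` follows the Legendre recursion and `wigner_d(1.5, 0.5, 0.5, psi)` handles a half-integral case. Starting at $\ell_0 = \max(|m|, |n|)$ means only the closed form (35.147) is needed to seed the recurrence.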


Another useful recurrence is 


ð 
ERE mn Detizmn = (22 + 1) bov + = Demn + (£ + L)Kemn De-iymn - (35.148) 


Again, equations (35.145) and (35.148) remain true with $D$ replaced by $d$ everywhere. The rotation matrices
$D_{\ell mn}$ for $m = n = 0$ reduce to Legendre polynomials,


$$ D_{\ell 00}(\chi',\psi,\chi) = d_{\ell 00}(\psi) = P_\ell(\cos\psi) , $$ (35.149)


and those for $n = 0$ are proportional to associated Legendre polynomials,


Demo(X, Y, X) = demol) "X = ee Pr (cosyp)e™™X (35.150) 
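As a consistency check on the recurrence (35.145), here is a short worked reduction. It assumes $\kappa_{\ell mn} = \sqrt{(\ell^2-m^2)(\ell^2-n^2)}/\ell$ in equation (35.146). For $m = n = 0$ the recurrence should then reproduce the standard three-term (Bonnet) recursion for Legendre polynomials, given equation (35.149).

```latex
\begin{align*}
% Set m = n = 0, so that \kappa_{\ell 00} = \ell^2 / \ell = \ell
\kappa_{\ell 00} &= \ell , \qquad \kappa_{\ell+1,00} = \ell + 1 , \\
% Recurrence (35.145) with the mn / [\ell(\ell+1)] term equal to zero
(\ell+1)\, d_{\ell+1,00}(\psi) &= (2\ell+1) \cos\psi \; d_{\ell 00}(\psi) - \ell\, d_{\ell-1,00}(\psi) , \\
% Substitute d_{\ell 00}(\psi) = P_\ell(\cos\psi), equation (35.149), with x = \cos\psi
(\ell+1)\, P_{\ell+1}(x) &= (2\ell+1)\, x\, P_\ell(x) - \ell\, P_{\ell-1}(x) .
\end{align*}
```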
For general $m'm$, the rotation matrices $D_{\ell m'm}$ are proportional to Jacobi polynomials,
$$ D_{\ell m'm}(\chi',\psi,\chi) = d_{\ell m'm}(\psi) \, e^{-i(m'\chi'+m\chi)} $$ (35.151)


$$ = \sqrt{\frac{(\ell-m)!\,(\ell+m)!}{(\ell-m')!\,(\ell+m')!}} \; P^{(m-m',\,m+m')}_{\ell-m}(\cos\psi) \, \cos^{m+m'}\!\left(\frac{\psi}{2}\right) \sin^{m-m'}\!\left(\frac{\psi}{2}\right) e^{-i(m'\chi'+m\chi)} . $$


The analysis of polarization in §35.10 involves resolving a rotation from $p'$ to $p$ into the product of a pair
of rotations with respect to a frame in which the z-axis lies along $k$. A rotation by angle $\psi$ in the $p'$-$p$ plane
is equivalent to a rotation by Euler angles $-\chi'$, $-\theta'$, $-\phi'$ from the $p'$ frame into the $k$ frame, followed by
a rotation by Euler angles $\phi$, $\theta$, $\chi$ from the $k$ frame into the $p$ frame. The various angles are illustrated in
Figure 35.2. The equivalence implies the addition theorem FIX: WRONG?


$$ d_{\ell m'm}(\psi) = \sum_{n=-\ell}^{\ell} D_{\ell nm}(\phi,\theta,\chi) \, D_{\ell m'n}(-\chi',-\theta',-\phi') = \sum_{n=-\ell}^{\ell} D_{\ell nm}(\phi,\theta,\chi) \, D^*_{\ell nm'}(\phi',\theta',\chi') , $$ (35.152)
in which
$$ \cos\psi = \cos\theta \cos\theta' + \cos(\phi - \phi') \sin\theta \sin\theta' . $$ (35.153)
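Equation (35.153) is the spherical law of cosines for the angle between two directions. The short sketch below compares it with a direct dot product of unit vectors. The function names are illustrative, and the check does not depend on the sign conventions of the Wigner matrices.

```python
import math

def unit_vector(theta, phi):
    """Unit vector with polar angle theta and azimuthal angle phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def cos_psi(theta, phi, theta_p, phi_p):
    """Right hand side of equation (35.153)."""
    return (math.cos(theta) * math.cos(theta_p)
            + math.cos(phi - phi_p) * math.sin(theta) * math.sin(theta_p))

def cos_psi_dot(theta, phi, theta_p, phi_p):
    """Same quantity computed as the dot product of the two unit vectors."""
    a = unit_vector(theta, phi)
    b = unit_vector(theta_p, phi_p)
    return sum(x * y for x, y in zip(a, b))
```

The two functions agree to rounding error for any pair of directions, which is a quick way to catch a misplaced prime or sign when transcribing (35.153).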


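Rotation by Euler angles $\phi$, $\theta$, $\chi$ followed by a rotation by Euler angles $\phi'$, $\theta'$, $\chi'$ is equivalent to a rotation
by Euler angles $\Phi$, $\Theta$, $X$,


$$ \sum_{n=-\ell}^{\ell} D_{\ell nm'}(\phi',\theta',\chi') \, D_{\ell mn}(\phi,\theta,\chi) = D_{\ell mm'}(\Phi,\Theta,X) , $$ (35.154)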


where. 


35.12.3 Spin-weighted spherical Bessel functions 


Spin-weighted spherical Bessel functions $j_{\ell nms}(y)$, with $\ell, n \geq \max(|m|, |s|)$, are defined by equation (36.8).
The defining equation (36.8), along with the orthogonality relations (35.143) of the Wigner matrices,
implies that


$$ j_{\ell nms}(y) = i^{\ell-n} \int_0^{\pi} e^{-iy\cos\theta} \, d_{\ell ms}(\theta) \, d_{nms}(\theta) \, \frac{\sin\theta \, d\theta}{2} . $$ (35.155)
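For $m = s = 0$ the polar matrices reduce to Legendre polynomials, equation (35.149), so the integral (35.155) can be evaluated by simple quadrature and compared with ordinary spherical Bessel functions. The sketch below assumes the form $j_{\ell nms}(y) = i^{\ell-n} \int e^{-iy\cos\theta} d_{\ell ms} d_{nms} \sin\theta \, d\theta / 2$ and uses Simpson's rule. The names `legendre` and `spin_bessel_scalar` are illustrative, not from the text.

```python
import cmath
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) by the standard three-term recursion."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def spin_bessel_scalar(l, n, y, steps=2000):
    """j_{l n 0 0}(y) from (35.155) with m = s = 0, using mu = cos(theta)."""
    h = 2.0 / steps  # Simpson's rule on mu in [-1, 1]; steps must be even
    total = 0.0 + 0.0j
    for i in range(steps + 1):
        mu = -1.0 + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * cmath.exp(-1j * y * mu) * legendre(l, mu) * legendre(n, mu)
    integral = total * h / 3
    return (1j) ** (l - n) * integral / 2

# Example: j_{l000}(y) should equal the ordinary spherical Bessel function j_l(y).
y = 2.5
j0 = spin_bessel_scalar(0, 0, y)   # compare with sin(y) / y
j1 = spin_bessel_scalar(1, 0, y)   # compare with sin(y) / y**2 - cos(y) / y
```

Swapping the first two indices, `spin_bessel_scalar(0, 1, y)`, returns minus `j1`, in line with the antisymmetry for odd $\ell - n$ stated in equation (35.156).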


Equation (35.155) implies that the spin spherical Bessel functions $j_{\ell nms}$ are symmetric or antisymmetric in
their first two indices $\ell n$ as their difference $\ell - n$ is even or odd,


$$ j_{\ell nms}(y) = (-)^{\ell-n} \, j_{n\ell ms}(y) . $$ (35.156)
Equation (35.155) also implies that $j_{\ell nms}$ are symmetric in their last two indices $ms$, because $d_{\ell ms}(\theta) =
d_{\ell sm}(-\theta)$, equation (35.126),

$$ j_{\ell nms} = j_{\ell nsm} . $$ (35.157)
Equation (35.130) implies that a parity flip transforms $d_{\ell ms}(\pi - \theta) = (-)^\ell d_{\ell m,-s}(\theta)$; since a parity flip also
flips the sign of $\cos\theta$, the net result is that complex conjugation of $j_{\ell nms}$ given by equation (35.155) flips the
sign of $m$ or $s$,


$$ j^*_{\ell nms} = j_{\ell n,-m,s} = j_{\ell n,m,-s} . $$ (35.158)


with p = +1 and q = 1 to both sides of the defining 
rence (35.144), 


Application of the operator (—i)20+2-9) Dy 
equation (36.8) for jenms implies, from the recu 


2 
12> 


Z ovis 


1 . . . 
2n+1 vie +1 + pm)(n +14 qs) jeri nth mtg, +g — PIV (n — pm)(n — qs) jee} n-1,m+g,s+8 
1 . . 4 
TAF? [ve +1+pm)(€+14 98) jenms — ipay (£+ 1 — pm) (£+ 1 — qs) je+1,nms| - (35.159) 


For m = n and p = 1, the recurrence (35.159) simplifies to 


n+l+qs. 
“m41 Jeph nth ntist+s 


~ E [ve F1+n)(E+1+qs) jenns — ig (E+ 1—n)(E+1— q8) je 1,nns] (35.160) 


From the recurrence (35.160) it can be shown by induction that jenm,+s(y) with m = n and integral ¢ > n > 
s>Ois 


2 (2n)! (C+n)'(€—s)! 1 a N 
Jenn ts(9) = TE (2y)” (2 i) [y°se(y)] - (35.161) 


The jenms(y) with m = n satisfy the recurrence 


K ns. 1 is Rens . 
Pe ins inns = (26+ 1) | E je amns 5 (35.162) 


tnt y T| ee p= 
with Kens defined by equation (35.146). Applying 0/0y to either side of the defining relation (36.8), and 
using the recurrence relation (35.145), implies the recurrence 
o ims . ; 
) | H | Jenms + Knms Jl,n—1,ms , (35.163) 
Oy ) 


n(n+1 
which yields jenms(y) in general. The recurrence (35.163) of jenms with respect to n, along with the symme- 
try (35.156) of jenms in £n, implies a similar recurrence of jenms with respect to £, 


Kn+1,ms Jen+1,ms — (2n 


linms + ms Jé—1,nms - 35.164 
Oy T e(l + 5| Je Ke Je-1, ( ) 


; o ims 
Ke+1,ms Je+1,nms = — (22 + 1) | 
Polarization of the Cosmic Microwave
Background 


36.1 Radiative transfer of the polarized CMB 


The Boltzmann, or radiative transfer, equation for unpolarized photons was given previously by equa- 
tion (34.1). For the polarized photon distribution, the radiative transfer equations are 


(Z - tu +) (O40 +B-W)=1-75, (36.1a) 
n 
L a= aias (36.1b) 
an RL T g = -T?29. $ 


The $I$ in the unpolarized radiative transfer equation (36.1a) is the ISW contribution, a sum of harmonics


I(n, k, p) = ý + & +Ê- Ww + OP hap = 3 ee a ™ Tom( (n, k)Dnmo(¢, 0) , (36.2) 
n=0 m=-n 
with 
Ig =U +O, (36.3a) 
ha = We, (36.3b) 
Into = /2has - (36.3c) 


The ${}_sS$ in equations (36.1) are Thomson-scattering source terms,


1 
S(n,k, p) =Y +p: W +p: vs +00+3 Si i)?+ (am + V6 E2m) Damo , (36.4a) 
m=—2 
S(n,k, P, X = 5 > XG ere a 2m T V6 Eom) Dome . (36.4b) 
The harmonic components ${}_sS_{nm}$ of ${}_sS$ defined by


Smkp) = YO YO (DH Sanm, B)Dame($,9, x) (36.5) 


n=|s| m=—-n 


are, generalizing equations (34.4), 


Soo = Ovo + Ww ; (36.6a 
Sio = Vþb ; (36.6b 
Si, 1 = Vb, + W. ; (36.6c 
1 
Som = z (O2m + V6 Emm) (-2<m<2), (36.6d 
3 
252m = EON + V6 Emm) (-2<m<2). (36.6e 


The solution of the radiative transfer equations (36.1) is, generalizing the unpolarized solution (34.6), 


no z 
O(m, k, P) + Y(n, k) + P- W (m, k) = p [e77 I (n, k) + g(n) 5(1, k)] e HO) dn , (36.7a) 
0 
no k 
29 (m0, k, p, X) 7 g(n) 25(n, k) e70% dn , (36.7b) 
0 


where $g(\eta)$ is the visibility function, equation (34.7).


36.2 Harmonics of the polarized CMB photon distribution 


The spherical harmonics of the solution (36.7) can be found, as previously, by expanding the exponential
$e^{-iy\cos\theta}$ in spherical Bessel functions, equation (34.10). Spin-weighted spherical Bessel functions $j_{\ell nms}(y)$ with
$\ell, n \geq \max(|m|, |s|)$ can be defined by a generalization of equation (34.11),

$$ (-i)^n \, D_{nms}(\phi,\theta,\chi) \, e^{-iy\cos\theta} = \sum_{\ell=\max(|m|,|s|)}^{\infty} (-i)^\ell (2\ell+1) \, D_{\ell ms}(\phi,\theta,\chi) \, j_{\ell nms}(y) . $$ (36.8)


The spin index is dropped for brevity from the spin 0 modified Bessel functions, $j_{\ell nm0}(y) = j_{\ell nm}(y)$.
Properties of the spin-weighted spherical Bessel functions are addressed in Appendix 35.12.3. The spin-weighted
spherical Bessel functions are symmetric or antisymmetric in their first two indices $\ell n$, equation (35.156), and
symmetric in their last two indices $ms$, equation (35.157), and flipping the sign of either $m$ or $s$ transforms
them to their complex conjugates, equation (35.158),


$$ j_{\ell nms} = (-)^{\ell-n} j_{n\ell ms} , \quad j_{\ell nms} = j_{\ell nsm} , \quad j^*_{\ell nms} = j_{\ell n,-m,s} = j_{\ell n,m,-s} . $$ (36.9)


In particular, the spin zero functions $j_{\ell nm}$ are real, and all scalar ($m = 0$) components $j_{\ell n0s}$ are real. The
real (electric) and imaginary (magnetic) parts of $j_{\ell nms}$ are conveniently denoted by the real functions ${}_s\epsilon_{\ell nm}$
and ${}_s\beta_{\ell nm}$ defined by


$$ j_{\ell nm,\pm s} = {}_s\epsilon_{\ell nm} \pm i \, {}_s\beta_{\ell nm} . $$ (36.10)


The spin zero magnetic part vanishes, ${}_0\beta_{\ell nm} = 0$. The only other spin relevant is $s = \pm 2$, so the spin index
is dropped for brevity on the spin two electric and magnetic components,


$$ j_{\ell nm,\pm 2} = \epsilon_{\ell nm} \pm i \, \beta_{\ell nm} . $$ (36.11)


Under $m \to -m$, the electric components are unchanged, while the magnetic components change sign,


$$ j_{\ell n,-m} = j_{\ell nm} , \quad \epsilon_{\ell n,-m} = \epsilon_{\ell nm} , \quad \beta_{\ell n,-m} = -\beta_{\ell nm} . $$ (36.12)


In all, the spin spherical Bessel functions of relevance are, from equation (35.161) for jenms with n = m, 
and the recurrence (35.163) for n > m, 


i . F dje F 1 d2 , 
Jeoo = Je , Jeo = dy ; Jezo = 3 (1 + saa) Je, (36.13a 
; LL +1) j . 3L(L++ 1) dy 
fio (+1) je , NR (€+1) dGe/y) . (36.13b 
2 y 2 dy 
è 3(£ + 2)! Je 
= = .1 
Je22 BEDI y2 (36.13c 
[3(€+2)! je (€—1)(€+2) 1 dly je) 1 (e 5. 
hl ee ee oo = == |= 1l 36.13d 
€220 BU) y2’ €221 7 y dy ’ €422 dy? \ dy? (yje), 
€—1)(€4+2) j 1 d(y?j 
Be =0, Bei = ( i ) Je Be2 = -53 (y"Je) (36.13e 
2 y 2y? dy 


Expanding the solution (36.7) in spherical harmonics using equation (36.8) yields the harmonics of the
CMB photon distribution today including polarization, generalizing equation (34.17),


O(no, k) + ĉo Y(n, k) = ie ee | H(no. k) + (n0, k)| jeo [k(n — no)] 


2 


+g(n) 2 Sno (n, k) jeno [k(n — no)] dn , (36.14a) 
Əri (00s) + bön Wa lno: k) = f° =W lok) jen [k0 = mo) 
+ g(n) 3 Sn, +1(0, k) jeni [k(n — no)] dn , (36.14b) 
Ənsar, k) = f e573 has Cok) ena [0 = mo) 
+ 9(1) S2,+2(1, k) je22 [k(n — no)] dn , (36.14¢) 
Eim (ne) = | aSk) cram [Rl mo] dn (25m2), (36.14) 
Bim(amsk) = f° a(n) 28am k) Beam (lnm) dy (25m2). (86.146) 


The Thomson-scattering source terms ${}_sS_{nm}$ are given by equations (36.6), and the spin spherical Bessel
functions by equations (36.13a).


Exercise 36.1. Neutrino harmonics including vectors and tensors. Equation (34.46) gave the solution 
to the radiative transfer equation for scalar (m = 0) fluctuations of (massless) neutrinos. Generalize this to 
include vector (m = +1) and tensor (m = +2) neutrino fluctuations. 

Solution. The solution is similar to that (36.14a)—(36.14c) for unpolarized photons, but without the Thom- 
son scattering terms: 


Neln, k) + 520 (n, k) = | "Dia, k) + (ay, Bde (Ca! — n)] a 


+ [No(0,k) + U(0,k)] je(—kn) , (36.15a) 
Ne =+1(1, k) + 381 W= (1n, k) = [ws (n, k) jer [k(n — )] adn! 
+ We (0, &)je11(—kn) 5 (36.15b) 


N, 


è 


scale) = fy} has) je O= n) de (36.150) 
As in equation (34.46), the time $\eta_\nu$ of neutrino decoupling has been replaced by zero, and the optical depth
factor omitted, since the neutrino decoupling scale is so much smaller than cosmological scales.


36.2.1 Harmonics of the polarized CMB with respect to observed photon directions 


As with the unpolarized power spectrum, §34.2.1, the observed direction $\hat{n}$ of a photon from the CMB
is opposite to the photon's direction of motion, $\hat{n} = -\hat{p}$. Moreover the right-handed direction around $\hat{p}$
becomes a left-handed direction about $\hat{n}$, so the spin angle $\chi$ also flips in sign. Thus harmonics with respect
to the observed direction $\hat{n}$ are related to those relative to the photon direction $\hat{p}$ by ${}_s\Theta^{\rm obs}(\eta, k, \hat{p}, \chi) =
{}_s\Theta(\eta, k, -\hat{p}, -\chi)$. The reversal of $\hat{p}$ and $\chi$ is equivalent to a parity flip, which changes the spin $s$ temperature
multipoles by ${}_s\Theta^{\rm obs}_{\ell m}(\eta_0, k) = (-)^{\ell+s} \, {}_{-s}\Theta_{\ell m}(\eta_0, k)$. Equivalently, generalizing equation (34.18),


$$ \Theta^{\rm obs}_{\ell m}(\eta_0,k) = (-)^\ell \, \Theta_{\ell m}(\eta_0,k) , \quad E^{\rm obs}_{\ell m}(\eta_0,k) = (-)^\ell \, E_{\ell m}(\eta_0,k) , \quad B^{\rm obs}_{\ell m}(\eta_0,k) = (-)^{\ell+1} B_{\ell m}(\eta_0,k) . $$ (36.16)
Equivalently, as in the unpolarized equation (34.17), multipoles with respect to the observed direction $\hat{n}$ to
the CMB are obtained from equations (36.14) by flipping the sign of the arguments of the spin spherical
Bessel functions $j_{\ell nms}$ and simultaneously flipping the sign of source terms ${}_sS_{nm}$ with odd $n$, namely $S_{1m}$,


$$ k(\eta - \eta_0) \to k(\eta_0 - \eta) , \quad S_{1m} \to -S_{1m} . $$ (36.17)


The sign flips do not affect power spectra, which involve products of fluctuations with the same £ and parity. 


36.3 Harmonics of the polarized CMB in real space 


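The real-space polarized temperature fluctuation ${}_s\Theta(\eta, \boldsymbol{x}, \hat{n}, \chi)$ at time $\eta$ and comoving position $\boldsymbol{x}$ in observed
direction $\hat{n}$ on the sky is related to the Fourier-space polarized temperature fluctuation ${}_s\Theta(\eta, \boldsymbol{k}, \hat{n}, \chi)$ by,
generalizing the unpolarized expression (34.28),


$$ {}_s\Theta(\eta, \boldsymbol{x}, \hat{n}, \chi) = \int e^{i\boldsymbol{k}\cdot\boldsymbol{x}} \, {}_s\Theta(\eta, \boldsymbol{k}, \hat{n}, \chi) \, \frac{d^3k}{(2\pi)^3} . $$ (36.18)
Astronomers observe the temperature fluctuation ${}_s\Theta(\eta_0, \boldsymbol{x}_0, \hat{n}, \chi)$ now, at time $\eta_0$, and here, at position $\boldsymbol{x}_0$.
Without loss of generality, our position can be taken to be at the origin, $\boldsymbol{x}_0 = 0$, in which case the phase
factor is unity, $e^{i\boldsymbol{k}\cdot\boldsymbol{x}_0} = 1$, and can be omitted,


$$ {}_s\Theta(\eta_0, \boldsymbol{x}_0, \hat{n}, \chi) = \int {}_s\Theta(\eta_0, \boldsymbol{k}, \hat{n}, \chi) \, \frac{d^3k}{(2\pi)^3} , $$ (36.19)


which generalizes the unpolarized expression (34.29).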
The spherical harmonic expansion of the observed real-space temperature fluctuation today is, with
conventional normalization of harmonics ${}_s\Theta_{\ell m}$,


ee) 4 
sO(no, Lo, Ê, X) = 5 5 sOem(No, Lo) —sV om (A,X) 3 (36.20) 


L=|s| m=—£ 


which generalizes equation (34.30). The reason for the expansion with respect to -sY as opposed to sYrm 
is that, as already remarked after equation (35.37), the coefficient ,O,2,, then has spin weight s and m as 
opposed to s and —m. The spherical harmonic expansion (35.37) of the Fourier-space temperature fluctuation 
may be written 


oo min(£,2) l 
` ` 


sO (10, k, À, x) = 5 (—i) tms V An (2¢ F 1) sem (n, k) -sDinmlê, k) -sY in (nh, x) ’ 


£=|s| m=— min(€,2) n=—£ 


(36.21) 
where ¿Demm (A,A) is the matrix that rotates spin harmonics Yem, defined by, analogously to the defini- 
tion (35.120) of Wigner rotation matrices Demm (®', ®), 


600 sDemm (A, À) = / sYöm (A) sYem (À) do . (36.22) 


It is not necessary to know an explicit form for the spin rotation matrices ,Dgmm, because observable power 
spectra are rotation invariant, and do not depend on the form of ,Demm. Whereas the original harmonics 
sQem(k) are with respect to a frame in which the z-axis is along the wavevector k, the rotated harmonics 
Ùm sOemlk) eer seen k) in equation (36.21) are with respect to a frame in which the z-axis is along a 
direction £ fixed in space. 

From equations (36.19)—(36.21) it follows that the real-space harmonics are 


min(£,2) 3 
ges Stee OE 
‘Onl £0) = VnF So (9t / em (0:8) -Dinn (2:8) Bo - (36.23) 


m=— min(£,2) 


The factors of $\sqrt{4\pi(2\ell+1)} \, (-i)^{\ell+m-s}$ arise because of the different choices of normalization of the
harmonics (as is the standard cosmological convention) in the harmonic expansions (35.37) and (36.20) of the
temperature fluctuation in Fourier and real space.

Rotating the Fourier-space harmonics ${}_s\Theta_{\ell m}(\eta, k)$ from the $\hat{k}$ frame into the $\hat{z}$ frame leaves their parity
unchanged, so the real-space harmonics inherit their parity from Fourier space. Resolved into parity eigenstates,
the real-space harmonics (36.23) are


min(£,2) 
~ ere s pe ee Ëk 
Oml z0) = VAI Y (4 / em (1: k) Dinm 28) Tyas» (36.24a 
pe 2) 
bs @k 
Even( (no, £o) = vy dn (24 + 1) Da ea 2 f Eem (n, k) -s Dnm (ê, k) (27)3 , (36.24b 
m=—2 
x, dk 
Ben(no, £o) = /4m(20 + 1) Pe i) tme- a Bon n, k) -sDýnm(2, k) nF (36.24c 
m=—2 


36.4 Polarized CMB power spectra 


36.4.1 Polarized CMB power spectra in Fourier space 


Power spectra $C_\ell^{X'X}(\eta, k)$ with $X'$ and $X$ running over any of $\Theta$, $E$, and $B$ are defined by, analogously to
the power spectrum $C_\ell(\eta, k)$ of unpolarized temperature multipoles, equation (34.26),


min(£,2) 


TEE an)?p(k' + k) CÈ *(mk)= $) (Xbin (08) Xim(n,k)) . (36.25) 


m=— min(£,2) 


The reality conditions (35.46) imply that the power spectra are real-valued, and symmetric in $X'X$, $C_\ell^{X'X} =
C_\ell^{XX'}$. Strictly, on the right hand side of equation (36.25) the unpolarized monopole $\Theta_{00}$ should be replaced
by the redshifted monopole $\Theta_{00} + \Psi$, and the unpolarized dipole $\Theta_{1,\pm1}$ should be replaced by the
Doppler-shifted dipole $\Theta_{1,\pm1} + iW_\pm$, in accordance with equations (36.14), but these refinements are omitted here
to avoid cluttering the equation.

Polarized CMB transfer functions $T^X_{\ell m}(\eta, k)$ for any of $X = \Theta$, $E$, or $B$ are defined by, generalizing
equation (34.20) (the contributions $\Psi$ and $iW_\pm$ to the unpolarized monopole and dipole are again omitted
for brevity),


$$ T^X_{\ell m}(\eta, k) \equiv \frac{X_{\ell m}(\eta, k)}{\zeta(k)} , $$ (36.26)
where $\zeta(k)$ is the primordial curvature fluctuation. In terms of the transfer functions (36.26) and the
primordial curvature power spectrum $P_\zeta$, equation (30.132), the power spectrum $C_\ell^{X'X}(\eta, k)$ is


$$ C_\ell^{X'X}(\eta, k) = 4\pi \sum_{m=-2}^{2} T^{X'}_{\ell m}(\eta, k) \, T^{X}_{\ell m}(\eta, k) \, P_\zeta(k) . $$ (36.27)
36.4.2 Conditions on polarized CMB power spectra from parity symmetry


The Universe at large is consistent with being statistically homogeneous and isotropic, and it is reasonable to
expect that the statistical properties would similarly be parity symmetric, unchanged under spatial inversion.
The prediction of parity symmetry is, like homogeneity and isotropy, testable observationally. The
temperature and electric fluctuations $\Theta_{\ell m}$ and $E_{\ell m}$ have the same $(-)^\ell$ parity under spatial inversion, while the
magnetic fluctuation $B_{\ell m}$ has the opposite $(-)^{\ell+1}$ parity. The assumption of parity symmetry then implies
that cross power spectra between fluctuations of opposite parity should vanish, $C_\ell^{\Theta B} = C_\ell^{EB} = 0$, since these
power spectra change sign under parity inversion. Parity symmetry predicts that the non-vanishing power
spectra are


$$ C_\ell^{\Theta\Theta} , \quad C_\ell^{\Theta E} , \quad C_\ell^{EE} , \quad C_\ell^{BB} . $$ (36.28)


36.4.3 Polarized CMB power spectra in real space 


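CMB power spectra $C_\ell^{X'X}(\eta_0)$ on the sky today with $X'$ and $X$ any of $\Theta$, $E$, and $B$ are defined such that,
generalizing equation (34.33),


$$ \delta_{\ell'\ell} \, \delta_{m'm} \, C_\ell^{X'X}(\eta_0) = \left\langle X'^*_{\ell'm'}(\eta_0, \boldsymbol{x}_0) \, X_{\ell m}(\eta_0, \boldsymbol{x}_0) \right\rangle . $$ (36.29)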


Once again, the redshift contribution $\Psi$ to the unpolarized monopole $\Theta_{00}$, and the Doppler-shift contribution
$W_m$ to the dipole $\Theta_{1m}$ on the right hand side of equation (36.29) have been omitted for brevity. The
monopole and dipole are indistinguishable from a rescaling of the mean temperature and from a change in
the motion of the observer, so cannot be measured by an observer confined to position $\boldsymbol{x}_0$.

From the expressions (36.24) for the real-space harmonics in terms of Fourier-space harmonics, together
with the power spectra (36.25) of the Fourier-space harmonics, it follows that the power spectra $C_\ell^{X'X}(\eta_0)$
of real-space harmonics of the CMB today are, generalizing equation (34.34),


$$ C_\ell^{X'X}(\eta_0) = \int C_\ell^{X'X}(\eta_0, k) \, \frac{4\pi k^2 dk}{(2\pi)^3} . $$ (36.30)
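The integral (36.30) is a one-dimensional quadrature over wavenumber with measure $4\pi k^2 dk/(2\pi)^3$. The sketch below evaluates it with Simpson's rule for a purely hypothetical Fourier-space spectrum. The Gaussian `toy_spectrum` is a made-up test function chosen because its integral is known in closed form; it is not a physical CMB spectrum, and the function names are illustrative.

```python
import math

def real_space_power(c_of_k, k_max, steps=4000):
    """Simpson's rule for the integral of C(k) 4 pi k^2 dk / (2 pi)^3 over [0, k_max]."""
    h = k_max / steps  # steps must be even
    total = 0.0
    for i in range(steps + 1):
        k = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * c_of_k(k) * 4 * math.pi * k * k / (2 * math.pi) ** 3
    return total * h / 3

def toy_spectrum(k):
    """Hypothetical spectrum exp(-k^2), used only to test the quadrature."""
    return math.exp(-k * k)

# The integral of exp(-k^2) k^2 over [0, infinity) is sqrt(pi)/4,
# so the exact answer is sqrt(pi) / (8 pi^2).
approx = real_space_power(toy_spectrum, k_max=10.0)
exact = math.sqrt(math.pi) / (8 * math.pi ** 2)
```

In a real calculation the spectrum would come from the transfer functions of equation (36.27) and is oscillatory in $k$, so the grid spacing would have to resolve those oscillations. A uniform grid is used here only for simplicity.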


The CMB power spectra $C_\ell^{X'X}(\eta_0)$ inherit from $C_\ell^{X'X}(\eta_0, k)$ the properties of being real-valued and
symmetric in $X'X$.

In terms of the polarized CMB transfer functions $T^X_{\ell m}$ defined by equation (36.26) and the primordial
curvature power spectrum $P_\zeta$, the power spectra $C_\ell^{X'X}(\eta_0)$ are, from equation (36.27),


$$ C_\ell^{X'X}(\eta_0) = 4\pi \int \sum_{m=-\min(\ell,2)}^{\min(\ell,2)} T^{X'}_{\ell m}(\eta_0, k) \, T^{X}_{\ell m}(\eta_0, k) \, P_\zeta(k) \, \frac{4\pi k^2 dk}{(2\pi)^3} . $$ (36.31)


Concept question 36.2. Scalar, vector, tensor power spectra? Can power spectra of scalar, vector,
and tensor modes be distinguished observationally? Answer. No, with an exception. Scalar, vector, and
tensor modes are characterized by their transformation properties under rotation about the wavevector $\boldsymbol{k}$ of
the perturbation. An observed temperature fluctuation in real space is a superposition of fluctuations with
many wavevectors $\boldsymbol{k}$, and thereby becomes a mixture of scalar, vector, and tensor modes. The exception is that
the scalar magnetic fluctuation $B_{\ell 0}$ vanishes identically, so the magnetic power spectrum $C_\ell^{BB}$ measures only
vector and tensor modes. Mathematically, the real-space harmonics (36.24) of the temperature fluctuation
are sums over scalar, vector, and tensor modes, $|m| = 0, 1, 2$. In Fourier space, power spectra $C_{\ell m}(\eta_0, k)$ with
definite $m$ can be defined by equation (36.25) without summing over $m$. But in real space, the CMB power
spectrum $C_\ell(\eta_0)$, equation (36.31), is a sum over the scalar, vector, and tensor Fourier-space power spectra,


$$ C_\ell^{X'X}(\eta_0) = \sum_{m=-\min(\ell,2)}^{\min(\ell,2)} \int C_{\ell m}^{X'X}(\eta_0, k) \, \frac{4\pi k^2 dk}{(2\pi)^3} . $$ (36.32)


Exercise 36.3. CMB polarized power spectrum. Generalize the CMB code you wrote in Exercise 34.1 
to include polarization. 
Gravitational lensing of the Cosmic
Microwave Background 


Galaxies along the line of sight slightly perturb the trajectories of photons emitted at the surface of last
scattering (Zaldarriaga and Seljak, 1998, and references therein). The qualitative effect of this gravitational
lensing is to blur CMB fluctuations at small scales. Gravitational lensing has been
neglected in this book up to now on the grounds that its magnitude is proportional to a product

\[
\frac{dp}{d\lambda} \frac{\partial f}{\partial p} \tag{37.1}
\]

of terms that are each of linear order in the photon Boltzmann equation (33.8), so that the product is of second order
of smallness. The reason the gravitational lensing effect is important despite being of second order is that it
feeds B-mode polarization from E-mode polarization. At small angular scales, gravitational lensing proves
to dominate the primordial B-mode signal expected from gravitational waves generated at inflation. Fortunately
the lensing effect is small at large angular scales, leaving a window where a signal from primordial
gravitational waves might be seen in the future. An upside of gravitational lensing is that, because it depends
on the clustering of matter well after recombination, it resolves degeneracies in cosmological parameters that
would be inferred from the unlensed CMB power spectrum at the surface of last scattering.

The product of terms that was neglected in the photon Boltzmann equation (33.8) is $dp/d\lambda \cdot \partial f/\partial p$, and
it must now be restored.
SPINORS


38 


The super geometric algebra 


The super geometric algebra generalizes the geometric algebra to include spinors, which are spin-$\tfrac{1}{2}$
objects.

For simplicity, this Chapter focuses on the super geometric algebra in 3 spatial dimensions. The gener- 
alization to arbitrarily many spatial dimensions is given as Exercise 38.3 at the end of the Chapter. The 
generalization of the super geometric algebra to Minkowski space, with a time dimension in addition to spa- 
tial dimensions, is presented in Chapter 39. The generalization to arbitrarily many space and time dimensions 
is given as Exercise 39.5. 


38.1 Spin basis vectors in 3D 


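A systematic way to project tensors into spin components is to work in a spin basis. Start with an orthonormal
triad $\{\boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \boldsymbol{\gamma}_3\}$ (or $\{\boldsymbol{\gamma}_x, \boldsymbol{\gamma}_y, \boldsymbol{\gamma}_z\}$ if you prefer). Choose a pair of basis vectors, in three dimensions
conventionally taken to be the pair $\{\boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2\}$, and from them form the spin basis vectors $\{\boldsymbol{\gamma}_+, \boldsymbol{\gamma}_-\}$, the
complex combinations

\[
\boldsymbol{\gamma}_+ \equiv \frac{1}{\sqrt{2}} (\boldsymbol{\gamma}_1 + i \boldsymbol{\gamma}_2) , \tag{38.1a}
\]
\[
\boldsymbol{\gamma}_- \equiv \frac{1}{\sqrt{2}} (\boldsymbol{\gamma}_1 - i \boldsymbol{\gamma}_2) . \tag{38.1b}
\]

This is the same trick used to define the spin components $L_\pm$ of the angular momentum operator $\boldsymbol{L}$ in
quantum mechanics. The metric of the spin triad $\{\boldsymbol{\gamma}_+, \boldsymbol{\gamma}_-, \boldsymbol{\gamma}_3\}$ is

\[
\gamma_{ab} \equiv \boldsymbol{\gamma}_a \cdot \boldsymbol{\gamma}_b =
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{38.2}
\]

Notice that the spin basis vectors $\{\boldsymbol{\gamma}_+, \boldsymbol{\gamma}_-\}$ are themselves null, $\boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_+ = \boldsymbol{\gamma}_- \cdot \boldsymbol{\gamma}_- = 0$, whereas their scalar
product with each other is non-zero, $\boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_- = 1$.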
38.2 Spin weight


An object is defined to have spin weight $s$ if it varies by

\[
e^{-i s \theta} \tag{38.3}
\]

under a right-handed rotation by angle $\theta$ in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane. In 3D, a right-handed rotation in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$
plane is the same as a right-handed rotation about the 3-axis, and the spin weight is the projection of the
spin along the 3-axis, the spin analogue of the projection $L_3$ (or $L_z$) of the angular momentum along the
3-axis (or z-axis). Sometimes the term spin weight is abbreviated to spin, when there is no ambiguity. An
object of spin weight $s$ is unchanged by a rotation of $2\pi/s$ in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane. An object of spin weight 0 is
rotationally symmetric, unchanged by a rotation by any angle in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane.

Under a right-handed rotation by angle $\theta$ in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane, the basis vectors $\boldsymbol{\gamma}_a$ transform as (13.51)

\[
\begin{aligned}
\boldsymbol{\gamma}_1 &\to \cos\theta \, \boldsymbol{\gamma}_1 + \sin\theta \, \boldsymbol{\gamma}_2 , \\
\boldsymbol{\gamma}_2 &\to - \sin\theta \, \boldsymbol{\gamma}_1 + \cos\theta \, \boldsymbol{\gamma}_2 , \\
\boldsymbol{\gamma}_3 &\to \boldsymbol{\gamma}_3 .
\end{aligned} \tag{38.4}
\]

It follows that the spin basis vectors $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ transform under a right-handed rotation by angle $\theta$ in the
$\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane as

\[
\boldsymbol{\gamma}_\pm \to e^{\mp i \theta} \boldsymbol{\gamma}_\pm . \tag{38.5}
\]

The transformation (38.5) identifies the spin vectors $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ as having spin weight $+1$ and $-1$ respectively.
The $\boldsymbol{\gamma}_3$ vector has spin weight 0, since it is unchanged by a rotation in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane.

The components of a tensor in a spin basis inherit their spin properties from that of the spin basis. The
general rule is that the spin weight $s$ of any tensor component is equal to the number of $+$ covariant indices
minus the number of $-$ covariant indices:

\[
\text{spin weight } s = \text{number of } + \text{ minus } - \text{ covariant indices} . \tag{38.6}
\]

The spin properties of the components of a tensor are thus manifest when expressed in a spin basis.
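The step from the rotation of the orthonormal vectors to the phase rotation of the spin vectors can be checked numerically. The sketch below assumes the rotation convention $\boldsymbol{\gamma}_1 \to \cos\theta\,\boldsymbol{\gamma}_1 + \sin\theta\,\boldsymbol{\gamma}_2$, $\boldsymbol{\gamma}_2 \to -\sin\theta\,\boldsymbol{\gamma}_1 + \cos\theta\,\boldsymbol{\gamma}_2$, and represents the triad by unit column vectors; the names `spin_pair`, `g_plus`, `r_plus` and so on are illustrative choices, not notation from the text.

```python
import numpy as np

theta = 0.7  # arbitrary rotation angle in radians
c, s = np.cos(theta), np.sin(theta)

# Orthonormal triad written as unit vectors (an illustrative representation).
g1, g2, g3 = np.eye(3, dtype=complex)

# Triad after a right-handed rotation by theta in the 1-2 plane (assumed convention).
r1 = c * g1 + s * g2
r2 = -s * g1 + c * g2
r3 = g3

def spin_pair(a, b):
    """Spin basis vectors (a + i b)/sqrt(2) and (a - i b)/sqrt(2) from an orthonormal pair."""
    return (a + 1j * b) / np.sqrt(2), (a - 1j * b) / np.sqrt(2)

g_plus, g_minus = spin_pair(g1, g2)
r_plus, r_minus = spin_pair(r1, r2)

# Spin weights +1 and -1: the spin vectors pick up phases exp(-i theta) and exp(+i theta).
plus_ok = np.allclose(r_plus, np.exp(-1j * theta) * g_plus)
minus_ok = np.allclose(r_minus, np.exp(+1j * theta) * g_minus)

# Rank-2 products: ++ has spin weight 2, +- has spin weight 0.
pp_ok = np.allclose(np.outer(r_plus, r_plus), np.exp(-2j * theta) * np.outer(g_plus, g_plus))
pm_ok = np.allclose(np.outer(r_plus, r_minus), np.outer(g_plus, g_minus))
print(plus_ok, minus_ok, pp_ok, pm_ok)
```

The last two checks illustrate the counting rule: each $+$ index contributes a factor $e^{-i\theta}$ and each $-$ index a factor $e^{+i\theta}$.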


38.3 Pauli representation of spin basis vectors 


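In the Pauli representation (13.112), the spin basis vectors $\boldsymbol{\gamma}_\pm$ are represented by the real 2 × 2 Pauli matrices

\[
\boldsymbol{\gamma}_+ = \sigma_+ \equiv \frac{1}{\sqrt{2}} (\sigma_1 + i \sigma_2) = \sqrt{2} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \quad
\boldsymbol{\gamma}_- = \sigma_- \equiv \frac{1}{\sqrt{2}} (\sigma_1 - i \sigma_2) = \sqrt{2} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} . \tag{38.7}
\]

The basis vector $\boldsymbol{\gamma}_3$ is represented as usual by the real Pauli matrix $\sigma_3$,

\[
\boldsymbol{\gamma}_3 = \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{38.8}
\]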
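A short numerical check of the Pauli representation of the spin triad, and of its metric, might look as follows. It is a sketch only: the standard Pauli matrices stand in for the orthonormal triad, and the helper `dot`, which extracts the scalar part of the symmetrized matrix product, is an illustrative name.

```python
import numpy as np

# Standard Pauli matrices, standing in for the orthonormal triad.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# Spin basis vectors, complex combinations of the first two Pauli matrices.
g_plus = (s1 + 1j * s2) / np.sqrt(2)
g_minus = (s1 - 1j * s2) / np.sqrt(2)

def dot(a, b):
    """Scalar product a.b of two vectors: (ab + ba)/2 equals a.b times the unit matrix."""
    return np.trace(a @ b + b @ a) / 4

spin_triad = [g_plus, g_minus, s3]
metric = np.array([[dot(a, b) for b in spin_triad] for a in spin_triad])

print(g_plus.real)   # sqrt(2) times [[0, 1], [0, 0]]
print(g_minus.real)  # sqrt(2) times [[0, 0], [1, 0]]
print(metric.real)   # off-diagonal in the +/- block, 1 for the 3-3 component
```

The matrices of the spin vectors come out real, and the metric has the null diagonal entries for $\boldsymbol{\gamma}_\pm$ noted above.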
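38.4 Basis spinors


Introduce a dyad of basis spinors $\boldsymbol{\epsilon}_a$ with the index $a$ running over spin up $\uparrow$ and spin down $\downarrow$ (the braces in
equation (38.9) signify a set of spinors, not anticommutation),

\[
\boldsymbol{\epsilon}_a \equiv \{ \boldsymbol{\epsilon}_\uparrow , \boldsymbol{\epsilon}_\downarrow \} . \tag{38.9}
\]

The basis spinors $\boldsymbol{\epsilon}_\uparrow$ and $\boldsymbol{\epsilon}_\downarrow$ physically signify spin up and spin down eigenstates. A more conventional (Dirac)
notation is

\[
\boldsymbol{\epsilon}_\uparrow = |{\uparrow}\rangle , \quad \boldsymbol{\epsilon}_\downarrow = |{\downarrow}\rangle . \tag{38.10}
\]

It will be seen in §38.11 that the basis spinors $\boldsymbol{\epsilon}_a$ are related to the 3D basis vectors $\boldsymbol{\gamma}_a$ through a super
geometric algebra that is essentially the square root of the geometric algebra. Elements of the geometric
algebra act by pre-multiplication on the basis spinors $\boldsymbol{\epsilon}_a$. Under a rotation by rotor $R$, the basis spinors $\boldsymbol{\epsilon}_a$
are defined to transform in the same way as rotors,

\[
R: \ \boldsymbol{\epsilon}_a \to R \boldsymbol{\epsilon}_a . \tag{38.11}
\]

In the Pauli representation (13.112) the basis spinors $\boldsymbol{\epsilon}_a$ are the column vectors

\[
\boldsymbol{\epsilon}_\uparrow = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \quad
\boldsymbol{\epsilon}_\downarrow = \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \tag{38.12}
\]

that are rotated by pre-multiplying by elements of the special unitary group SU(2). Rotations transform the
basis spinors $\boldsymbol{\epsilon}_a$ into linear combinations of each other.

The rotor $R$ corresponding to a right-handed rotation by angle $\theta$ in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane is $e^{-\imath_3 \theta/2}$,
equation (13.106). In the Pauli representation (38.9), the action of $\imath_3 = I_3 \sigma_3$ on the basis spinors is $\imath_3 \boldsymbol{\epsilon}_\uparrow = i \boldsymbol{\epsilon}_\uparrow$
and $\imath_3 \boldsymbol{\epsilon}_\downarrow = -i \boldsymbol{\epsilon}_\downarrow$. Under a right-handed rotation by angle $\theta$ in the $\boldsymbol{\gamma}_1$–$\boldsymbol{\gamma}_2$ plane, the basis spinors $\boldsymbol{\epsilon}_a$ therefore
transform as

\[
\boldsymbol{\epsilon}_\uparrow \to e^{-i\theta/2} \boldsymbol{\epsilon}_\uparrow , \quad
\boldsymbol{\epsilon}_\downarrow \to e^{i\theta/2} \boldsymbol{\epsilon}_\downarrow . \tag{38.13}
\]

The behaviour (38.13), along with the definition (38.3) of spin, shows that the basis spinors $\boldsymbol{\epsilon}_\uparrow$ and $\boldsymbol{\epsilon}_\downarrow$ have
respective spin weights $+\tfrac{1}{2}$ and $-\tfrac{1}{2}$. A rotation by $\theta = 2\pi$ changes the sign of the basis spinors $\boldsymbol{\epsilon}_a$. A rotation
by $4\pi$ is required to rotate the basis spinors back to their original values.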
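The half-angle behaviour of the basis spinors is easy to confirm numerically in the Pauli representation. The sketch below assumes the closed form $\cos(\theta/2) - i \sigma_3 \sin(\theta/2)$ for the rotor $e^{-i\sigma_3\theta/2}$, which follows from $\sigma_3^2 = 1$; the function name `rotor_12` is hypothetical.

```python
import numpy as np

s3 = np.array([[1, 0], [0, -1]], dtype=complex)
unit = np.eye(2, dtype=complex)

def rotor_12(theta):
    """Rotor exp(-i sigma_3 theta/2) for a right-handed rotation by theta in the 1-2 plane."""
    return np.cos(theta / 2) * unit - 1j * np.sin(theta / 2) * s3

eps_up = np.array([1, 0], dtype=complex)
eps_down = np.array([0, 1], dtype=complex)

theta = 0.7
R = rotor_12(theta)

# Spin weights +1/2 and -1/2: phases exp(-i theta/2) and exp(+i theta/2).
up_ok = np.allclose(R @ eps_up, np.exp(-0.5j * theta) * eps_up)
down_ok = np.allclose(R @ eps_down, np.exp(+0.5j * theta) * eps_down)

# A rotation by 2 pi flips the sign of a spinor; a rotation by 4 pi restores it.
flip_ok = np.allclose(rotor_12(2 * np.pi) @ eps_up, -eps_up)
back_ok = np.allclose(rotor_12(4 * np.pi) @ eps_up, eps_up)
print(up_ok, down_ok, flip_ok, back_ok)
```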

Spinor tensors inherit their spin properties from those of the basis spinors. The rule (38.6) generalizes to
the statement that the spin weight of a spinor tensor is

\[
\text{spin weight } s = \tfrac{1}{2} \left( \text{number of } \uparrow \text{ minus } \downarrow \text{ covariant indices} \right) . \tag{38.14}
\]

In any equality between vector and spinor tensors, the spin weights of the left and right hand sides must be
equal. The rule (38.14) holds not only for column spinors $\boldsymbol{\epsilon}_a$, but also for row spinors $\boldsymbol{\epsilon}_a \cdot$, §38.7, and for inner
and outer products of spinors, §§38.8 and 38.10.
38.5 Pauli spinor


A Pauli spinor $\varphi$ is a complex (with respect to $i$) linear combination of the basis spinors $\boldsymbol{\epsilon}_a$,

\[
\varphi = \varphi^a \boldsymbol{\epsilon}_a . \tag{38.15}
\]

Just as a multivector $a^a \boldsymbol{\gamma}_a$ is a vector in the geometric algebra, so also $\varphi^a \boldsymbol{\epsilon}_a$ is a spinor in the super geometric
algebra.

By construction, a Pauli spinor transforms under a spatial rotation by rotor $R$ like the basis spinors,
equation (38.11),

\[
R: \ \varphi \to R \varphi . \tag{38.16}
\]

A Pauli spinor $\varphi$ is a spin-$\tfrac{1}{2}$ object, in the sense that a rotation by $2\pi$ changes the sign of the spinor, and a
rotation by $4\pi$ is required to return the spinor to its original value.


38.6 Spinor metric 


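In a matrix representation, the tensor product of basis spinors $\boldsymbol{\epsilon}_a$ and $\boldsymbol{\epsilon}_b$ can be represented as the 2 × 2
matrix $\boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b^\top$, a matrix product of the column spinor $\boldsymbol{\epsilon}_a$ with the row spinor $\boldsymbol{\epsilon}_b^\top$. In accordance with the
transformation rule (38.11), the tensor product of basis spinors rotates as

\[
R: \ \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b^\top \to R \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b^\top R^\top . \tag{38.17}
\]

Consider the spinor tensor $\varepsilon$ with the defining property that for any rotor $R$

\[
\varepsilon R^\top = \overline{R} \varepsilon , \tag{38.18}
\]

where $\overline{R}$ is the reverse of $R$. The condition (38.18) implies that the spinor tensor $\varepsilon$ is invariant under rotations,

\[
R: \ \varepsilon \to R \varepsilon R^\top = R \overline{R} \varepsilon = \varepsilon . \tag{38.19}
\]

The spinor tensor $\varepsilon$ is the spinor metric. Like the Euclidean metric, it is that tensor which remains invariant
under rotations.

Since a rotor is a linear combination of even elements 1 and $I_3 \boldsymbol{\gamma}_a$ of the geometric algebra, and bivectors
$I_3 \boldsymbol{\gamma}_a$ change sign under reversal, a necessary and sufficient condition for (38.18) is

\[
\varepsilon (I_3 \boldsymbol{\gamma}_a)^\top = - I_3 \boldsymbol{\gamma}_a \varepsilon \quad \text{for } a = 1, 2, 3 . \tag{38.20}
\]

In the Pauli representation (13.112), where $\boldsymbol{\gamma}_a = \sigma_a$ and $I_3$ equals $i$ times the unit matrix, the condition (38.20)
requires that $\varepsilon$ commutes with $\boldsymbol{\gamma}_2$, and anticommutes with $\boldsymbol{\gamma}_1$ and $\boldsymbol{\gamma}_3$. The only basis element
of the geometric algebra with the required (anti)commutation properties is $\boldsymbol{\gamma}_2$, so the spinor metric $\varepsilon$ must
equal $\boldsymbol{\gamma}_2$ up to a possible scalar normalization,

\[
\varepsilon \equiv i \boldsymbol{\gamma}_2 . \tag{38.21}
\]

In the Pauli representation (13.112), the spinor metric (38.21) is the antisymmetric matrix

\[
\varepsilon = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{38.22}
\]

The chosen normalization is such that $\varepsilon$ is real (with respect to $i$). The spinor metric $\varepsilon$ is then orthogonal,
and its square is minus the unit matrix,

\[
\varepsilon^{-1} = \varepsilon^\top , \quad \varepsilon^2 = -1 . \tag{38.23}
\]

Despite the equality of $\varepsilon$ and $i \boldsymbol{\gamma}_2$ in the Pauli representation, $\varepsilon$ is defined to transform as a spinor tensor
under spatial rotations, not as an element of the geometric algebra. The components of the spinor metric
matrix $\varepsilon$ constitute the spinor metric $\varepsilon_{ab}$,

\[
\boldsymbol{\epsilon}_a^\top \varepsilon \, \boldsymbol{\epsilon}_b = \varepsilon_{ab} . \tag{38.24}
\]

Commuting the spinor metric $\varepsilon$ through the orthonormal basis vectors $\boldsymbol{\gamma}_a$ converts them to minus their
transposes,

\[
\boldsymbol{\gamma}_a^\top \varepsilon = - \varepsilon \boldsymbol{\gamma}_a . \tag{38.25}
\]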
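The defining properties of the spinor metric can be verified numerically in the Pauli representation. The sketch below makes two assumptions beyond the text: a rotor about a general unit axis $\boldsymbol{n}$ is taken to be $\cos(\theta/2) - i \sin(\theta/2)\, \boldsymbol{n}\cdot\boldsymbol{\sigma}$, and the reverse of a rotor is taken to be its Hermitian conjugate (equal to its inverse, since the rotor is unitary). The names `rotor`, `R_rev`, and `checks` are illustrative.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
unit = np.eye(2, dtype=complex)

eps = 1j * s2  # spinor metric in the Pauli representation: [[0, 1], [-1, 0]]

def rotor(theta, axis):
    """Rotor exp(-i theta/2 n.sigma) for a rotation by theta about the unit axis n."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * s1 + n[1] * s2 + n[2] * s3
    return np.cos(theta / 2) * unit - 1j * np.sin(theta / 2) * n_sigma

R = rotor(1.3, [1.0, 2.0, 2.0])
R_rev = R.conj().T  # assumption: reverse of a rotor = inverse = Hermitian conjugate

checks = {
    "eps R^T = R_rev eps": np.allclose(eps @ R.T, R_rev @ eps),
    "eps invariant under rotation": np.allclose(R @ eps @ R.T, eps),
    "eps orthogonal, eps^2 = -1": np.allclose(np.linalg.inv(eps), eps.T)
                                  and np.allclose(eps @ eps, -unit),
    "gamma^T eps = -eps gamma": all(np.allclose(g.T @ eps, -eps @ g) for g in (s1, s2, s3)),
}
print(checks)
```

The check uses a rotation about a tilted axis, so that all three bivectors contribute to the rotor.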


38.7 Row basis spinors 


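It is convenient to use the symbol $\boldsymbol{\epsilon}_a \cdot$ with a trailing dot, symbolic of the trailing $\varepsilon$, to denote the row spinor

\[
\boldsymbol{\epsilon}_a \cdot \equiv \boldsymbol{\epsilon}_a^\top \varepsilon . \tag{38.26}
\]

The motivation for the trailing dot notation is equation (38.30) below. The two row spinors (the braces in
equation (38.27) signify a set of spinors, not anticommutation)

\[
\boldsymbol{\epsilon}_a \cdot \equiv \{ \boldsymbol{\epsilon}_\uparrow \cdot , \boldsymbol{\epsilon}_\downarrow \cdot \} \tag{38.27}
\]

provide a basis for row spinors. The spin weights of the row basis spinors are in accord with their covariant
indices: $\boldsymbol{\epsilon}_\uparrow \cdot$ has spin weight $+\tfrac{1}{2}$, while $\boldsymbol{\epsilon}_\downarrow \cdot$ has spin weight $-\tfrac{1}{2}$. The row spinors $\boldsymbol{\epsilon}_a \cdot$ rotate as

\[
R: \ \boldsymbol{\epsilon}_a \cdot = \boldsymbol{\epsilon}_a^\top \varepsilon \to \boldsymbol{\epsilon}_a^\top R^\top \varepsilon = \boldsymbol{\epsilon}_a^\top \varepsilon \overline{R} = \boldsymbol{\epsilon}_a \cdot \overline{R} . \tag{38.28}
\]

Thus row spinors $\boldsymbol{\epsilon}_a \cdot$ transform like reverse rotors, just as column spinors $\boldsymbol{\epsilon}_a$ transform like rotors. In the
Pauli representation (13.112) the row basis spinors $\boldsymbol{\epsilon}_a \cdot$ are the row spinors

\[
\boldsymbol{\epsilon}_\uparrow \cdot = ( 0 \ \ 1 ) , \quad \boldsymbol{\epsilon}_\downarrow \cdot = ( -1 \ \ 0 ) . \tag{38.29}
\]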
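38.8 Inner products of basis spinors


The product of the row spinor $\boldsymbol{\epsilon}_a \cdot$ with the column spinor $\boldsymbol{\epsilon}_b$ defines their inner product, or scalar product,
which equals the spinor metric $\varepsilon_{ab}$, in accordance with equation (38.24),

\[
\boldsymbol{\epsilon}_a \cdot \boldsymbol{\epsilon}_b = \varepsilon_{ab} . \tag{38.30}
\]

Equation (38.30) motivates the trailing dot notation for the row spinor. The scalar product is antisymmetric,

\[
\boldsymbol{\epsilon}_a \cdot \boldsymbol{\epsilon}_b = - \boldsymbol{\epsilon}_b \cdot \boldsymbol{\epsilon}_a . \tag{38.31}
\]

In the Pauli representation, the non-zero components of the scalar product are explicitly, equation (38.22),

\[
\boldsymbol{\epsilon}_\uparrow \cdot \boldsymbol{\epsilon}_\downarrow = - \boldsymbol{\epsilon}_\downarrow \cdot \boldsymbol{\epsilon}_\uparrow = 1 . \tag{38.32}
\]

The antisymmetry of the spinor scalar product contrasts with the symmetry of the usual vector scalar
product. The scalar product (38.30) is a scalar,

\[
R: \ \boldsymbol{\epsilon}_a \cdot \boldsymbol{\epsilon}_b \to \boldsymbol{\epsilon}_a \cdot \overline{R} R \boldsymbol{\epsilon}_b = \boldsymbol{\epsilon}_a \cdot \boldsymbol{\epsilon}_b . \tag{38.33}
\]

Thus the spinor metric $\varepsilon_{ab}$ is invariant under rotations, just like the Euclidean metric $\delta_{ab}$.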


38.9 Lowering and raising spinor indices 


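The antisymmetric spinor metric $\varepsilon_{ab}$ is given in the Pauli representation by equation (38.24). The inverse
metric $\varepsilon^{ab}$ is defined by $\varepsilon^{ab} \varepsilon_{bc} = \delta^a_c$. The spinor metric and its inverse satisfy

\[
\varepsilon_{ab} = - \varepsilon_{ba} = - \varepsilon^{ab} = \varepsilon^{ba} . \tag{38.34}
\]

Indices on a spinor tensor are lowered and raised by pre-multiplying by the metric $\varepsilon_{ab}$ and its inverse $\varepsilon^{ab}$.

The contravariant components $\boldsymbol{\epsilon}^a$ of the column basis spinors, satisfying $\boldsymbol{\epsilon}^a = \varepsilon^{ab} \boldsymbol{\epsilon}_b$, are

\[
\boldsymbol{\epsilon}^\uparrow = - \boldsymbol{\epsilon}_\downarrow , \quad \boldsymbol{\epsilon}^\downarrow = \boldsymbol{\epsilon}_\uparrow . \tag{38.35}
\]

For example, $\boldsymbol{\epsilon}^\uparrow = \varepsilon^{\uparrow\downarrow} \boldsymbol{\epsilon}_\downarrow = - \boldsymbol{\epsilon}_\downarrow$. A spinor index is lowered or raised by pre-multiplying by the metric or its
inverse: post-multiplying by the metric or its inverse yields a result of opposite sign, $\boldsymbol{\epsilon}^a = \varepsilon^{ab} \boldsymbol{\epsilon}_b = - \boldsymbol{\epsilon}_b \varepsilon^{ba}$.
The contravariant components $\boldsymbol{\epsilon}^a \cdot$ of the row basis spinors satisfy the same relations (38.35) with a trailing
dot appended on left and right hand sides. The scalar products of contravariant row with covariant column
basis spinors form the unit matrix,

\[
\boldsymbol{\epsilon}^a \cdot \boldsymbol{\epsilon}_b = - \boldsymbol{\epsilon}_b \cdot \boldsymbol{\epsilon}^a = \delta^a_b . \tag{38.36}
\]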
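38.9.1 Scalar products of Pauli spinors


A general row spinor $\varphi \cdot$ is a complex (with respect to $i$) linear combination of the row basis spinors,

\[
\varphi \cdot \equiv \varphi^\top \varepsilon = \varphi^a \boldsymbol{\epsilon}_a \cdot . \tag{38.37}
\]

It rotates as

\[
R: \ \varphi \cdot \to \varphi \cdot \overline{R} . \tag{38.38}
\]

A row spinor $\varphi \cdot$ transforms like a reverse rotor.

The product of a row Pauli spinor $\varphi \cdot = \varphi^a \boldsymbol{\epsilon}_a \cdot$ with a column Pauli spinor $\chi = \chi^a \boldsymbol{\epsilon}_a$ forms a scalar, which
may be written variously

\[
\varphi \cdot \chi = \varphi^\top \varepsilon \chi = \varphi^a \boldsymbol{\epsilon}_a \cdot \chi^b \boldsymbol{\epsilon}_b = \varepsilon_{ab} \varphi^a \chi^b = \varphi^a \chi_a = - \varphi_a \chi^a = - \varepsilon^{ab} \varphi_a \chi_b . \tag{38.39}
\]

Notice that when the scalar product $\varphi \cdot \chi$ is written in the contracted form $\varphi^a \chi_a$, the first index is raised
and the second is lowered. An additional minus sign appears if the first index is lowered and the second is
raised.

The components $\varphi^a$ of a column spinor $\varphi$ can be projected out by pre-multiplying by the row basis spinor
$\boldsymbol{\epsilon}^a \cdot$,

\[
\boldsymbol{\epsilon}^a \cdot \varphi = \varphi^b \boldsymbol{\epsilon}^a \cdot \boldsymbol{\epsilon}_b = \delta^a_b \varphi^b = \varphi^a . \tag{38.40}
\]

The components $\varphi^a$ of a row spinor $\varphi \cdot$ can be projected out by post-multiplying by minus the column basis
spinor $\boldsymbol{\epsilon}^a$,

\[
- \varphi \cdot \boldsymbol{\epsilon}^a = - \varphi^b \boldsymbol{\epsilon}_b \cdot \boldsymbol{\epsilon}^a = \delta^a_b \varphi^b = \varphi^a . \tag{38.41}
\]

If the coefficients $\varphi^a$ and $\chi^a$ of Pauli spinors $\varphi = \varphi^a \boldsymbol{\epsilon}_a$ and $\chi = \chi^a \boldsymbol{\epsilon}_a$ are taken to be ordinary commuting
complex numbers, then the Pauli scalar product is anticommuting,

\[
\varphi \cdot \chi = - \chi \cdot \varphi . \tag{38.42}
\]

In quantum field theory spinor coefficients are sometimes taken to be anticommuting, in which case the
scalar product would be commuting. A proof that Pauli spinors anticommute (so their coefficients must be
ordinary commuting complex numbers) is given later, equation (38.73).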
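The index gymnastics of the Pauli scalar product can be illustrated numerically. The sketch below takes the spinor metric to be the real antisymmetric matrix with $\varepsilon_{\uparrow\downarrow} = 1$, lowers indices by pre-multiplying by the metric, and uses randomly chosen commuting complex coefficients; the names `dot`, `phi_low`, and `chi_low` are illustrative.

```python
import numpy as np

eps = np.array([[0, 1], [-1, 0]], dtype=complex)  # spinor metric, eps_{up,down} = 1
eps_inv = np.linalg.inv(eps)                      # inverse metric, equal to -eps

rng = np.random.default_rng(0)
phi = rng.normal(size=2) + 1j * rng.normal(size=2)  # coefficients phi^a
chi = rng.normal(size=2) + 1j * rng.normal(size=2)  # coefficients chi^a

def dot(u, v):
    """Pauli scalar product u . v = u^T eps v (no complex conjugation)."""
    return u @ eps @ v

# Lower indices by pre-multiplying by the metric: phi_a = eps_ab phi^b.
phi_low = eps @ phi
chi_low = eps @ chi

forms = [
    dot(phi, chi),                   # phi^T eps chi
    phi @ chi_low,                   # phi^a chi_a
    -(phi_low @ chi),                # -phi_a chi^a
    -(phi_low @ eps_inv @ chi_low),  # -eps^{ab} phi_a chi_b
]
print(forms)
print(dot(phi, chi) + dot(chi, phi))  # antisymmetry: the sum vanishes
```

With commuting coefficients the product of a spinor with itself, `dot(phi, phi)`, vanishes identically, which is one way of seeing why a conjugate spinor is needed to form a norm.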


38.10 Outer products of basis spinors 


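A row spinor $\boldsymbol{\epsilon}_a \cdot$ multiplied by a column spinor $\boldsymbol{\epsilon}_b$ yields their scalar product. In the opposite order, a column
spinor $\boldsymbol{\epsilon}_a$ multiplied by a row spinor $\boldsymbol{\epsilon}_b \cdot$ yields their outer product. The outer product $\boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot$ rotates like a
multivector in the geometric algebra,

\[
R: \ \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot = \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b^\top \varepsilon \to R \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b^\top R^\top \varepsilon = R \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b^\top \varepsilon \overline{R} = R \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot \overline{R} . \tag{38.43}
\]

The trailing dot on the outer product $\boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot$ is symbolic of the trailing $\varepsilon$, necessary to convert the spinor
tensor $\boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b^\top$ into an object that transforms like a multivector.

The products of the 2 column basis spinors $\boldsymbol{\epsilon}_a$ with the 2 row basis spinors $\boldsymbol{\epsilon}_b \cdot$ form 4 outer products. The
3D geometric algebra has 8 basis elements, but the pseudoscalar $I_3$ is a commuting imaginary which in the
Pauli representation is just $i$ times the unit matrix, so the 3D geometric algebra is equivalent to a complex
algebra with 4 basis elements. The 4 outer products of basis spinors thus suffice to generate the complete
complex 3D geometric algebra. In the Pauli representation (13.112), the 4 outer products of basis spinors
map to elements of the 3D geometric algebra as follows.

The antisymmetric outer products of spinors form a scalar singlet,

\[
[ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\uparrow ] \cdot = 1 , \tag{38.44}
\]

where the 1 on the right hand side denotes the unit element of the 3D geometric algebra, the 2 × 2 identity
matrix. The trailing dot on the commutator indicates that the right partner of each product is a row
spinor, $[ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\uparrow ] \cdot \equiv \boldsymbol{\epsilon}_\downarrow \boldsymbol{\epsilon}_\uparrow \cdot - \boldsymbol{\epsilon}_\uparrow \boldsymbol{\epsilon}_\downarrow \cdot$. The combination (38.44) is familiar from quantum mechanics as, modulo a
normalization factor, the spin-0 singlet formed from a combination of two spin-$\tfrac{1}{2}$ particles, commonly written
in Dirac notation

\[
[ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\uparrow ] = |{\downarrow\uparrow}\rangle - |{\uparrow\downarrow}\rangle . \tag{38.45}
\]

The spin weight of the singlet (38.44) is zero according to the rule (38.14), as it should be for a scalar.

The symmetric outer products of spinors form a triplet,

\[
\{ \boldsymbol{\epsilon}_\uparrow , \boldsymbol{\epsilon}_\uparrow \} \cdot = \sqrt{2} \, \boldsymbol{\gamma}_+ , \quad
\{ \boldsymbol{\epsilon}_\uparrow , \boldsymbol{\epsilon}_\downarrow \} \cdot = - \boldsymbol{\gamma}_3 , \quad
\{ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\downarrow \} \cdot = - \sqrt{2} \, \boldsymbol{\gamma}_- . \tag{38.46}
\]

The combinations (38.46) of basis spinors are, modulo normalization factors, familiar from quantum mechanics
as the three components of the spin-1 triplet formed from a combination of two spin-$\tfrac{1}{2}$ particles,

\[
\{ \boldsymbol{\epsilon}_\uparrow , \boldsymbol{\epsilon}_\uparrow \} = 2 \, |{\uparrow\uparrow}\rangle , \quad
\{ \boldsymbol{\epsilon}_\uparrow , \boldsymbol{\epsilon}_\downarrow \} = |{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle , \quad
\{ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\downarrow \} = 2 \, |{\downarrow\downarrow}\rangle . \tag{38.47}
\]

The spin weights of the triplet (38.46) are respectively $+1$, $0$, $-1$ according to the rules (38.6) and (38.14).
The spin weights of left and right hand sides match, as they should.

The trace of the outer product of a pair of basis spinors gives their scalar product (note that the 1 on the
right hand side of equation (38.44) is the unit matrix, whose trace is 2),

\[
\operatorname{Tr} \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot = \boldsymbol{\epsilon}_b \cdot \boldsymbol{\epsilon}_a = \varepsilon_{ba} . \tag{38.48}
\]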
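The singlet and triplet combinations can be built explicitly as 2 × 2 matrices. The sketch below assumes the Pauli representation, with column basis spinors $(1, 0)^\top$ and $(0, 1)^\top$ and spinor metric $i\sigma_2$; the helper `outer`, which forms a column spinor times the row spinor of a second spinor, is an illustrative name.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
g_plus = (s1 + 1j * s2) / np.sqrt(2)
g_minus = (s1 - 1j * s2) / np.sqrt(2)

eps = 1j * s2  # spinor metric
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

def outer(col, row_parent):
    """Outer product of a column spinor with the row spinor row_parent^T eps."""
    return np.outer(col, row_parent @ eps)

# Antisymmetric combination: the scalar singlet, equal to the unit matrix.
singlet = outer(down, up) - outer(up, down)

# Symmetric combinations: the vector triplet.
triplet_pp = 2 * outer(up, up)                   # sqrt(2) gamma_+
triplet_pm = outer(up, down) + outer(down, up)   # -gamma_3
triplet_mm = 2 * outer(down, down)               # -sqrt(2) gamma_-

# Trace of an outer product gives the scalar product with the indices swapped.
basis = [up, down]
traces = np.array([[np.trace(outer(a, b)) for b in basis] for a in basis])
print(singlet.real, triplet_pm.real, traces.real, sep="\n")
```

The matrix `traces` has entries $\operatorname{Tr} \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot$, which should equal the transpose of the spinor metric.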


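The expansion of the 4 outer products $\boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot$ of spinors in terms of the basis elements $\boldsymbol{\gamma}_A$ of the geometric
algebra, and vice versa, define the matrix of coefficients $\gamma^A_{ab}$ and its inverse $\gamma_A^{ab}$,

\[
\boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot = \gamma^A_{ab} \boldsymbol{\gamma}_A , \quad \boldsymbol{\gamma}_A = \gamma_A^{ab} \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot . \tag{38.49}
\]

The coefficients $\gamma^A_{ab}$ and $\gamma_A^{ab}$ in the chiral representation are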


Vb =F E Y Ea, YEE yae. (38.50) 
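Exercise 38.1. Consistency of spinor and multivector scalar products. Confirm that the spinor and
multivector scalar products are consistent.

Solution. Multivector vectors are equivalent to outer products of Pauli spinors in accordance with
equations (38.46). For example, the scalar product of the multivectors $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ is

\[
\begin{aligned}
\boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_-
&= \tfrac{1}{2} ( \boldsymbol{\gamma}_+ \boldsymbol{\gamma}_- + \boldsymbol{\gamma}_- \boldsymbol{\gamma}_+ ) \\
&= - \tfrac{1}{4} \left( \{ \boldsymbol{\epsilon}_\uparrow , \boldsymbol{\epsilon}_\uparrow \} \cdot \{ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\downarrow \} \cdot + \{ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\downarrow \} \cdot \{ \boldsymbol{\epsilon}_\uparrow , \boldsymbol{\epsilon}_\uparrow \} \cdot \right) \\
&= - \left( \boldsymbol{\epsilon}_\uparrow \, \boldsymbol{\epsilon}_\uparrow \cdot \boldsymbol{\epsilon}_\downarrow \, \boldsymbol{\epsilon}_\downarrow \cdot + \boldsymbol{\epsilon}_\downarrow \, \boldsymbol{\epsilon}_\downarrow \cdot \boldsymbol{\epsilon}_\uparrow \, \boldsymbol{\epsilon}_\uparrow \cdot \right) \\
&= - \boldsymbol{\epsilon}_\uparrow \boldsymbol{\epsilon}_\downarrow \cdot + \boldsymbol{\epsilon}_\downarrow \boldsymbol{\epsilon}_\uparrow \cdot \\
&= [ \boldsymbol{\epsilon}_\downarrow , \boldsymbol{\epsilon}_\uparrow ] \cdot \\
&= 1 ,
\end{aligned} \tag{38.51}
\]

the fourth step of which invokes the spinor scalar product (38.32), and the last step of which is from the
equivalence (38.44). The result agrees with the multivector scalar product (38.2).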


38.11 The 3D super geometric algebra 


The 3D super geometric algebra comprises 4 distinct species of objects: true scalars, column spinors, row
spinors, and multivectors. In a matrix representation, they are complex (with respect to $i$) matrices with
dimensions 1 × 1, 2 × 1, 1 × 2, and 2 × 2 respectively. The true scalars are just complex numbers. A column spinor $\varphi$ is
a complex linear combination of column basis spinors $\boldsymbol{\epsilon}_a$,

\[
\varphi = \varphi^a \boldsymbol{\epsilon}_a , \tag{38.52}
\]

while a row spinor $\varphi \cdot$ is a complex linear combination of row basis spinors $\boldsymbol{\epsilon}_a \cdot$,

\[
\varphi \cdot = \varphi^a \boldsymbol{\epsilon}_a \cdot . \tag{38.53}
\]

A multivector $a$ is a complex linear combination of outer products of the column and row basis spinors,

\[
a = a^{ab} \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot . \tag{38.54}
\]

Linearity and the transformation law (38.43) imply that the algebra of sums and products of outer products
of spinors is isomorphic to the geometric algebra.

There are two distinct kinds of scalar in the super geometric algebra, true scalars that are just complex
numbers, and multivector scalars that are proportional to the unit matrix in a matrix representation. See
§39.6.2 for an explanation of this conundrum.

As seen in §38.8 and §38.10, a column spinor $\varphi$ and a row spinor $\chi \cdot$ can be multiplied in either order,
yielding an inner product which is a scalar, and an outer product which is a multivector. However, a column
spinor cannot be multiplied by a column spinor, and likewise a row spinor cannot be multiplied by a row
spinor, as is manifestly true in a matrix representation.

In applications to quantum field theory, rather than prohibiting certain kinds of multiplication, it is
convenient instead to assert that prohibited multiplications simply yield a true scalar value of zero. Thus

\[
\varphi \chi = 0 , \quad \varphi \cdot \chi \cdot = 0 , \quad \varphi a = 0 , \quad a \varphi \cdot = 0 . \tag{38.55}
\]

This allows all objects in the super geometric algebra to be added and multiplied, regardless of their species.
In general, a sequence of products of spinors yields a non-zero result provided that they alternate between
column spinor and row spinor,

\[
\varphi \chi \cdot \psi \quad \text{or} \quad \varphi \cdot \chi \psi \cdot . \tag{38.56}
\]

Both product sequences are associative,

\[
\varphi \chi \cdot \psi = ( \varphi \chi \cdot ) \psi = \varphi ( \chi \cdot \psi ) , \tag{38.57a}
\]
\[
\varphi \cdot \chi \psi \cdot = ( \varphi \cdot \chi ) \psi \cdot = \varphi \cdot ( \chi \psi \cdot ) . \tag{38.57b}
\]

A product of an even number of spinors yields a scalar or a multivector depending on whether the first spinor
is a row or a column spinor. A product of an odd number of spinors yields a row spinor or a column spinor
depending on whether the first spinor is a row or a column spinor.

The scalar product and the associative law make it straightforward to simplify long sequences of products.
Let $a = a^{ab} \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot$ and $b = b^{ab} \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot$ be two multivectors expressed as a sum of outer products of spinors.
Their product is the multivector

\[
ab = a^{ab} \boldsymbol{\epsilon}_a \boldsymbol{\epsilon}_b \cdot \, b^{cd} \boldsymbol{\epsilon}_c \boldsymbol{\epsilon}_d \cdot = \boldsymbol{\epsilon}_a \, a^{ab} \varepsilon_{bc} b^{cd} \, \boldsymbol{\epsilon}_d \cdot = \boldsymbol{\epsilon}_a \, a^{ab} b_b{}^d \, \boldsymbol{\epsilon}_d \cdot . \tag{38.58}
\]

A sequence such as $\varphi \cdot a \chi$ simplifies as

\[
\varphi \cdot a \chi = \varphi^a \boldsymbol{\epsilon}_a \cdot \, a^{bc} \boldsymbol{\epsilon}_b \boldsymbol{\epsilon}_c \cdot \, \chi^d \boldsymbol{\epsilon}_d = \varphi^a \varepsilon_{ab} a^{bc} \varepsilon_{cd} \chi^d = \varphi^a a_a{}^c \chi_c . \tag{38.59}
\]

The trace, equation (38.48), of an outer product of spinors is a true scalar,

\[
\operatorname{Tr} \chi \varphi \cdot = \chi^a \varphi^b \varepsilon_{ba} = - \chi \cdot \varphi = \varphi \cdot \chi , \tag{38.60}
\]

the last step of which assumes that the coefficients $\varphi^a$ and $\chi^a$ are ordinary commuting complex numbers,
equation (38.42).
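The product rules above reduce to ordinary matrix algebra once the basis spinors are taken to be unit columns, in which case a multivector with coefficient matrix $a^{ab}$ is represented by the matrix product of that coefficient matrix with the spinor metric. The sketch below checks the multivector product, the sandwich $\varphi \cdot a \chi$, and the trace identity for random coefficients; the names `A`, `B`, `a_mv`, `b_mv` are illustrative.

```python
import numpy as np

eps = np.array([[0, 1], [-1, 0]], dtype=complex)  # spinor metric
rng = np.random.default_rng(1)

def random_complex(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

A = random_complex((2, 2))   # coefficients a^{ab}
B = random_complex((2, 2))   # coefficients b^{ab}
phi = random_complex(2)      # coefficients phi^a
chi = random_complex(2)      # coefficients chi^a

# With unit-column basis spinors, a = a^{ab} eps_a eps_b. is the matrix A eps.
a_mv = A @ eps
b_mv = B @ eps

# Product of multivectors: the coefficient matrix of ab is a^{ab} eps_bc b^{cd}.
product_ok = np.allclose(a_mv @ b_mv, (A @ eps @ B) @ eps)

# Sandwich phi . a chi = phi^a eps_ab a^{bc} eps_cd chi^d.
phi_row = phi @ eps  # row spinor phi. = phi^T eps
sandwich = phi_row @ a_mv @ chi
sandwich_components = np.einsum("a,ab,bc,cd,d->", phi, eps, A, eps, chi)

# Trace of the outer product chi phi. equals the scalar product phi . chi.
trace_outer = np.trace(np.outer(chi, phi_row))
scalar_product = phi_row @ chi
print(product_ok, np.isclose(sandwich, sandwich_components), np.isclose(trace_outer, scalar_product))
```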


38.12 Conjugate Pauli spinor 


The 3D super geometric algebra possesses a discrete transformation called conjugation. The conjugate Pauli
spinor $\bar{\varphi}$ is defined by equation (38.63). It has the defining properties that (a) its components are complex
conjugates (with respect to $i$) of those of the parent spinor $\varphi$, and (b) the conjugate spinor $\bar{\varphi}$ rotates in the
same way as the spinor $\varphi$.

The complex conjugate $\varphi^*$ of a Pauli spinor $\varphi = \varphi^a \boldsymbol{\epsilon}_a$ is defined to be the spinor with complex conjugate
(with respect to $i$) coefficients,

\[
\varphi^* \equiv \varphi^{a*} \boldsymbol{\epsilon}_a . \tag{38.61}
\]

In effect, the basis spinors $\boldsymbol{\epsilon}_a$ are taken to be real, just as the basis vectors $\boldsymbol{\gamma}_\pm$ and $\boldsymbol{\gamma}_3$ in the spin basis
are real, equations (38.7). Since the spinor $\varphi$ rotates under a rotor $R$ as $\varphi \to R \varphi$, its complex conjugate $\varphi^*$
rotates according to the complex conjugate representation of the Pauli matrices,

\[
R: \ \varphi^* \to (R \varphi)^* = R^* \varphi^* . \tag{38.62}
\]

The conjugate Pauli spinor $\bar{\varphi}$ is defined by

\[
\bar{\varphi} \equiv \varepsilon \varphi^* . \tag{38.63}
\]

Despite the similar notation, the conjugate spinor $\bar{\varphi}$ is not the reverse spinor $\overline{\varphi}$ defined by
equation (13.129); rather, the reverse spinor coincides with the row conjugate spinor, $\overline{\varphi} = \bar{\varphi} \cdot$, defined by
equation (38.68). The conjugate overbar is slightly smaller and thinner than the reverse overbar; in any case,
it should be clear from the context whether the conjugate or reverse spinor is intended.

The 3D spinor metric tensor $\varepsilon$ was constructed earlier to have the property (38.25) that commutation with $\varepsilon$
converts orthonormal basis vectors $\boldsymbol{\gamma}_a$ of the geometric algebra to minus their transposes. The spinor metric
tensor $\varepsilon$ has the additional property that commutation with it converts even (but not odd) orthonormal
basis elements 1 and $I_3 \boldsymbol{\gamma}_a$ of the geometric algebra to their complex conjugates (with respect to $i$) in the
Pauli representation (13.112). Consequently commutation with $\varepsilon$ converts rotors $R$, which are real linear
combinations of the even orthonormal basis elements, to their complex conjugates,

\[
\varepsilon R^* = R \varepsilon , \tag{38.64}
\]

which also implies that $\varepsilon R = R^* \varepsilon$, since a rotor $R$ is a real linear combination of even orthonormal basis
multivectors, so the complex conjugate $R^*$ of a rotor $R$ is a rotor. It follows that the conjugate Pauli spinor
$\bar{\varphi}$ rotates in the same way as the spinor $\varphi$,

\[
R: \ \bar{\varphi} = \varepsilon \varphi^* \to \varepsilon R^* \varphi^* = R \varepsilon \varphi^* = R \bar{\varphi} . \tag{38.65}
\]

In components,

\[
\bar{\varphi} = \varphi^{a*} \bar{\boldsymbol{\epsilon}}_a , \quad \bar{\boldsymbol{\epsilon}}_a \equiv \varepsilon \boldsymbol{\epsilon}_a = \pm \boldsymbol{\epsilon}_{\bar{a}} , \tag{38.66}
\]

where the index $\bar{a}$ is the bit-flip of the index $a$, and the $\pm$ sign is $-$ for $\uparrow$ and $+$ for $\downarrow$, that is, $\varepsilon \boldsymbol{\epsilon}_\uparrow = - \boldsymbol{\epsilon}_\downarrow$ and
$\varepsilon \boldsymbol{\epsilon}_\downarrow = \boldsymbol{\epsilon}_\uparrow$.

Conjugating a Pauli spinor $\varphi$ twice changes its sign,

\[
\bar{\bar{\varphi}} = - \varphi . \tag{38.67}
\]
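A numerical check of the two defining properties of the conjugate spinor, and of the sign flip under double conjugation, might run as follows. The sketch assumes the Pauli representation, the conjugate $\bar{\varphi} = \varepsilon \varphi^*$ with $\varepsilon = i \sigma_2$, and a rotor of the form $\cos(\theta/2) - i \sin(\theta/2)\, \boldsymbol{n}\cdot\boldsymbol{\sigma}$ about a unit axis $\boldsymbol{n}$; the function names `rotor` and `conjugate` are illustrative.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
eps = 1j * s2  # spinor metric

def rotor(theta, axis):
    """Rotor exp(-i theta/2 n.sigma) for a rotation by theta about the unit axis n."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * s1 + n[1] * s2 + n[2] * s3
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * n_sigma

def conjugate(phi):
    """Conjugate Pauli spinor: spinor metric times the complex conjugate spinor."""
    return eps @ phi.conj()

rng = np.random.default_rng(2)
phi = rng.normal(size=2) + 1j * rng.normal(size=2)
R = rotor(0.9, [2.0, -1.0, 2.0])

# Commuting eps through a rotor complex-conjugates the rotor.
commute_ok = np.allclose(eps @ R.conj(), R @ eps)
# The conjugate spinor rotates in the same way as the spinor itself.
rotate_ok = np.allclose(conjugate(R @ phi), R @ conjugate(phi))
# Conjugating twice changes the sign of the spinor.
double_ok = np.allclose(conjugate(conjugate(phi)), -phi)
print(commute_ok, rotate_ok, double_ok)
```

By contrast the bare complex conjugate `phi.conj()` rotates with `R.conj()` rather than `R`, which is why the extra factor of the spinor metric is needed.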
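38.13 Scalar products of spinors and conjugate spinors


The row conjugate Pauli spinor $\bar{\varphi} \cdot$ corresponding to the column conjugate spinor $\bar{\varphi}$ coincides with the
Hermitian conjugate spinor $\varphi^\dagger$, which in turn coincides with the reverse spinor $\overline{\varphi}$, equation (13.129),

\[
\bar{\varphi} \cdot \equiv \bar{\varphi}^\top \varepsilon = \varphi^\dagger \varepsilon^\top \varepsilon = \varphi^\dagger = \overline{\varphi} . \tag{38.68}
\]

Note that the reverse spinor $\overline{\varphi}$ equals the row conjugate spinor $\bar{\varphi} \cdot$; the reverse spinor $\overline{\varphi}$ does not equal the
column conjugate spinor $\bar{\varphi}$ defined by equation (38.63), and the two should not be confused.

The scalar product of a row conjugate Pauli spinor $\bar{\varphi} \cdot$ with a column Pauli spinor $\chi$ coincides with the
product of the Hermitian conjugate spinor $\varphi^\dagger$ with the spinor $\chi$,

\[
\bar{\varphi} \cdot \chi = \varphi^\dagger \chi = \begin{pmatrix} \varphi^{\uparrow *} & \varphi^{\downarrow *} \end{pmatrix} \begin{pmatrix} \chi^\uparrow \\ \chi^\downarrow \end{pmatrix} = \varphi^{\uparrow *} \chi^\uparrow + \varphi^{\downarrow *} \chi^\downarrow . \tag{38.69}
\]

In particular, the scalar product $\bar{\varphi} \cdot \varphi$ of a spinor with its own conjugate is real and positive,

\[
\bar{\varphi} \cdot \varphi = \varphi^\dagger \varphi . \tag{38.70}
\]

The complex conjugate of the scalar product satisfies

\[
( \bar{\varphi} \cdot \chi )^* = \left( ( \varepsilon \varphi^* )^\top \varepsilon \chi \right)^* = \varphi^\top \varepsilon^\top \varepsilon \chi^* = - \varphi^\top \varepsilon \, \varepsilon \chi^* = - \varphi \cdot \bar{\chi} . \tag{38.71}
\]

The sign flip in the fourth expression occurs because the spinor metric tensor $\varepsilon$ is antisymmetric, $\varepsilon^\top = - \varepsilon$.
In particular, the complex conjugate of the product $\bar{\varphi} \cdot \varphi$ of a spinor with its own conjugate is

\[
( \bar{\varphi} \cdot \varphi )^* = - \varphi \cdot \bar{\varphi} . \tag{38.72}
\]

Equation (38.72), along with the condition that the scalar product be real, $( \bar{\varphi} \cdot \varphi )^* = \bar{\varphi} \cdot \varphi$, equation (38.70),
requires that the scalar product $\bar{\varphi} \cdot \varphi$ be anticommuting,

\[
\bar{\varphi} \cdot \varphi = - \varphi \cdot \bar{\varphi} . \tag{38.73}
\]

Equation (38.73) proves that the scalar product of Pauli spinors must be anticommuting, as asserted earlier,
equation (38.42).

In non-relativistic quantum mechanics, the real positive scalar (38.70) is interpreted as the probability
of the Pauli spinor $\varphi$. Since conjugating a Pauli spinor twice flips its sign, equation (38.67), the scalar
product (38.70) is the same regardless of whether the spinor $\varphi$ or its conjugate $\bar{\varphi}$ is taken:

\[
\bar{\varphi} \cdot \varphi = - \varphi \cdot \bar{\varphi} = \bar{\bar{\varphi}} \cdot \bar{\varphi} . \tag{38.74}
\]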
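The relations between the conjugate scalar product and the familiar Hermitian product can be checked with random spinors. The sketch below assumes the Pauli representation with real spinor metric $i\sigma_2$, row spinors formed as $\varphi^\top \varepsilon$, and conjugate spinors formed as $\varepsilon \varphi^*$; the helper names `row` and `conjugate` are illustrative.

```python
import numpy as np

eps = np.array([[0, 1], [-1, 0]], dtype=complex)  # spinor metric

def row(phi):
    """Row spinor phi. = phi^T eps, returned as a 1D array."""
    return phi @ eps

def conjugate(phi):
    """Conjugate Pauli spinor: spinor metric times the complex conjugate spinor."""
    return eps @ phi.conj()

rng = np.random.default_rng(3)
phi = rng.normal(size=2) + 1j * rng.normal(size=2)
chi = rng.normal(size=2) + 1j * rng.normal(size=2)

# The row conjugate spinor is the Hermitian conjugate spinor.
hermitian_ok = np.allclose(row(conjugate(phi)), phi.conj())

# Conjugate scalar product equals the Hermitian product phi^dagger chi.
mixed = row(conjugate(phi)) @ chi
mixed_ok = np.isclose(mixed, np.vdot(phi, chi))

# Complex conjugation of the scalar product flips the order with a minus sign.
conj_ok = np.isclose(np.conj(mixed), -(row(phi) @ conjugate(chi)))

# The norm is real and positive, and the same for the spinor and its conjugate.
norm = row(conjugate(phi)) @ phi
norm_of_conjugate = row(conjugate(conjugate(phi))) @ conjugate(phi)
print(hermitian_ok, mixed_ok, conj_ok, norm, norm_of_conjugate)
```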


Concept question 38.2. Imaginary spinor metric? Would making the spinor metric $\varepsilon$ imaginary allow
the spinor scalar product to be commuting instead of anticommuting? Answer. No. If the spinor metric $\varepsilon$
were multiplied by $i$, or more generally by some arbitrary complex phase (which is possible since the spinor
metric is defined only up to a scalar normalization factor), then the conjugate spinor must be defined by
$\bar{\varphi} = \varepsilon^* \varphi^*$ in place of the definition (38.63) in order that the scalar product of the spinor and its conjugate
remain real and positive, equation (38.70). A manipulation similar to equation (38.71) carries through, with
the result that equation (38.72) continues to hold regardless of any complex phase in the spinor metric $\varepsilon$. The
minus sign comes from $\varepsilon^\top = - \varepsilon$ regardless of any complex phase. The scalar product of Pauli spinors is
necessarily anticommuting.


38.14 Conjugate multivectors 


Conjugate multivectors $\bar{a}$ in the super geometric algebra are defined, similarly to conjugate Pauli spinors,
such that their components are complex conjugates of the parent multivector $a$, and they rotate in the same
way as multivectors (the conjugate multivector $\bar{a}$ is not the same as the reverse multivector $\overline{a}$; note that the
conjugate overbar is slightly smaller and thinner than the reverse overbar).

The complex conjugate multivector $a^*$ of a multivector $a = a^A \boldsymbol{\gamma}_A$ is defined to be

\[
a^* \equiv a^{A*} \boldsymbol{\gamma}_A^* , \tag{38.75}
\]

where $\boldsymbol{\gamma}_A^*$ is the complex conjugate of the basis multivector $\boldsymbol{\gamma}_A$ in the Pauli representation. The spin basis
vectors $\boldsymbol{\gamma}_\pm$ and $\boldsymbol{\gamma}_3$ are real in the Pauli representation, which is consistent with the basis spinors $\boldsymbol{\epsilon}_a$ being
taken to be real, equation (38.61). Since $a$ rotates as $a \to R a \overline{R}$, the complex conjugate $a^*$ rotates as

\[
R: \ a^* \to ( R a \overline{R} )^* = R^* a^* \overline{R}^* . \tag{38.76}
\]

Complex conjugation commutes with the isomorphism between multivectors and outer products of spinors
in the super geometric algebra. That is, if the multivector is an outer product of spinors, $a = \varphi \chi \cdot$, then the
complex conjugate multivector is the outer product of the complex conjugate spinors, $a^* = \varphi^* \chi^* \cdot$.

Similarly, consistent with the definition (38.63) of the conjugate spinor $\bar{\varphi}$, the conjugate multivector $\bar{a}$ is
defined by

\[
\bar{a} \equiv \varepsilon a^* \varepsilon^{-1} . \tag{38.77}
\]

If the multivector is an outer product of spinors, $a = \varphi \chi \cdot$, then the conjugate multivector is the outer product
of the conjugate spinors, $\bar{a} = \bar{\varphi} \bar{\chi} \cdot$. Like the conjugate spinor, equation (38.65), the conjugate multivector $\bar{a}$
rotates in the same way as a multivector,

\[
R: \ \bar{a} = \varepsilon a^* \varepsilon^{-1} \to \varepsilon R^* a^* \overline{R}^* \varepsilon^{-1} = R \varepsilon a^* \varepsilon^{-1} \overline{R} = R \bar{a} \overline{R} . \tag{38.78}
\]

In the Pauli representation, the conjugates of the orthonormal basis vectors $\boldsymbol{\gamma}_a$ are minus themselves,

\[
\bar{\boldsymbol{\gamma}}_a = \varepsilon \boldsymbol{\gamma}_a^* \varepsilon^{-1} = - \boldsymbol{\gamma}_a . \tag{38.79}
\]

The conjugate of a grade-$p$ multivector $a$ is, in components,

\[
\bar{a} = a^{A*} \bar{\boldsymbol{\gamma}}_A , \quad \bar{\boldsymbol{\gamma}}_A = (-)^p \boldsymbol{\gamma}_A . \tag{38.80}
\]
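The sign rule for conjugate basis multivectors can be confirmed directly in the Pauli representation. The sketch below assumes the conjugate multivector is formed as $\varepsilon a^* \varepsilon^{-1}$ with $\varepsilon = i\sigma_2$, represents the bivectors by $i \sigma_a$ and the pseudoscalar by $i$ times the unit matrix, and forms outer products of spinors as a column times the row $\chi^\top \varepsilon$; the function names `conjugate_mv` and `conjugate` are illustrative.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
unit = np.eye(2, dtype=complex)
eps = 1j * s2                  # spinor metric
eps_inv = np.linalg.inv(eps)

def conjugate_mv(a):
    """Conjugate multivector: eps a^* eps^{-1}."""
    return eps @ a.conj() @ eps_inv

def conjugate(phi):
    """Conjugate Pauli spinor: eps phi^*."""
    return eps @ phi.conj()

vectors = [s1, s2, s3]
bivectors = [1j * s for s in vectors]

# Odd grades (vectors, pseudoscalar) change sign; even grades (scalar, bivectors) do not.
vectors_ok = all(np.allclose(conjugate_mv(g), -g) for g in vectors)
bivectors_ok = all(np.allclose(conjugate_mv(b), b) for b in bivectors)
scalar_ok = np.allclose(conjugate_mv(unit), unit)
pseudoscalar_ok = np.allclose(conjugate_mv(1j * unit), -1j * unit)

# The conjugate of an outer product is the outer product of the conjugate spinors.
rng = np.random.default_rng(4)
phi = rng.normal(size=2) + 1j * rng.normal(size=2)
chi = rng.normal(size=2) + 1j * rng.normal(size=2)
a = np.outer(phi, chi @ eps)
outer_ok = np.allclose(conjugate_mv(a), np.outer(conjugate(phi), conjugate(chi) @ eps))
print(vectors_ok, bivectors_ok, scalar_ok, pseudoscalar_ok, outer_ok)
```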
38.14.1 Real subalgebra


In the Pauli representation, the basis vectors $\boldsymbol{\gamma}_\pm$ and $\boldsymbol{\gamma}_3$ in a spin basis are real, equations (38.7) and (38.8),
and the basis spinors $\boldsymbol{\epsilon}_a$ are similarly real, equations (38.12). One might therefore contemplate forming a
real subalgebra of the super geometric algebra from real linear combinations of these basis spinors and their
products. This does not work however, because spatial rotations transform the basis spinors into complex
combinations of each other, equation (13.120). Any viable real subalgebra must be closed under rotations.

Orthonormal basis multivectors on the other hand do transform into real linear combinations of each other
under rotations. A real subalgebra of the geometric algebra may be obtained by restricting to multivectors
satisfying the reality condition that they are their own conjugates,

\[
\bar{a} = a . \tag{38.81}
\]

Since conjugates of even and odd orthonormal basis multivectors $\boldsymbol{\gamma}_A$ are respectively plus and minus themselves,
equation (38.80), in the Pauli representation there is a real subalgebra consisting of linear combinations
$a^A \boldsymbol{\gamma}_A$ of odd orthonormal multivectors with pure imaginary coefficients, and even orthonormal multivectors
with pure real coefficients. But in the Pauli algebra the (odd) pseudoscalar $I_3$ is identified with $i$ times the unit
matrix, so the real Pauli subalgebra reduces to real linear combinations of even orthonormal multivectors.


38.15 The super geometric algebra in arbitrarily many spatial dimensions 


Exercise 38.3. Generalize the super geometric algebra to an arbitrary number of dimensions. 
Generalize the super geometric algebra to an arbitrary number of spatial dimensions N. Exercise 39.5 gen- 
eralizes this exercise to an arbitrary number of space and time dimensions. 
Solution. 
1. Basis of spin vectors Ya. Let Ya, a = 1,...,N be an orthonormal (Ya ` Yè = dap) basis of vectors 
in the N-dimensional geometric algebra. Group the basis vectors into pairs. The following complex 
combinations of the pairs define a basis of spin vectors yx, 


4) 


=z. 


Y+: = FMi- Hiqi), Y= AMi- — iyi), i= 1,..., [N/2], (38.82) 


generalizing equations (38.1). If the dimension N is odd, then one basis vector, yy, will remain unpaired. 


Under a right-handed rotation by angle 0 in the -yo;_1—yo; plane, the tth pair of spin basis vectors 


y+, transform as 


ye, > eToys. . (38.83) 


Ei 


The transformation (38.83) identifies the spin basis vectors y+, as having ith spin weight equal to +1. 


š 


All other spin basis vectors, y+, with j 4 i, together with the unpaired basis vector yy if N is odd, 


Ej 
have zero i’th spin weight. There are [N/2] different spin weights i. The components of a tensor in a 
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spin basis inherit their spin properties from those of the spin basis. The 7’th spin-weight s; of any tensor 
component is 


spin weight s; = number of +; minus —; covariant indices , (38.84) 


generalizing equation (38.6). 
The geometric algebra, Chapter 13, generated by inner and outer products of the N basis vectors Ya 
is a vector space of dimension 2%. 


2. Basis of spinors $\epsilon_a$. Spinor axes are defined by $2^{[N/2]}$ basis spinors $\epsilon_a$,

$$\epsilon_a \equiv \epsilon_{a_1 \ldots a_{[N/2]}} , \qquad (38.85)$$

where $a_1 \ldots a_{[N/2]}$ denotes not a set of indices, but rather a bitcode specifying the single index $a$. Each bit $a_i$ is either up $\uparrow$ or down $\downarrow$. For example, one of the basis spinors is the all-up basis spinor $\epsilon_{\uparrow\uparrow\ldots\uparrow}$.
Under a right-handed rotation by angle $\theta$ in the $\gamma_{2i-1}$-$\gamma_{2i}$ plane, a basis spinor $\epsilon_a$ transforms as


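$$\epsilon_{\ldots\uparrow_i\ldots} \to e^{-i\theta/2}\, \epsilon_{\ldots\uparrow_i\ldots} , \quad \epsilon_{\ldots\downarrow_i\ldots} \to e^{i\theta/2}\, \epsilon_{\ldots\downarrow_i\ldots} . \qquad (38.86)$$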


The transformation (38.86) shows that each basis spinor $\epsilon_a$ has $i$'th spin weight either $+\frac{1}{2}$ or $-\frac{1}{2}$ in each of its $[N/2]$ bits. The components of a spinor tensor in a spin basis inherit their spin properties from those of the spin basis. The $i$'th spin-weight $s_i$ of any spinor tensor component is

$$\text{spin weight } s_i = \tfrac{1}{2} \left( \text{number of } \uparrow_i \text{ minus } \downarrow_i \text{ covariant indices} \right) , \qquad (38.87)$$

generalizing equation (38.14).
A spinor $\psi$,

$$\psi = \psi^a \epsilon_a , \qquad (38.88)$$

is a linear combination of the $2^{[N/2]}$ basis spinors $\epsilon_a$. The spinor can be represented as a column vector $\psi^a$ of dimension $2^{[N/2]}$, the index $a$ running over bitcodes $a_1 \ldots a_{[N/2]}$.

3. Spinor metric tensor. A spinor metric $\epsilon$ can be defined as that spinor tensor that is invariant under rotations, suitably normalized, §38.6. Invariance of the spinor metric $\epsilon$ under rotations requires that for any rotor $R$,
$$\epsilon \bar{R}^\top = R \epsilon , \qquad (38.89)$$


the same as condition (38.18). A rotor $R$ is a real linear combination of even elements of the geometric algebra in an orthonormal basis. Thus the condition (38.89) is determined by the commutation properties of $\epsilon$ with the orthonormal bivectors of the geometric algebra (an orthonormal bivector is defined here to be a wedge product of orthonormal vectors; the square of an orthonormal bivector is thus $-1$). In the canonical chiral representation defined by the construction (38.109), orthonormal basis bivectors $\gamma_a \wedge \gamma_b$ are represented by traceless, unitary ($A^{-1} = A^\dagger$), skew-Hermitian ($A^\dagger = -A$) matrices. Then condition (38.89) holds if $\epsilon$ commutes with orthonormal basis bivectors whose representation is real, and anticommutes with orthonormal basis bivectors whose representation is imaginary. In the construction (38.109), all chiral basis vectors $\gamma_{\pm i}$ are real, so orthonormal basis vectors $\gamma_{2i-1}$ are real while $\gamma_{2i}$
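Table 38.1: Symmetry of spinor metric

N | $\epsilon^2 = (-)^{[(N+1)/4]}$ | $\epsilon_{\rm alt}^2 = (-)^{[(N+2)/4]}$ | $\tilde\epsilon^2 = (-)^{[N/4]}$ ($N$ odd) | $\tilde\epsilon_{\rm alt}^2 = (-)^{[(N+3)/4]}$ ($N$ odd)
1 | + | + | + | -
2 | + | - |   |
3 | - | - | + | -
4 | - | - |   |
5 | - | - | - | +
6 | - | + |   |
7 | + | + | - | +
8 | + | + |   |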


are imaginary. The only matrix $\epsilon$ with the required commutation properties with basis bivectors is, up to a scalar or pseudoscalar normalization factor, the product of all the odd basis vectors $\gamma_{2i-1}$,

$$\epsilon \equiv \prod_{i=1}^{[(N+1)/2]} \gamma_{2i-1} . \qquad (38.90)$$


An alternative version $\epsilon_{\rm alt}$ of the spinor metric may be obtained by multiplying the spinor metric (38.90) by the chiral factor $\varkappa_N$, which is the pseudoscalar $I_N$, equation (38.121), normalized by a power of $i$ so that $\varkappa_N^2$ equals one, equation (38.124),

$$\epsilon_{\rm alt} \equiv \varkappa_N \epsilon = \prod_{i=1}^{[N/2]} i \gamma_{2i} . \qquad (38.91)$$

The factors of the imaginary $i$ are introduced so that the spinor metric $\epsilon_{\rm alt}$ is real.

If $N$ is odd, and if the odd algebra is constructed, as described in part 10 of this Exercise, by embedding the odd algebra in one extra dimension and treating either the final (odd) dimension $\gamma_N$ or the extra (even) dimension $\gamma_{N+1}$ as a scalar, then there are further options for the spinor metric. The invariance condition (38.89) need hold only for rotors not involving the scalar dimension $\gamma_N$ or $\gamma_{N+1}$. If the scalar dimension is the odd dimension $\gamma_N$, then $\gamma_N$ can be dropped from the standard spinor metric $\epsilon$, leaving $\epsilon$ in $N-1$ dimensions. If the scalar dimension is the even dimension $\gamma_{N+1}$, then $i \gamma_{N+1}$ can be adjoined to the alternative spinor metric $\epsilon_{\rm alt}$, giving $\epsilon_{\rm alt}$ in $N+1$ dimensions. The resulting spinor metrics, distinguished with a tilde, are

$$\tilde\epsilon_N \equiv \epsilon_N \gamma_N = \epsilon_{N-1} , \quad \tilde\epsilon_{{\rm alt},N} \equiv \epsilon_{{\rm alt},N} \, i \gamma_{N+1} = \epsilon_{{\rm alt},N+1} \quad (N \text{ odd}) . \qquad (38.92)$$


The spinor metric $\epsilon$, in any of the forms (38.90) to (38.92), is real and orthogonal, and its square is plus
or minus the unit matrix,


$$\epsilon^\top \epsilon = 1 , \quad \epsilon^2 = \pm 1 , \qquad (38.93)$$
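Table 38.2: Sign of $\gamma_a^\top \epsilon = \pm \epsilon \gamma_a$

N | $\epsilon$: $(-)^{[(N+3)/2]}$ | $\epsilon_{\rm alt}$: $(-)^{[N/2]}$ | $\tilde\epsilon$: $(-)^{[(N+2)/2]}$ ($N$ odd) | $\tilde\epsilon_{\rm alt}$: $(-)^{[(N+1)/2]}$ ($N$ odd)
1 | + | + | - | -
2 | + | - |   |
3 | - | - | + | +
4 | - | + |   |
5 | + | + | - | -
6 | + | - |   |
7 | - | - | + | +
8 | - | + |   |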


where the $\pm$ sign is as tabulated in Table 38.1. The square of the spinor metric coincides with the
symmetry of the spinor metric under exchange of its indices, equation (38.98) below. The spinor metric
matrix $\epsilon$ is always Hermitian,

else, (38.94) 


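Despite the equality of $\epsilon$ and $\prod_i \gamma_{2i-1}$ (or of $\epsilon_{\rm alt}$ and $\prod_i i \gamma_{2i}$) in the representation (38.109), $\epsilon$ (or $\epsilon_{\rm alt}$) is defined to transform as a spinor tensor under rotations, not as an element of the geometric algebra.
In the representation (38.109), the ordering of rows or columns indexed by spinor index $a = a_1 \ldots a_{[N/2]}$ is that of binary numbers $a_{[N/2]} \ldots a_1$ with 0 for up $\uparrow$ and 1 for down $\downarrow$. The components $\epsilon_{ba}$ of the spinor metric $\epsilon$,

$$\epsilon_{ba} \equiv \epsilon_b \cdot \epsilon_a = \epsilon_b^\top \epsilon \, \epsilon_a , \qquad (38.95)$$

are non-vanishing only between basis spinors $\epsilon_a$ and $\epsilon_b$ that are bit flips of each other. The sign of $\epsilon_{\bar{a} a}$, where $\bar{a}$ denotes the bit flip of $a$, follows inductively from equations (38.107), and is

$$\epsilon \, \epsilon_a = {\rm sign}(\epsilon_{\bar{a} a}) \, \epsilon_{\bar{a}} , \quad {\rm sign}(\epsilon_{\bar{a} a}) = {\rm sign}(\epsilon_{\bar{a}_1 \ldots \bar{a}_{[N/2]} \, a_1 \ldots a_{[N/2]}}) = \prod_{a_i = \uparrow} (-)^{i-1} . \qquad (38.96)$$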


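For the alternative spinor metric (38.91), the sign is

$$\epsilon_{\rm alt} \, \epsilon_a = {\rm sign}(\epsilon^{\rm alt}_{\bar{a} a}) \, \epsilon_{\bar{a}} , \quad {\rm sign}(\epsilon^{\rm alt}_{\bar{a} a}) = {\rm sign}(\epsilon^{\rm alt}_{\bar{a}_1 \ldots \bar{a}_{[N/2]} \, a_1 \ldots a_{[N/2]}}) = \prod_{a_i = \uparrow} (-)^{i} . \qquad (38.97)$$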

The spinor metric is symmetric or antisymmetric as its square is positive or negative,

$$\epsilon_{ab} = \pm \epsilon_{ba} , \qquad (38.98)$$

where the $\pm$ sign is as tabulated in Table 38.1.
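The closed-form signs quoted in the header of Table 38.1, $\epsilon^2 = (-)^{[(N+1)/4]}$ and $\epsilon_{\rm alt}^2 = (-)^{[(N+2)/4]}$, can be cross-checked against a direct count. The sketch below assumes only that the factors in the products (38.90) and (38.91) mutually anticommute, that each $\gamma_a$ squares to $+1$, and that each $i \gamma_a$ squares to $-1$; the function names are illustrative.

```python
def sign(exponent):
    """Return (-)^exponent as +1 or -1."""
    return -1 if exponent % 2 else 1

def eps_squared_closed(N):
    """Closed form quoted in Table 38.1 for the standard spinor metric."""
    return sign((N + 1) // 4)

def eps_squared_counted(N):
    """Square of gamma_1 gamma_3 ... : n anticommuting vectors, each squaring to +1."""
    n = (N + 1) // 2                 # number of odd-index basis vectors
    return sign(n * (n - 1) // 2)    # sign picked up on reordering the n factors

def eps_alt_squared_closed(N):
    """Closed form quoted in Table 38.1 for the alternative spinor metric."""
    return sign((N + 2) // 4)

def eps_alt_squared_counted(N):
    """Square of (i gamma_2)(i gamma_4) ... : each factor squares to -1."""
    n = N // 2                       # number of even-index basis vectors
    return sign(n * (n - 1) // 2 + n)

def symmetry_table(n_max=8):
    """Rows (N, sign of eps^2, sign of eps_alt^2) for N = 1 .. n_max."""
    return [(N, eps_squared_closed(N), eps_alt_squared_closed(N))
            for N in range(1, n_max + 1)]
```

By equation (38.98) the same signs decide whether $\epsilon_{ab}$ is symmetric or antisymmetric, so `symmetry_table()` doubles as a table of index symmetries.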
Commuting the spinor metric $\epsilon$ through the orthonormal basis vectors $\gamma_a$ converts them to plus or minus their transposes,

$$\gamma_a^\top \epsilon = \pm \epsilon \gamma_a . \qquad (38.99)$$

Table 38.2 tabulates the sign in equation (38.99) for the spinor metric $\epsilon$ and the alternative spinor metric $\epsilon_{\rm alt}$, along with the tilde'd versions (38.92) for odd $N$. For tilde'd spinor metrics, equation (38.99) holds for all orthonormal basis vectors $\gamma_a$ excepting the scalar vector $\gamma_N$ or $\gamma_{N+1}$, for which there is an extra minus sign, that is, $\gamma_N^\top \tilde\epsilon = -(\pm) \tilde\epsilon \gamma_N$ if the scalar dimension is $\gamma_N$, or $\gamma_{N+1}^\top \tilde\epsilon_{\rm alt} = -(\pm) \tilde\epsilon_{\rm alt} \gamma_{N+1}$ if the scalar dimension is $\gamma_{N+1}$. Equation (38.99) is proved by induction: equations (38.110) and (38.115) imply that if (38.99) has a certain sign in $N-2$ dimensions, then it has the same sign in $N$ dimensions; the sign is then determined at the smallest dimension for which the spinor metric $\epsilon$ is defined, $N = 1$ or 2.

Equation (38.99) implies that the commutation rule of an orthonormal multivector $\gamma_A$ of grade $p$ with the spinor metric $\epsilon$ is

$$\gamma_A^\top \epsilon = (\pm)^p \epsilon \bar\gamma_A = (\pm)^p (-)^{[p/2]} \epsilon \gamma_A , \qquad (38.100)$$

where $\bar\gamma_A$ is the reverse (not conjugate) of $\gamma_A$, and the $\pm$ sign in $(\pm)^p$ is that in equation (38.99), which depends on dimension $N$ as tabulated in Table 38.2.


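4. Scalar product of spinors. Corresponding to any column basis spinor $\epsilon_a$ is a row basis spinor $\epsilon_{a\cdot}$ defined by

$$\epsilon_{a\cdot} \equiv \epsilon_a^\top \epsilon \qquad (38.101)$$

(or by $\epsilon_{a\cdot} \equiv \epsilon_a^\top \epsilon_{\rm alt}$ if the alternative spinor metric is used). The row spinor $\psi\cdot$ corresponding to a column spinor $\psi = \psi^a \epsilon_a$ is

$$\psi\cdot \equiv \psi^\top \epsilon = \psi^a \epsilon_{a\cdot} . \qquad (38.102)$$

The scalar product of row and column spinors is

$$\psi \cdot \chi = \epsilon_{ab} \psi^a \chi^b . \qquad (38.103)$$

The scalar product is symmetric or antisymmetric as the spinor metric is symmetric or antisymmetric,

$$\psi \cdot \chi = \epsilon^2 \, \chi \cdot \psi , \qquad (38.104)$$

the sign of $\epsilon^2$ being as given in Table 38.1.
Linear combinations of outer products $\epsilon_a \epsilon_{b\cdot}$ of basis spinors,

$$\psi \chi\cdot = \psi^a \chi^b \epsilon_a \epsilon_{b\cdot} , \qquad (38.105)$$

form a vector space of dimension $2^{2[N/2]}$. Multiplication of outer products satisfies the associative rule

$$(\psi \chi\cdot)(\varphi \xi\cdot) = (\chi \cdot \varphi) \, \psi \xi\cdot , \qquad (38.106)$$

which since $\chi \cdot \varphi$ is a scalar is proportional to the outer product $\psi \xi\cdot$.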


5. Chiral representation of the super geometric algebra. There is an isomorphism between the algebra of outer products of spinors and the geometric algebra (Brauer and Weyl, 1935). The isomorphism may be established by an explicit representation in terms of column and row vectors for spinors, and matrices for multivectors in the geometric algebra. This part 5 of this Exercise takes the spinor metric to be the standard spinor metric $\epsilon$, equation (38.90). The next part 6 of this Exercise describes the modifications that must be made if the spinor metric is taken to be the alternative spinor metric $\epsilon_{\rm alt}$, equation (38.91).


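The construction below yields the chiral representation, generated inductively starting from $N = 0$. Given a representation of column and row basis spinors $\epsilon_A$ and $\epsilon_{A\cdot}$ in $N-2$ dimensions, a representation of column and row basis spinors $\epsilon_{Aa}$ and $\epsilon_{Aa\cdot}$ (with one extra index $a = \uparrow$ or $\downarrow$) in $N$ dimensions are column and row matrices of length $2^{N/2}$,

$$\epsilon_{A\uparrow} = \begin{pmatrix} \epsilon_A \\ 0 \end{pmatrix} , \quad \epsilon_{A\uparrow\cdot} = \begin{pmatrix} 0 & \epsilon_{A\cdot} \end{pmatrix} , \qquad (38.107a)$$

$$\epsilon_{A\downarrow} = \begin{pmatrix} 0 \\ \epsilon_A \end{pmatrix} , \quad \epsilon_{A\downarrow\cdot} = \begin{pmatrix} (-)^{(N-2)/2} \epsilon_{A\cdot} & 0 \end{pmatrix} , \qquad (38.107b)$$

where 0 represents respectively a zero column or row vector of length $2^{(N-2)/2}$, and the index $N/2$ on $\uparrow$ and $\downarrow$ has been dropped for brevity. The induction starts at $N = 2$ where $A$ is empty and $\epsilon_A = \epsilon_{A\cdot} = 1$. The trailing dot signifies the spinor metric tensor $\epsilon$. The construction (38.107) assumes that the spinor metric $\epsilon$ is a product (38.90) of factors, the last factor $\gamma_{N-1}$ taking the form (38.113), so that the relation between the spinor metric in $N$ and $N-2$ dimensions is given by equation (38.115).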


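The outer products of the column basis spinors $\epsilon_{Aa}$ and row basis spinors $\epsilon_{Bb\cdot}$ given by the inductive relations (38.107) are $2^{N/2} \times 2^{N/2}$ matrices

$$\epsilon_{A\uparrow} \epsilon_{B\uparrow\cdot} = \begin{pmatrix} 0 & \epsilon_A \epsilon_{B\cdot} \\ 0 & 0 \end{pmatrix} , \qquad (38.108a)$$

$$\epsilon_{A\uparrow} \epsilon_{B\downarrow\cdot} = \begin{pmatrix} (-)^{(N-2)/2} \epsilon_A \epsilon_{B\cdot} & 0 \\ 0 & 0 \end{pmatrix} , \qquad (38.108b)$$

$$\epsilon_{A\downarrow} \epsilon_{B\uparrow\cdot} = \begin{pmatrix} 0 & 0 \\ 0 & \epsilon_A \epsilon_{B\cdot} \end{pmatrix} , \qquad (38.108c)$$

$$\epsilon_{A\downarrow} \epsilon_{B\downarrow\cdot} = \begin{pmatrix} 0 & 0 \\ (-)^{(N-2)/2} \epsilon_A \epsilon_{B\cdot} & 0 \end{pmatrix} , \qquad (38.108d)$$

where the 0's in equations (38.108) represent zero $2^{(N-2)/2} \times 2^{(N-2)/2}$ matrices, and the index $N/2$ on $\uparrow$ and $\downarrow$ has again been dropped for brevity. Again, the induction (38.108) starts at $N = 2$ where $A$ and $B$ are empty, and $\epsilon_A \epsilon_{B\cdot} = 1$. The outer products (38.108) can be rewritten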


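$$\epsilon_{A\uparrow} \epsilon_{B\uparrow\cdot} = \frac{1}{\sqrt{2}} \begin{pmatrix} \epsilon_A \epsilon_{B\cdot} & 0 \\ 0 & \pm \epsilon_A \epsilon_{B\cdot} \end{pmatrix} \begin{pmatrix} 0 & \sqrt{2} \\ 0 & 0 \end{pmatrix} , \qquad (38.109a)$$

$$\epsilon_{A\uparrow} \epsilon_{B\downarrow\cdot} = (-)^{(N-2)/2} \, \frac{1}{2} \begin{pmatrix} \epsilon_A \epsilon_{B\cdot} & 0 \\ 0 & \pm \epsilon_A \epsilon_{B\cdot} \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix} , \qquad (38.109b)$$

$$\epsilon_{A\downarrow} \epsilon_{B\uparrow\cdot} = \pm \frac{1}{2} \begin{pmatrix} \epsilon_A \epsilon_{B\cdot} & 0 \\ 0 & \pm \epsilon_A \epsilon_{B\cdot} \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} , \qquad (38.109c)$$

$$\epsilon_{A\downarrow} \epsilon_{B\downarrow\cdot} = \pm (-)^{(N-2)/2} \, \frac{1}{\sqrt{2}} \begin{pmatrix} \epsilon_A \epsilon_{B\cdot} & 0 \\ 0 & \pm \epsilon_A \epsilon_{B\cdot} \end{pmatrix} \begin{pmatrix} 0 & 0 \\ \sqrt{2} & 0 \end{pmatrix} , \qquad (38.109d)$$

where the upper/lower sign is for even/odd $\epsilon_A \epsilon_{B\cdot}$ (that is, the total spin weight $\sum_i s_i$ of $\epsilon_A \epsilon_{B\cdot}$ is even/odd). The first matrix on the right hand sides of equations (38.109) is the matrix representation of the multivector $\epsilon_A \epsilon_{B\cdot}$ in $N$ dimensions in terms of its representation in $N-2$ dimensions,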


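$$\epsilon_A \epsilon_{B\cdot} = (-)^{(N-2)/2} \epsilon_{A\uparrow} \epsilon_{B\downarrow\cdot} \pm \epsilon_{A\downarrow} \epsilon_{B\uparrow\cdot} = \begin{pmatrix} \epsilon_A \epsilon_{B\cdot} & 0 \\ 0 & \pm \epsilon_A \epsilon_{B\cdot} \end{pmatrix} . \qquad (38.110)$$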


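The rightmost factors in equations (38.109) constitute the matrix representations of $\gamma_+$, $\gamma_+ \gamma_-$, $\gamma_- \gamma_+$, and $\gamma_-$ in $N$ dimensions,

$$\gamma_+ = \begin{pmatrix} 0 & \sqrt{2} \\ 0 & 0 \end{pmatrix} , \quad \gamma_+ \gamma_- = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix} , \quad \gamma_- \gamma_+ = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} , \quad \gamma_- = \begin{pmatrix} 0 & 0 \\ \sqrt{2} & 0 \end{pmatrix} , \qquad (38.111)$$

which have the correct normalization and commutation rules with respect to each other. The signs in equations (38.109) are arranged so that the correct commutation rules of the geometric algebra are respected: $\gamma_+$ and $\gamma_-$, which are odd, commute/anticommute with $\epsilon_A \epsilon_{B\cdot}$ according as the latter is even/odd; and $\gamma_+ \gamma_-$ and $\gamma_- \gamma_+$, which are even, always commute with $\epsilon_A \epsilon_{B\cdot}$. In terms of scalar and wedge products, the multivectors $\gamma_+ \gamma_-$ and $\gamma_- \gamma_+$ in equations (38.111) are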


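$$\gamma_\pm \gamma_\mp = \gamma_+ \cdot \gamma_- \pm \gamma_+ \wedge \gamma_- , \quad \gamma_+ \cdot \gamma_- = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \quad \gamma_+ \wedge \gamma_- = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \qquad (38.112)$$

Note that $\gamma_+ \wedge \gamma_- = -i \gamma_{N-1} \wedge \gamma_N$, so that $(\gamma_+ \wedge \gamma_-)^2 = 1$. The orthonormal basis vectors $\gamma_{N-1}$ and $\gamma_N$ at the $(N/2)$'th step are

$$\gamma_{N-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \gamma_N = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad (38.113)$$

which are traceless, unitary, and Hermitian. The orthonormal basis bivector $\gamma_{N-1} \wedge \gamma_N$ is

$$\gamma_{N-1} \wedge \gamma_N = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} , \qquad (38.114)$$
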
which is traceless, unitary, and skew-Hermitian. An iterative expression for the spinor metric $\epsilon_N$ follows from its expression (38.90) as a product of basis vectors, and is the antidiagonal matrix

$$\epsilon_N = \epsilon_{N-2} \gamma_{N-1} = \begin{pmatrix} \epsilon_{N-2} & 0 \\ 0 & (-)^{(N-2)/2} \epsilon_{N-2} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & \epsilon_{N-2} \\ (-)^{(N-2)/2} \epsilon_{N-2} & 0 \end{pmatrix} . \qquad (38.115)$$

The left factor in the third expression of equations (38.115) is the matrix representation of $\epsilon_{N-2}$ in $N$ dimensions in terms of its representation in $N-2$ dimensions, in accordance with equation (38.110). The factor of $(-)^{(N-2)/2}$ comes from the fact that the spinor metric $\epsilon_{N-2}$ is a product of $(N-2)/2$ basis vectors, equation (38.90), so is even/odd (total spin weight even/odd) as $(N-2)/2$ is even or odd. Equation (38.115), which was assumed in the initial step (38.107) of the construction of the chiral representation of the super geometric algebra, proves the consistency of the construction.

The matrix representation of the column and row basis spinors (38.107) and of their outer products (38.109) is entirely real (with respect to $i$). The expansion of the $2^N$ outer products $\epsilon_a \epsilon_{b\cdot}$ of spinors in terms of the $2^N$ basis multivectors $\gamma_A$ of the spacetime algebra, and vice versa, define the matrix of coefficients $\gamma^A_{ab}$ and its inverse $\gamma_A^{ab}$,

$$\epsilon_a \epsilon_{b\cdot} = \gamma^A_{ab} \gamma_A , \quad \gamma_A = \gamma_A^{ab} \epsilon_a \epsilon_{b\cdot} . \qquad (38.116)$$
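The coefficients $\gamma^A_{ab}$ and $\gamma_A^{ab}$ in the chiral representation are

$$\gamma^A_{ab} = \frac{1}{2^{[N/2]}} \, \epsilon_{b\cdot} \gamma^A \epsilon_a , \quad \gamma_A^{ab} = {\rm sign}(\epsilon^2) \, \epsilon^{a\cdot} \gamma_A \epsilon^b , \qquad (38.117)$$

where ${\rm sign}(\epsilon^2)$ is the symmetry of the spinor metric, Table 38.1.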
For even $N$, the above construction establishes an isomorphism between outer products of spinors and the geometric algebra,

$$\text{outer products of spinors} \cong \text{geometric algebra} \quad (N \text{ even}) . \qquad (38.118)$$

Both spaces are complex $2^N$-dimensional vector spaces. Their representation as $2^{N/2} \times 2^{N/2}$ dimensional matrices is minimal: there is no representation of the geometric algebra with matrices of smaller dimension.
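The inductive construction of part 5 can be checked numerically for small even $N$. The Python sketch below builds the orthonormal vectors $\gamma_a$ by the step $N-2 \to N$, assuming (as the block-diagonal form in equation (38.110) requires for odd elements) that a vector $\gamma$ of the $(N-2)$-dimensional algebra lifts to the block matrix ${\rm diag}(\gamma, -\gamma)$, and taking $\gamma_{N-1}$, $\gamma_N$ from equation (38.113). It then forms $\epsilon$ from equation (38.90) and tests the Clifford relations, the sign of $\epsilon^2$ (Table 38.1), the sign in $\gamma_a^\top \epsilon = \pm \epsilon \gamma_a$ (Table 38.2), and the recursion (38.115). All function names are illustrative.

```python
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    """Kronecker product; A indexes the outer (most significant) block."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def chiral_gammas(N):
    """Orthonormal vectors gamma_1..gamma_N for even N, built two dimensions at a time."""
    gammas, dim = [], 1
    for _ in range(N // 2):
        gammas = [kron(S3, g) for g in gammas]   # ASSUMED lift of odd elements: diag(g, -g)
        gammas.append(kron(S1, identity(dim)))   # gamma_{N-1}, equation (38.113)
        gammas.append(kron(S2, identity(dim)))   # gamma_N, equation (38.113)
        dim *= 2
    return gammas

def spinor_metric(N):
    """epsilon = gamma_1 gamma_3 ... gamma_{N-1}, equation (38.90)."""
    eps = identity(2 ** (N // 2))
    for g in chiral_gammas(N)[0::2]:
        eps = matmul(eps, g)
    return eps

def check(N):
    """True if the representation passes all four consistency tests."""
    g, eps, one = chiral_gammas(N), spinor_metric(N), identity(2 ** (N // 2))
    clifford = all(add(matmul(a, b), matmul(b, a)) == scale(2 if i == j else 0, one)
                   for i, a in enumerate(g) for j, b in enumerate(g))
    square = matmul(eps, eps) == scale((-1) ** ((N + 1) // 4), one)
    table_sign = (-1) ** ((N + 3) // 2)
    transposes = all(matmul(transpose(a), eps) == scale(table_sign, matmul(eps, a)) for a in g)
    corner = [[0, 1], [(-1) ** ((N - 2) // 2), 0]]
    recursion = eps == kron(corner, spinor_metric(N - 2))
    return clifford and square and transposes and recursion
```

Because every matrix entry is one of $0$, $\pm 1$, $\pm i$, the comparisons are exact, and `check(N)` can be run for $N = 2, 4, 6$ without any tolerance.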

6. Chiral representation of the super geometric algebra using the alternative spinor metric.
The chiral representation of the super geometric algebra with the alternative spinor metric (38.91) is
the same as the construction in part 5, but with the replacement


$$(-)^{(N-2)/2} \to (-)^{N/2} \qquad (38.119)$$


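in equations (38.107) to (38.110). Analogously to equation (38.115), an iterative equation for the alternative spinor metric follows from its expression (38.91) as a product of basis vectors, and is the antidiagonal matrix

$$\epsilon_{{\rm alt},N} = \epsilon_{{\rm alt},N-2} \, i \gamma_N = \begin{pmatrix} \epsilon_{{\rm alt},N-2} & 0 \\ 0 & (-)^{(N-2)/2} \epsilon_{{\rm alt},N-2} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & \epsilon_{{\rm alt},N-2} \\ (-)^{N/2} \epsilon_{{\rm alt},N-2} & 0 \end{pmatrix} . \qquad (38.120)$$
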
7. Super geometric algebra in odd dimensions, version 1. The construction of the super geometric algebra in part 5 works in even dimensions $N$. What about $N$ odd? One approach, dealt with in this part, is to project the odd-dimensional algebra into one lower dimension, which requires identifying the chiral operator $\varkappa_N$ with 1, equation (38.125). The resulting algebra of outer products of spinors, besides not yielding the full odd-$N$ geometric algebra, does not include a parity operator. A richer approach, put forward in part 10, is to embed the odd-dimensional algebra in the algebra with one higher dimension, and to treat the extra dimension as a scalar, which proves to be a parity operator.
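Consider that the pseudoscalar $I_N$ of the geometric algebra can be written

$$I_N \equiv \gamma_1 \wedge \cdots \wedge \gamma_N = i^{[N/2]} \varkappa_N , \qquad (38.121)$$

where the chiral operator $\varkappa_N$ (the generalization of the 4D Dirac chiral operator $\gamma_5$) is defined by

$$\varkappa_N \equiv \gamma_{+1} \wedge \gamma_{-1} \wedge \cdots \wedge \gamma_{+[N/2]} \wedge \gamma_{-[N/2]} \, \{ \wedge \, \gamma_N \text{ if } N \text{ is odd} \} . \qquad (38.122)$$

In the chiral representation (38.109), the representation of the chiral operator $\varkappa_N$ in $N$ even dimensions in terms of its representation $\varkappa_{N-2}$ in $N-2$ dimensions is the diagonal matrix

$$\varkappa_N = \begin{pmatrix} \varkappa_{N-2} & 0 \\ 0 & -\varkappa_{N-2} \end{pmatrix} \quad (N \text{ even}) . \qquad (38.123)$$

The chiral operator is diagonal in the chiral representation by construction. The square of the pseudoscalar is $I_N^2 = (-)^{[N/2]}$, equation (13.21), so the square of the chiral operator is the unit matrix 1,

$$\varkappa_N^2 = 1 . \qquad (38.124)$$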


Like the pseudoscalar $I_N$, the chiral operator $\varkappa_N$ is invariant under rotations. For even $N$, the chiral operator $\varkappa_N$ is defined through equation (38.122) as a prescribed member of both algebras, the algebra of spinor outer products and the geometric algebra. But for odd $N$, since the definition (38.122) involves $\gamma_N$ which (as yet) has no expression in the algebra of outer products of spinors, there is the possibility that $\varkappa_N$ could be a distinct element not belonging to the algebra of spinor outer products. The element $\varkappa_N$ is a rotationally invariant scalar that squares to 1, and that (for odd $N$) commutes with all basis vectors $\gamma_a$. The other element of the odd-$N$ algebra of spinor outer products that possesses those properties is (up to a possible sign) the unit element. Thus if the chiral operator $\varkappa_N$ is identified with 1,

$$\varkappa_N = 1 \quad (N \text{ odd}) , \qquad (38.125)$$

then there is an isomorphism between the algebra of outer products of spinors in $N-1$ dimensions and the geometric algebra in $N$ dimensions modulo the chiral operator $\varkappa_N$,

$$\text{outer products of spinors} \cong \text{geometric algebra} \ ({\rm mod}\ \varkappa_N) \quad (N \text{ odd}) . \qquad (38.126)$$

Given the identification (38.125) of the chiral operator with 1, it then follows from the definition equation (38.122) of $\varkappa_N$ that the final element $\gamma_N$ of the geometric algebra is

$$\gamma_N = \varkappa_{N-1} = \gamma_{+1} \wedge \gamma_{-1} \wedge \cdots \wedge \gamma_{+[N/2]} \wedge \gamma_{-[N/2]} \quad (N \text{ odd}) . \qquad (38.127)$$

In the case $N = 3$, this gives

$$\gamma_3 = \varkappa_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \qquad (38.128)$$

in agreement with the Pauli matrix equation (38.8). With the identification (38.125), the pseudoscalar $I_N$ itself is, equation (38.121),

$$I_N = i^{[N/2]} \quad (N \text{ odd}) . \qquad (38.129)$$


For odd $N$, the chiral operator $\varkappa_N$ defined by equation (38.122) is (before $\varkappa_N$ is identified with 1) an odd element of the geometric algebra. Thus for odd $N$, the odd part of the geometric algebra is isomorphic to $\varkappa_N$ times the even geometric algebra. Only the odd geometric algebra is affected by the identification (38.125) of the chiral operator with unity; the even geometric algebra is unaffected. The square of the chiral operator is always 1, equation (38.124), so the product of two odd multivectors yields the correct even multivector regardless of the identification (38.125).

The imaginary $i$ was introduced already in the very first step (38.82) of the construction of the super geometric algebra. One might ask where that imaginary came from. An intriguing observation is that if $N$ is odd and $[N/2]$ is odd (thus $N = 3, 7, 11, \ldots$), then the pseudoscalar $I_N$ squares to $-1$ and commutes with all elements of the geometric algebra, just like the imaginary $i$. One might take the view that maybe that's where $i$ comes from. Taking the view that $I_N$ is indeed the imaginary is equivalent to identifying the chiral operator $\varkappa_N$ with unity, equation (38.125), in which case $i$ is, up to a sign, the pseudoscalar $I_N$, equation (38.129).

In summary, the algebra of spinor outer products in $2[N/2]$ dimensions is isomorphic to the geometric algebra for both even and odd $N$, modulo $\varkappa_N$ in the case of odd $N$. The algebra is a complex (with respect to $i$) vector space of dimension $2^{2[N/2]}$, represented in the chiral construction (38.109) by $2^{[N/2]} \times 2^{[N/2]}$ matrices. For example, the $N = 2$ geometric algebra is the complex vector space generated by $1$, $\gamma_+$, $\gamma_-$, $\gamma_+ \wedge \gamma_-$, while the $N = 3$ geometric algebra (the Pauli algebra) is the complex vector space generated by $1$, $\gamma_+$, $\gamma_-$, $\gamma_3$, the pseudoscalar $I_3$ being identified with the imaginary $i$.
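The $N = 3$ case can be verified directly with $2 \times 2$ matrices. The sketch below takes $\gamma_1$ and $\gamma_2$ from equation (38.113), forms $\varkappa_2 = -i \gamma_1 \gamma_2$ (which follows from $I_2 = i \varkappa_2$, equation (38.121)), sets $\gamma_3 = \varkappa_2$ as in equation (38.127), and evaluates the pseudoscalar $I_3 = \gamma_1 \gamma_2 \gamma_3$. The variable and helper names are illustrative.

```python
G1 = [[0, 1], [1, 0]]        # gamma_1, equation (38.113)
G2 = [[0, -1j], [1j, 0]]     # gamma_2, equation (38.113)

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

# Chiral operator in N = 2, from I_2 = gamma_1 gamma_2 = i * kappa_2.
KAPPA_2 = scale(-1j, matmul(G1, G2))

# Final basis vector of the N = 3 algebra, equation (38.127).
G3 = KAPPA_2

# Pseudoscalar of the N = 3 algebra.
I3 = matmul(matmul(G1, G2), G3)
```

Here `G3` is the diagonal matrix with entries $1, -1$ of equation (38.128), it anticommutes with `G1` and `G2`, and `I3` is $i$ times the unit matrix, as in equation (38.129).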
8. Extra symmetry of the super geometric algebra in odd dimensions. Given that, if $\varkappa_N$ is identified with 1, the geometric algebra for odd $N$ is isomorphic to the geometric algebra for even $N-1$, what is the difference between the two algebras? Since the algebras are isomorphic, there is of course no difference. However, bivectors are special in that they are the only generators that generate transformations that preserve grade, and therefore correspond to what one usually thinks of as spatial rotations. If one restricts only to rotations generated by bivectors, then the odd algebra has a higher degree of symmetry. The equivalence (38.127) means that the pseudoscalar $\varkappa_{N-1}$ in the even algebra is promoted to a vector $\gamma_N$ in the odd algebra, and pseudovectors $\gamma_a \varkappa_{N-1}$ in the even algebra become bivectors $\gamma_a \wedge \gamma_N$ in the odd algebra. Thus the odd algebra has $N-1$ more rotations than the even algebra.

The final basis vector $\gamma_N = \varkappa_{N-1}$ of the odd algebra has the same properties as the other orthonormal basis vectors $\gamma_1$ to $\gamma_{N-1}$: its square is 1, it anticommutes with the other orthonormal basis vectors, it is represented by a traceless, unitary, Hermitian matrix, and its reverse is (by definition) itself, $\bar\gamma_N = \gamma_N$. And, like the other orthonormal basis vectors $\gamma_{2i-1}$ of odd index, the representation of $\gamma_N$ is real.
The Pauli algebra (13.115) in $N = 3$ dimensions offers a familiar example. In both 2 and 3 dimensions there are just 2 basis spinors, $\epsilon_\uparrow$ and $\epsilon_\downarrow$, which one commonly conceptualizes as being up and down along a "3-axis". But whereas in 2 dimensions there is just one rotation, generated by the bivector $\gamma_1 \wedge \gamma_2$ (rotation about the "3-axis"), in 3 dimensions there are 2 more rotations, generated by the bivectors $\gamma_2 \wedge \gamma_3$ and $\gamma_3 \wedge \gamma_1$ (rotations about the "1-axis" and "2-axis").
9. Parity reversal. A second approach to the odd-$N$ algebra is put forward in the next part 10, but first it is necessary to consider the issue of parity reversal. Parity reversal is the operation of reflecting an odd number of spatial axes $\gamma_a$, corresponding to an improper rotation with determinant $-1$. By contrast, reflecting an even number of axes can be accomplished by a continuous rotation with determinant 1.
If the number $N$ of dimensions is even, then parity reversal may be realised by picking one particular axis, say $P = \gamma_N$, and transforming spinors $\psi$ and multivectors $a$ by

$$P : \ \psi \to P \psi , \quad a \to P a P . \qquad (38.130)$$

The transformation (38.130) reflects all axes except the axis $P = \gamma_N$, so reflects an odd number of axes provided that $N$ is even.

If the number $N$ of dimensions is odd, and if the geometric algebra is projected into one dimension lower as proposed in part 7, equation (38.125), then there is no element of the geometric algebra that accomplishes parity reversal $P$ by the operation (38.130). The difficulty is that any anticommutation of $P$ with a basis vector $\gamma_a$ is cancelled by a corresponding anticommutation with the final basis vector $\gamma_N \propto \gamma_1 \cdots \gamma_{N-1}$, for no net anticommutation. The absence of a parity operator in the geometric algebra holds true even if the odd-dimensional chiral operator $\varkappa_N$ is not identified with unity, since all vectors commute with the odd-dimensional chiral operator. The problem of constructing an odd-$N$ super geometric algebra that incorporates a parity operator is solved in the next part 10.

10. Super geometric algebra in odd dimensions, version 2. The previous part 9 brought up the fact that the geometric algebra in odd $N$ dimensions does not contain a parity operator $P$, at least if the path proposed in part 7 is followed, that is, if the odd-$N$ algebra is projected into one lower dimension.

The problem is not that the operation of parity reversal does not exist, but rather, how to construct such a parity operator out of products of spinors.

The solution is to embed the odd $N$-dimensional algebra in the even $(N+1)$-dimensional algebra, and to treat either the final (odd) dimension $\gamma_N$ or the extra (even) orthonormal dimension $\gamma_{N+1}$ as the scalar parity operator $P$,

$$P = \gamma_N \ \text{ or } \ \gamma_{N+1} . \qquad (38.131)$$

The vectors $\gamma_N$ or $\gamma_{N+1}$ have the usual property that they anticommute with all orthonormal vectors $\gamma_a$ other than themselves, so the parity operator $P$ defined by equation (38.131) has the property that it reflects all axes except itself,

$$P : \ \gamma_a \to P \gamma_a P = -\gamma_a . \qquad (38.132)$$

Since $N$ is odd, this choice of $P$ reflects an odd number of axes, so indeed reverses parity. The operation $P$ of reflecting all axes (other than the scalar axis $\gamma_N$ or $\gamma_{N+1}$) is rotationally invariant with respect to rotations in $N$ dimensions (with the scalar axis $\gamma_N$ or $\gamma_{N+1}$ fixed).

As usual, there is a spin bit (the $[(N+1)/2]$'th bit) associated with the pair $\gamma_N$ and $\gamma_{N+1}$ of axes. Normally a rotation in the $\gamma_N \wedge \gamma_{N+1}$ plane would rotate spinors by a phase $e^{\mp i\theta/2}$ with sign $\mp$ depending on whether the spin bit is up $\uparrow$ or down $\downarrow$. But since $P$ is a scalar, there is no such rotation. Notwithstanding the absence of a rotation by a phase, the spin bit is still there, part of the bitcode index $a = a_1 \ldots a_{[(N+1)/2]}$ of a basis spinor $\epsilon_a$.


11. Properties of orthonormal basis multivectors in the chiral representation. In the chiral representation constructed in part 5, all orthonormal basis vectors $\gamma_a$, and all orthonormal basis $p$-vectors $\gamma_{a_1 \ldots a_p} \equiv \gamma_{a_1} \wedge \cdots \wedge \gamma_{a_p}$, are traceless (except for the unit basis element 1), unitary, and either Hermitian (if $[p/2]$ is even, i.e. $p = 0, 1, 4, 5, \ldots$) or skew-Hermitian (if $[p/2]$ is odd, i.e. $p = 2, 3, 6, 7, \ldots$) $2^{[N/2]} \times 2^{[N/2]}$ matrices. All matrices have determinant 1, except that for $N = 2$ the vectors (grade $p = 1$) have determinant $-1$. The unit element is represented by the unit matrix. Most of these assertions can be proved by induction using the expression (38.110), which gives the representation of a multivector in $N$ dimensions in terms of its representation in $N-2$ dimensions.


12. Right- and left-handed chiral subalgebras in even dimensions. In even $N$ dimensions, a spinor is said to be right- or left-handed depending on whether its chirality is even or odd. A basis spinor $\epsilon_a$ is right- or left-handed depending on whether the number of spin flips of the index $a = a_1 \ldots a_{[N/2]}$, relative to the all-up index $\uparrow\uparrow\ldots\uparrow$, is even or odd,

$$\varkappa_N \epsilon_a = \Big( \prod_{a_i = \downarrow} (-) \Big) \epsilon_a . \qquad (38.133)$$

In other words, a basis spinor $\epsilon_a$ is right- or left-handed as the number of down $\downarrow$ indices is even or odd. In even $N$ dimensions, the chirality of a spinor is invariant under rotations.

In odd $N$ dimensions, if the path proposed in part 7 is followed, where the algebra is projected into one lower dimension, which requires identifying the chiral operator $\varkappa_N$ with unity, equation (38.125), then rotations mix right- and left-handed spinors, and chirality is not a rotationally invariant property of spinors.

If on the other hand $N$ is odd and the path proposed in part 10 is followed, where the algebra is embedded in one higher dimension, then a basis spinor $\epsilon_a$ has $[(N+1)/2]$ bits, and its chirality is that of the algebra in one higher dimension. The chirality operator is $\varkappa_{N+1}$. In the rest of this part of this Exercise, replace $N$ by $N+1$ if $N$ is odd and part 10 is followed.

Right- and left-handed chiral multivectors are eigenvectors of the chiral operator $\varkappa_N$ (or $\varkappa_{N+1}$ if $N$ is odd and part 10 is followed), with eigenvalues $\pm 1$,

$$\varkappa_N \, a_{R \atop L} = \pm a_{R \atop L} . \qquad (38.134)$$

Right- and left-handed chirality projection operators $P_{R \atop L}$ may be defined by

$$P_{R \atop L} \equiv \tfrac{1}{2} (1 \pm \varkappa_N) = \tfrac{1}{2} \big( 1 \pm i^{-[N/2]} I_N \big) , \qquad (38.135)$$

which are projection operators because each is idempotent, $(P_{R \atop L})^2 = P_{R \atop L}$, and their product is zero, $P_R P_L = 0$. A multivector $a$ splits into right- and left-handed chiral parts,

$$a = a_R + a_L , \quad a_{R \atop L} \equiv P_{R \atop L} \, a . \qquad (38.136)$$

Since the chiral operator $\varkappa_N$ is proportional to the pseudoscalar $I_N$, a purely right- or left-handed multivector is necessarily a linear combination of a multivector and its Hodge dual.

An outer product of a right-handed column spinor with any row spinor (right- or left-handed) is a 
right-handed multivector. An outer product of a left-handed column spinor with any row spinor is a 
left-handed multivector. 

Equations (38.109) provide a matrix representation of the isomorphism between spinor outer products and multivectors. To make the split into right- and left-handed algebras more transparent, it can be convenient to permute the rows and columns of the matrices so that the chiral operator $\varkappa_N$ is represented by the matrix with all positive diagonal entries $+1$ coming first, and all negative diagonal entries $-1$ coming last (for example, this is the ordering adopted for Dirac spinors in $N = 4$ dimensions, equation (39.20)),

$$\varkappa_N = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \qquad (38.137)$$

The 0's and 1's represent zero and unit $2^{[N/2]-1} \times 2^{[N/2]-1}$ matrices. There are many ways to accomplish the permutation. Since the chirality of a basis spinor $\epsilon_a$ is right- or left-handed as the number of down bits in the index $a$ is even or odd, equation (38.133), one possibility is to reorder the rows and columns on a single bit, say the first bit $a_1$ of the index $a$, leaving the ordering with respect to all other bits unchanged. The ordering on the chosen single bit is such that the index with total number of down bits even (right-handed) joins the first $2^{[N/2]-1}$ indices, while the index with total number of down bits odd (left-handed) joins the last $2^{[N/2]-1}$ indices.

The result of the permutation is that the matrix representation of a multivector is block diagonal with all right-handed chiral multivectors in the top half, all left-handed chiral multivectors in the bottom half, and with all even multivectors on-diagonal and all odd multivectors off-diagonal,

$$\text{multivector} = \begin{pmatrix} R_{\rm even} & R_{\rm odd} \\ L_{\rm odd} & L_{\rm even} \end{pmatrix} . \qquad (38.138)$$

The splitting into even and odd multivectors follows because the chiral operator $\varkappa_N$ commutes with all
even multivectors but anticommutes with all odd multivectors.
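The bit-counting rule (38.133) and the projectors (38.135) are easy to tabulate, since the chiral operator is diagonal in the chiral representation. The sketch below assumes the row ordering described after equation (38.94), namely that a basis spinor is labelled by an integer whose binary digits are the bits of its bitcode with 0 for up and 1 for down, and it builds the diagonal of $\varkappa_N$ by iterating equation (38.123). Function names are illustrative.

```python
def chirality(index):
    """+1 (right-handed) or -1 (left-handed) for the basis spinor labelled `index`.

    ASSUMES each binary digit of `index` is one bit of the bitcode, 0 = up, 1 = down,
    so the chirality is fixed by the parity of the number of down bits, equation (38.133).
    """
    return -1 if bin(index).count("1") % 2 else 1

def chiral_diagonal(nbits):
    """Diagonal entries of the chiral operator, iterating kappa_N = diag(kappa, -kappa)."""
    diag = [1]
    for _ in range(nbits):
        diag = diag + [-d for d in diag]
    return diag

def projectors(nbits):
    """Diagonal entries of P_R = (1 + kappa)/2 and P_L = (1 - kappa)/2, equation (38.135)."""
    kappa = chiral_diagonal(nbits)
    p_right = [(1 + k) // 2 for k in kappa]
    p_left = [(1 - k) // 2 for k in kappa]
    return p_right, p_left

def right_handed_first(nbits):
    """One possible reordering of spinor labels with right-handed labels first, cf. (38.137)."""
    labels = range(2 ** nbits)
    return [a for a in labels if chirality(a) > 0] + [a for a in labels if chirality(a) < 0]
```

For two bits ($N = 4$), `chiral_diagonal(2)` is `[1, -1, -1, 1]`. The helper `right_handed_first` is only one of the many permutations the text allows, not the single-bit reordering it describes.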

13. Pure grade components of spinor outer products. An outer product $\chi \psi\cdot$ of spinors is a multivector, and its grade $p$ component may be denoted in the usual way, equation (13.27),

$$\langle \chi \psi\cdot \rangle_p . \qquad (38.139)$$

The trace of the outer product is the scalar product,

$${\rm Tr}(\chi \psi\cdot) = \psi \cdot \chi . \qquad (38.140)$$

The grade 0 component of the outer product $\chi \psi\cdot$ is the scalar product $\psi \cdot \chi$ multiplied by the $2^{[N/2]} \times 2^{[N/2]}$ unit matrix 1 normalized by the reciprocal of its trace, ${\rm Tr}\, 1 = 2^{[N/2]}$ (the 1 in equations (38.141) to (38.143) denotes the unit matrix),

$$\langle \chi \psi\cdot \rangle_0 = (\psi \cdot \chi) \, \frac{1}{{\rm Tr}\, 1} . \qquad (38.141)$$

If $a$ is a multivector of grade $p$, then the scalar sequence $\psi \cdot a \chi$, multiplied by the normalized unit matrix, may be re-expressed as the scalar product of $a$ with the grade $p$ part of $\chi \psi\cdot$,

$$(\psi \cdot a \chi) \, \frac{1}{{\rm Tr}\, 1} = \langle a \chi \psi\cdot \rangle_0 = a \cdot \langle \chi \psi\cdot \rangle_p . \qquad (38.142)$$

The Hodge dual of the grade $p$ multivector $a$ is $I_N a$, and the scalar sequence $\psi \cdot I_N a \chi$, multiplied by the normalized unit matrix, may be re-expressed as the Hodge dual of the wedge product of $a$ with the grade $N-p$ part of $\chi \psi\cdot$,

$$(\psi \cdot I_N a \chi) \, \frac{1}{{\rm Tr}\, 1} = (I_N a) \cdot \langle \chi \psi\cdot \rangle_{N-p} = I_N \big( a \wedge \langle \chi \psi\cdot \rangle_{N-p} \big) . \qquad (38.143)$$


14. Conjugation. The rotationally-invariant conjugation operator $C$ is defined such that commutation with it converts rotors $R$ in the chiral representation (38.109) to their complex conjugates (with respect to $i$) (compare equation (38.64)),

$$C R^* = R C . \qquad (38.144)$$

Note that since a rotor $R$ is a real linear combination of even orthonormal basis multivectors, the complex conjugate $R^*$ of a rotor $R$ is a rotor. The complex conjugate $\psi^*$ of a spinor $\psi$ is defined to be its complex conjugate (with respect to $i$) in the representation (38.109), where the basis spinors $\epsilon_a$ are real column vectors,

$$\psi^* \equiv \psi^{a*} \epsilon_a . \qquad (38.145)$$

The conjugate spinor $\bar\psi$ of a spinor $\psi = \psi^a \epsilon_a$ is defined by, equation (38.63),

$$\bar\psi \equiv C \psi^* = C \psi^{a*} \epsilon_a . \qquad (38.146)$$

The condition (38.144) on the conjugation operator $C$ is imposed precisely so that the conjugate spinor $\bar\psi$ rotates under a rotor $R$ in the same way as the spinor $\psi$,

$$R : \ \bar\psi = C \psi^* \to C (R \psi)^* = C R^* \psi^* = R C \psi^* = R \bar\psi . \qquad (38.147)$$


A necessary and sufficient condition for (38.144) to hold is that $C$ commute with all real (with respect to $i$) orthonormal bivectors, and anticommute with all imaginary orthonormal bivectors. This is the same condition that previously the spinor metric tensor $\epsilon$ was required to satisfy, so $C$ must equal $\epsilon$ (up to a possible normalization factor),

$$C \equiv \epsilon = \prod_{i=1}^{[(N+1)/2]} \gamma_{2i-1} . \qquad (38.148)$$

If the alternative spinor metric (38.91) is used, then the conjugation operator is

$$C_{\rm alt} \equiv \epsilon_{\rm alt} = \prod_{i=1}^{[N/2]} i \gamma_{2i} . \qquad (38.149)$$

Choosing $C = \epsilon$ (or $C_{\rm alt} = \epsilon_{\rm alt}$) without any additional normalization factor ensures that the scalar product $\bar\psi \cdot \psi$ of a spinor with its own conjugate is real and positive, equation (38.154). There is no loss of generality in imposing that $\epsilon$, hence $C$, be real. If $\epsilon$ were multiplied by an arbitrary complex phase, then the conjugation operator would have to be defined by $C \equiv \epsilon^*$ in place of the definition (38.148), in order that the scalar product of a spinor with its conjugate remain real and positive, equation (38.154). The modification by a phase leaves various key results unchanged; for example the double conjugate of a spinor, equation (38.151), becomes $\bar{\bar\psi} = C C^* \psi$, which is unaffected by a complex phase in $C$.
The conjugate of a basis spinor €a is 


Ea, = Ce, = +65, (38.150) 


where the conjugate index $\bar a$ is the index $a$ with all bits flipped. Conjugation flips the chirality of a spinor
if $[N/2]$ is odd, and leaves the chirality unchanged if $[N/2]$ is even. The $\pm$ sign in equation (38.150) is as
given by equation (38.96), or by equation (38.97) if the alternative spinor metric is used. Conjugation
flips all the bits of a spinor; for example, the conjugate of the all-up basis spinor is the all-down basis
spinor, $\bar\epsilon_{\uparrow\uparrow\ldots\uparrow} = \pm \epsilon_{\downarrow\downarrow\ldots\downarrow}$. The conjugate spinor $\bar\psi$ of a spinor $\psi$ is, equation (38.146),
$\bar\psi = \psi^{a*} \bar\epsilon_a$. The double conjugate of a spinor is

\[ \bar{\bar\psi} = C \bar\psi^* = \varepsilon^2 \psi , \tag{38.151} \]

where the sign $\varepsilon^2$ is as given in Table 38.1.


The scalar product of a conjugate spinor $\bar\psi$ with a spinor $\chi$ is

\[ \bar\psi \cdot \chi = (C\psi^*)^\top \varepsilon \chi = \psi^\dagger C^\top \varepsilon \chi = \psi^\dagger \chi , \tag{38.152} \]

which is a complex number. In particular, the scalar product of a conjugate basis spinor $\bar\epsilon_a$ with a basis
spinor $\epsilon_b$ is a Kronecker delta,

\[ \bar\epsilon_a \cdot \epsilon_b = \delta_{ab} . \tag{38.153} \]

The scalar product of a spinor $\psi$ with its own conjugate is real and positive,

\[ \bar\psi \cdot \psi = \psi^\dagger \psi . \tag{38.154} \]

The scalar product of $\bar\psi$ with its conjugate is the same as the scalar product (38.154) of $\psi$ with its
conjugate,

\[ \bar{\bar\psi} \cdot \bar\psi = \varepsilon^2 \psi \cdot \bar\psi = \bar\psi \cdot \psi . \tag{38.155} \]

The complex conjugate (with respect to $i$) $a^*$ of a multivector $a = a^A \gamma_A$ is defined to be its complex
conjugate in the chiral representation (38.109) of multivectors,

\[ a^* \equiv a^{A*} \gamma_A^* . \tag{38.156} \]


In the representation (38.109), the spin basis vectors $\gamma_{\pm i}$ (and the final vector $\gamma_N$ if $N$ is odd) are real,
so the orthonormal basis vectors $\gamma_{2i-1}$ and $\gamma_{2i}$ are respectively real and imaginary. The conjugate $\bar a$
of a multivector $a = a^A \gamma_A$ is defined to be, consistent with the definition (38.146) of the conjugate
of a spinor (do not confuse the conjugate multivector $\bar a$ with the reverse multivector $\overline{a}$; the conjugate
overbar is slightly smaller and thinner than the reverse overbar),

\[ \bar a \equiv C a^* C^{-1} . \tag{38.157} \]

The conjugate multivector $\bar a$ rotates under a rotor $R$ in the same way as the multivector $a$,

\[ R: \ \bar a = C a^* C^{-1} \rightarrow C (R a \overline{R})^* C^{-1} = C R^* a^* \overline{R}^* C^{-1} = R C a^* C^{-1} \overline{R} = R \bar a \overline{R} . \tag{38.158} \]

Conjugation is multiplicative over multivectors, and over multivectors with spinors,

\[ \bar{ab} = \bar a \, \bar b , \quad \bar{a\psi} = \bar a \, \bar\psi . \tag{38.159} \]

If the outer product of two spinors $\psi$ and $\chi$ equals the multivector $a$, then the outer product of conjugate
spinors $\bar\psi$ and $\bar\chi$ equals the conjugate multivector $\bar a$,

\[ \psi \chi \cdot = a , \quad \bar\psi \bar\chi \cdot = \bar a . \tag{38.160} \]

Equation (38.160) holds because (with $C = \varepsilon$)

\[ \bar\psi \bar\chi \cdot = \varepsilon \psi^* (\varepsilon \chi^*)^\top \varepsilon = \varepsilon \psi^* (\chi^*)^\top = \varepsilon (\psi \chi^\top \varepsilon)^* \varepsilon^{-1} = \varepsilon a^* \varepsilon^{-1} = \bar a . \tag{38.161} \]

The conjugate of a basis multivector $\gamma_A$ is defined to be

\[ \bar\gamma_A \equiv C \gamma_A^* C^{-1} , \tag{38.162} \]

so that a conjugate multivector $\bar a$ is

\[ \bar a = a^{A*} \bar\gamma_A . \tag{38.163} \]

The conjugate of an orthonormal basis vector $\gamma_a$ is

\[ \bar\gamma_a = \pm \gamma_a , \tag{38.164} \]

where the $\pm$ factor depends on the choice of spinor metric, and is as given in Table 38.2. The conjugates
of spin basis vectors $\gamma_{\pm i}$ defined by equations (38.82) have their index flipped, $\pm i \leftrightarrow \mp i$,

\[ \bar\gamma_{\pm i} = \pm \gamma_{\mp i} , \tag{38.165} \]

where the $\pm$ factor is again as given in Table 38.2.
Real subalgebra. The chiral matrix representations (38.107) of the column and row basis spinors $\epsilon_a$
and $\epsilon_a\cdot$, and (38.109) of their outer products (which yield the full set of basis multivectors in the chiral
representation), are all real. One might therefore contemplate forming a real subalgebra consisting of
spinors $\psi^a \epsilon_a$ and multivectors $a^A \gamma_A$ with real coefficients $\psi^a$ and $a^A$ in the chiral representation. This
does not work however, because spatial rotations transform the basis spinors (and their outer products)
into complex combinations of each other, equations (38.86). Any viable subalgebra must be closed under
rotations.

Orthonormal basis multivectors on the other hand do transform into real linear combinations of each 
other under rotations. A real subalgebra of the complex geometric algebra may be obtained by restricting 
to multivectors satisfying the reality condition that they are their own conjugates, 


ā=a. (38.166) 


Conjugates of orthonormal basis vectors $\gamma_a$ are equal to either plus themselves, or minus themselves,
depending on the choice of spinor metric, equation (38.164). If the conjugates of the orthonormal basis
vectors are themselves, $\bar\gamma_a = \gamma_a$ ($+$ in Table 38.2), then the real subalgebra consists of real linear
combinations of orthonormal basis multivectors. If the conjugates of the orthonormal basis vectors are
minus themselves, $\bar\gamma_a = -\gamma_a$ ($-$ in Table 38.2), then the real subalgebra consists of linear combinations
of odd-grade orthonormal multivectors with pure imaginary coefficients and even-grade orthonormal
multivectors with pure real coefficients.

A real super geometric subalgebra may similarly be obtained by restricting to spinors satisfying the 
reality condition that they are their own conjugates, 


\[ \bar\psi = \psi . \tag{38.167} \]


The spinor reality condition (38.167) is more restrictive than the multivector reality condition (38.166). 
Whereas the multivector reality condition (38.166) can always be imposed, the spinor reality condi- 
tion (38.167) can be imposed only if the double conjugate spinor is itself, equation (38.151), which is to 
say, only if the spinor metric is symmetric, Table 38.1. 

If the self-conjugate condition (38.167) holds, then the relation (38.160) implies that outer products
of self-conjugate spinors $\psi$ and $\chi$ are self-conjugate multivectors,

\[ \bar a = \bar\psi \bar\chi \cdot = \psi \chi \cdot = a . \tag{38.168} \]


Thus the multivector part of the real super geometric subalgebra is the real geometric subalgebra 
corresponding to the reality condition (38.166) for a symmetric choice of spinor metric. 

Transformations that leave the spinor scalar product unchanged. The spinor metric $\varepsilon$, hence
the spinor scalar product, is by definition invariant under rotations, that is, under the rotor group generated
by bivectors of the geometric algebra. However, the geometric algebra contains multivectors of other
grades, that generate other Lie groups of transformations of the algebra, Exercise 13.6. An element $R$ of
a Lie group generated by a set of orthonormal multivectors $\gamma_A$ takes the form $R = \exp(-\tfrac{1}{2} \sum_A \theta_A \gamma_A)$,
where, depending on the choice of group, the coefficients $\theta_A$ could be real or imaginary or complex. The
element $R$ transforms multivectors $a$ by $R: a \rightarrow R a R^{-1}$, the inverse of $R$ being $R^{-1} = \exp(\tfrac{1}{2} \sum_A \theta_A \gamma_A)$.
All such transformations preserve the scalar product of multivectors. However, not all such transformations
preserve the scalar product of spinors.

Let $R = e^{-\theta \gamma_A / 2}$ be a transformation generated by an orthonormal basis multivector $\gamma_A$, with $\theta$ real,
imaginary, or complex. The condition for the spinor metric $\varepsilon$ to be invariant under the transformation
$R$ is that commuting the generator $\gamma_A$ through $\varepsilon$ should convert it to minus its transpose,

\[ \gamma_A^\top \varepsilon = - \varepsilon \gamma_A . \tag{38.169} \]

For then $R^\top \varepsilon = e^{-\theta \gamma_A^\top / 2} \varepsilon = \varepsilon e^{\theta \gamma_A / 2} = \varepsilon R^{-1}$, which implies that a scalar product $\psi \cdot \chi$ of spinors is
invariant under $R$,

\[ (R\psi) \cdot (R\chi) = \psi^\top R^\top \varepsilon R \chi = \psi^\top \varepsilon R^{-1} R \chi = \psi \cdot \chi . \tag{38.170} \]


Comparing the condition (38.169) to the actual commutation rule (38.100) shows that the grades of 
orthonormal multivectors that generate transformations that leave the spinor scalar product unchanged 
are, with the + or — from Table 38.2, 


+: grades (2 or 3) mod 4 (thus 2,3,6,7,...) , (38.171a) 
—: grades (1 or 2) mod 4 (thus 1,2,5,6,...) . (38.171b) 
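The grade lists (38.171) can be recovered mechanically. The sketch below assumes that the sign picked up when a grade-p orthonormal multivector is commuted through the spinor metric is (±)^p (−)^[p/2], the form quoted later in this Exercise for the commutation rule (38.100); the function name and the cutoff `max_grade` are illustrative only.

```python
def metric_invariant_grades(sign, max_grade=8):
    """Grades p whose generators leave the spinor scalar product unchanged.

    `sign` is +1 or -1, the entry taken from Table 38.2.  Invariance
    requires the commutation sign (sign)^p * (-1)^[p/2] to equal -1,
    which is condition (38.169).
    """
    return [p for p in range(max_grade)
            if (sign ** p) * (-1) ** (p // 2) == -1]

plus_grades = metric_invariant_grades(+1)    # [2, 3, 6, 7]
minus_grades = metric_invariant_grades(-1)   # [1, 2, 5, 6]
print(plus_grades, minus_grades)
```

The two lists reproduce the patterns (2 or 3) mod 4 and (1 or 2) mod 4 respectively.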


For tilde'd spinor metrics, equation (38.100) holds, hence the list (38.171) holds, for multivectors $\gamma_A$
that do not include a factor of whichever is the scalar dimension, $\gamma_N$ or $\gamma_{N+1}$. If the multivector $\gamma_A$
includes a factor of the scalar dimension $\gamma_N$ or $\gamma_{N+1}$, then equation (38.100) holds with an extra minus
sign (the grade $p$ being that of $\gamma_A$ including the scalar dimension), and the grades of generators that
leave the spinor scalar product unchanged are the complement of those in the list (38.171).

If the scalar product is between spinors and conjugate spinors, then whether a transformation
$R = e^{-\theta \gamma_A / 2}$ generated by a grade-$p$ multivector $\gamma_A$ preserves the spinor scalar product depends on whether
the coefficient $\theta$ is real or imaginary. A conjugate spinor $\bar\psi$ transforms under $R$ as

\[ R: \ \bar\psi = C\psi^* \rightarrow C(R\psi)^* . \tag{38.172} \]

The commutation rule (38.144) for rotors is replaced by

\[ C R^* = C e^{-\theta^* \gamma_A^* / 2} = e^{-\theta^* \bar\gamma_A / 2} C = e^{-(\pm)^p \theta^* \gamma_A / 2} C , \tag{38.173} \]

where the $\pm$ sign in $(\pm)^p$, from equation (38.164), is as given in Table 38.2. For rotors, which are
generated by real linear combinations of bivectors, the grade $p$ is 2, and $\theta$ is real, and equation (38.173)
recovers the commutation rule (38.144). The scalar product $\bar\psi \cdot \chi$ of a conjugate spinor with a spinor
transforms under $R$ to

\[ (C(R\psi)^*) \cdot (R\chi) = \bar\psi^\top e^{-(\pm)^p \theta^* \gamma_A^\top / 2} \varepsilon R \chi = \bar\psi^\top \varepsilon \, e^{-(-)^{[p/2]} \theta^* \gamma_A / 2} R \chi , \tag{38.174} \]

where the sign $(-)^{[p/2]}$ in the final expression is the product of $(\pm)^p$ and the sign $(\pm)^p (-)^{[p/2]}$ in the
commutation rule (38.100) of a multivector $\gamma_A$ through the spinor metric $\varepsilon$. The spinor product is
preserved provided that $e^{-(-)^{[p/2]} \theta^* \gamma_A / 2} = R^{-1}$, which is to say provided that

\[ - (-)^{[p/2]} \theta^* = \theta . \tag{38.175} \]

Therefore the scalar product of spinors and conjugate spinors is preserved under transformations generated
by multivectors of grade $p$ provided that the coefficient $\theta$ satisfies

$\theta$ real: grades (2 or 3) mod 4 (thus 2,3,6,7,...) , (38.176a)
$\theta$ imaginary: grades (0 or 1) mod 4 (thus 0,1,4,5,...) . (38.176b)


Rotor group. Unimodular elements of the (even or odd) geometric algebra generated by the $N(N-1)/2$
orthonormal bivectors $\gamma_a \wedge \gamma_b$ form a group, the rotor group, also called the spin group, or Spin($N$).
The rotor group Spin($N$) comprises all distinct rotations of spinors (spin-$\tfrac{1}{2}$ objects) in $N$ dimensions,
and is the double cover of the special orthogonal group SO($N$), which comprises all distinct rotations
of vectors (spin-1 objects) in $N$ dimensions.

As noted in part 11 of this Exercise, the chiral representation represents orthonormal basis bivectors
$\gamma_a \wedge \gamma_b$ in even $N$ dimensions by traceless, skew-Hermitian, unitary $2^{[N/2]} \times 2^{[N/2]}$ matrices. The rotor
group generated by the basis bivectors is then represented by unitary $2^{[N/2]} \times 2^{[N/2]}$ matrices. Thus
the rotor group in even $N$ dimensions is a subgroup of SU($2^{[N/2]}$), the special unitary group in $2^{[N/2]}$
dimensions,

\[ {\rm Spin}(N) \subset {\rm SU}(2^{[N/2]}) . \tag{38.177} \]

The embedding (38.177) holds also if $N$ is odd, since as described in part 7, in odd $N$ dimensions the
$N$-dimensional chiral operator $\varkappa_N$ can be identified with unity, equation (38.125), in which case the final
odd vector $\gamma_N$ is equivalent to the $(N-1)$-dimensional chiral operator $\varkappa_{N-1}$, and bivectors $\gamma_a \wedge \gamma_b$ are
again represented by traceless, skew-Hermitian, unitary $2^{[N/2]} \times 2^{[N/2]}$ matrices.

The generators of the unitary group are skew-Hermitian. The orthonormal basis multivectors of the
geometric algebra are either skew-Hermitian (grades $p$ = (2 or 3) mod 4) or Hermitian (grades $p$ =
(0 or 1) mod 4). Multiplying a Hermitian generator by $i$ makes it skew-Hermitian. The set of $2^N$
orthonormal basis multivectors in even $N$ dimensions, with Hermitian multivectors multiplied by $i$,
generates the full unitary group U($2^{[N/2]}$). This is the group denoted $G^{23i01}(N)$ by Shirokov (2017),

\[ G^{23i01}(N) = {\rm U}(2^{[N/2]}) . \tag{38.178} \]

If the generator consisting of $i$ times the unit matrix is excised, the result is the special unitary group
SU($2^{[N/2]}$).

Grade-preserving subgroup of Spin(2N). The rotor group Spin($2N$) contains a subgroup that preserves
the spinor grade, the number of up bits, of a spinor (Atiyah, Bott, and Shapiro, 1964). The
subgroup is isomorphic to U($N$), so that

\[ {\rm SU}(N) \subset {\rm U}(N) \subset {\rm Spin}(2N) . \tag{38.179} \]

The generators of Spin($2N$) that preserve spinor grade are bivectors with zero total spin. These generators
must be real linear combinations of orthonormal Spin($2N$) bivectors that, when expressed in
terms of spin vectors $\gamma_{\pm i}$, are (complex) linear combinations of bivectors of the form $\gamma_{+i} \wedge \gamma_{-j}$. Such a
bivector flips the $i$'th bit of a spinor from down to up, and the $j$'th bit from up to down, preserving the
total number of up bits of the spinor. Linearly independent generators satisfying these conditions are

\begin{align}
\gamma_{2i-1} \wedge \gamma_{2j-1} + \gamma_{2i} \wedge \gamma_{2j} &= \gamma_{+i} \wedge \gamma_{-j} - \gamma_{+j} \wedge \gamma_{-i} \quad (N(N-1)/2 \ \mbox{generators}) , \tag{38.180a} \\
\gamma_{2i-1} \wedge \gamma_{2j} - \gamma_{2i} \wedge \gamma_{2j-1} &= i \, ( \gamma_{+i} \wedge \gamma_{-j} + \gamma_{+j} \wedge \gamma_{-i} ) \quad (N(N-1)/2 \ \mbox{generators}) , \tag{38.180b} \\
\gamma_{2i-1} \wedge \gamma_{2i} &= i \, \gamma_{+i} \wedge \gamma_{-i} \quad (N \ \mbox{generators}) , \tag{38.180c}
\end{align}

a total of $N^2$ generators. The Lie algebra of commutators of the generators (38.180) coincides with the
Lie algebra of commutators in which $\tfrac{1}{2} \gamma_{+i} \wedge \gamma_{-j}$ is represented by the $N \times N$ matrix $1_{ij}$ with 1 in the
$ij$'th entry and 0 elsewhere,

\[ \tfrac{1}{2} \gamma_{+i} \wedge \gamma_{-j} \leftrightarrow 1_{ij} . \tag{38.181} \]

But that algebra is just that of the group U($N$) of unitary $N \times N$ matrices. The generator $\tfrac{i}{2} \sum_i \gamma_{+i} \wedge \gamma_{-i}$
is represented by $i$ times the unit matrix, which generates a rotation by an overall phase. Eliminating
that generator yields the algebra of the group SU($N$) of special unitary $N \times N$ matrices. Thus U($N$)
and SU($N$) are subgroups of Spin($2N$) as claimed. The chain (38.179) of subgroups extends (trivially)
to

\[ {\rm SU}(N) \subset {\rm U}(N) \subset {\rm Spin}(2N) \subset {\rm Spin}(2N+1) . \tag{38.182} \]
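The claim that the grade-preserving bivectors close into the algebra of matrix units can be spot-checked numerically for N = 2. The sketch below is only illustrative: the 4×4 representation of the Euclidean Clifford algebra built from Kronecker products of Pauli matrices is an assumption (any faithful representation would do), the wedge product of two vectors is taken as half their commutator, and the factor ½ in the map to matrix units is the normalization for which the commutators match.

```python
import itertools
import numpy as np

# Pauli matrices and one (assumed) representation of the Euclidean
# Clifford algebra in 2N = 4 dimensions, i.e. N = 2.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)
gamma = [np.kron(s1, one), np.kron(s2, one), np.kron(s3, s1), np.kron(s3, s2)]

def wedge(a, b):
    """Wedge product of two vectors, a ^ b = (ab - ba)/2."""
    return (a @ b - b @ a) / 2

def commutator(a, b):
    return a @ b - b @ a

N = 2
# Spin vectors gamma_{+i}, gamma_{-i} built from gamma_{2i-1}, gamma_{2i}.
gp = [(gamma[2 * i] + 1j * gamma[2 * i + 1]) / np.sqrt(2) for i in range(N)]
gm = [(gamma[2 * i] - 1j * gamma[2 * i + 1]) / np.sqrt(2) for i in range(N)]

# Candidate images of the matrix units 1_ij.
E = {(i, j): wedge(gp[i], gm[j]) / 2 for i in range(N) for j in range(N)}

def unit_algebra_holds():
    """Check [E_ij, E_kl] = delta_jk E_il - delta_li E_kj for all indices."""
    for i, j, k, l in itertools.product(range(N), repeat=4):
        rhs = (j == k) * E[i, l] - (l == i) * E[k, j]
        if not np.allclose(commutator(E[i, j], E[k, l]), rhs):
            return False
    return True

algebra_ok = unit_algebra_holds()
# Equation (38.180c): gamma_{2i-1} ^ gamma_{2i} = i gamma_{+i} ^ gamma_{-i}.
diag_ok = all(np.allclose(wedge(gamma[2 * i], gamma[2 * i + 1]),
                          1j * wedge(gp[i], gm[i])) for i in range(N))
print(algebra_ok, diag_ok)
```

The commutation relation tested is the one obeyed by N × N matrices with a single unit entry, which is the Lie algebra of U(N) once real combinations are taken.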


39 


Super spacetime algebra 


This Chapter presents the super spacetime algebra, the generalization of the 4-dimensional spacetime 
algebra to include spinors. 


39.1 Newman-Penrose formalism 


The extension of the spacetime algebra to spinors is most direct when the basis vectors of the spacetime
algebra are expressed in a Newman-Penrose basis (Newman and Penrose, 1962). Newman-Penrose adopts a
tetrad in which two of the tetrad axes are lightlike, $\gamma_v$ (outgoing) and $\gamma_u$ (ingoing), while the remaining two
axes $\gamma_+$ and $\gamma_-$ are spin axes.


39.1.1 Newman-Penrose tetrad 


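A Newman-Penrose tetrad $\{\gamma_v, \gamma_u, \gamma_+, \gamma_-\}$ is defined in terms of an orthonormal tetrad $\{\gamma_0, \gamma_1, \gamma_2, \gamma_3\}$ (or
$\{\gamma_t, \gamma_x, \gamma_y, \gamma_z\}$ if you prefer), by

\begin{align}
\gamma_v &\equiv \tfrac{1}{\sqrt{2}} (\gamma_0 + \gamma_3) , \tag{39.1a} \\
\gamma_u &\equiv \tfrac{1}{\sqrt{2}} (\gamma_0 - \gamma_3) , \tag{39.1b} \\
\gamma_+ &\equiv \tfrac{1}{\sqrt{2}} (\gamma_1 + i \gamma_2) , \tag{39.1c} \\
\gamma_- &\equiv \tfrac{1}{\sqrt{2}} (\gamma_1 - i \gamma_2) , \tag{39.1d}
\end{align}

or in matrix form

\[ \begin{pmatrix} \gamma_v \\ \gamma_u \\ \gamma_+ \\ \gamma_- \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & i & 0 \\ 0 & 1 & -i & 0 \end{pmatrix} \begin{pmatrix} \gamma_0 \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{pmatrix} . \tag{39.2} \]

All four tetrad axes are null,

\[ \gamma_v \cdot \gamma_v = \gamma_u \cdot \gamma_u = \gamma_+ \cdot \gamma_+ = \gamma_- \cdot \gamma_- = 0 . \tag{39.3} \]

The tetrad metric of the Newman-Penrose tetrad $\{\gamma_v, \gamma_u, \gamma_+, \gamma_-\}$ is

\[ \gamma_{mn} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} . \tag{39.4} \]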
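The Newman-Penrose metric follows from the Minkowski metric and the change of basis (39.2). The short check below is a sketch with illustrative variable names; it uses the signature −+++ stated in the Notation, and the fact that the scalar product of basis vectors is bilinear, so no complex conjugate is taken.

```python
import numpy as np

# Rows express {gamma_v, gamma_u, gamma_+, gamma_-} in the orthonormal
# basis {gamma_0, gamma_1, gamma_2, gamma_3}, as in equations (39.1).
M = np.array([[1, 0, 0, 1],
              [1, 0, 0, -1],
              [0, 1, 1j, 0],
              [0, 1, -1j, 0]]) / np.sqrt(2)

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, signature -+++

# The scalar product is bilinear (no complex conjugation), so the
# Newman-Penrose metric is M eta M^T rather than M eta M^dagger.
np_metric = M @ eta @ M.T
print(np.round(np_metric.real, 12))
```

The diagonal of the result vanishes, which is the statement that all four axes are null, and the only non-zero entries pair v with u and + with −.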
39.1.2 Boost and spin weight 
An object is defined to have boost weight $n$ if it varies by

\[ e^{n\theta} \tag{39.5} \]

under a boost by rapidity $\theta$ along the positive 3-direction.

Under a boost by rapidity $\theta$ in the 3-direction, the basis vectors $\gamma_m$ transform as (14.44)

\begin{align}
\gamma_0 &\rightarrow \gamma_0 \cosh\theta + \gamma_3 \sinh\theta , \tag{39.6a} \\
\gamma_3 &\rightarrow \gamma_3 \cosh\theta + \gamma_0 \sinh\theta , \tag{39.6b} \\
\gamma_a &\rightarrow \gamma_a \quad (a = 1, 2) . \tag{39.6c}
\end{align}

It follows that a boost by rapidity $\theta$ in the 3-direction multiplies the outgoing and ingoing axes $\gamma_v$ and $\gamma_u$
by a blueshift factor $e^{\theta}$ and its reciprocal,

\[ \gamma_v \rightarrow e^{\theta} \gamma_v , \quad \gamma_u \rightarrow e^{-\theta} \gamma_u . \tag{39.7} \]

In terms of the boost velocity $v = \tanh\theta$ (not to be confused with the Newman-Penrose index $v$), the
blueshift factor is the special relativistic Doppler shift factor

\[ e^{\theta} = \left( \frac{1 + v}{1 - v} \right)^{1/2} . \tag{39.8} \]

Thus $\gamma_v$ has boost weight $+1$, and $\gamma_u$ has boost weight $-1$. The spin axes $\gamma_\pm$ both have boost weight 0. The
Newman-Penrose components of a tensor inherit their boost weight properties from those of the Newman-Penrose
basis. The general rule is that the boost weight $n$ of any tensor component is equal to the number
of $v$ covariant indices minus the number of $u$ covariant indices:

boost weight $n$ = number of $v$ minus $u$ covariant indices . (39.9)
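As a numerical illustration of the boost weights of the Newman-Penrose axes, the sketch below applies the boost (39.6) to the orthonormal tetrad, re-expresses it in the Newman-Penrose basis (39.2), and compares the blueshift factor with the Doppler formula (39.8). The rapidity value and variable names are arbitrary choices for the example.

```python
import numpy as np

theta = 0.7  # rapidity of the boost (arbitrary illustrative value)

# Boost along the 3-direction acting on {gamma_0, ..., gamma_3}, eqs (39.6):
# row m gives the transformed gamma_m in terms of the original basis.
L = np.eye(4)
L[0, 0] = L[3, 3] = np.cosh(theta)
L[0, 3] = L[3, 0] = np.sinh(theta)

# Newman-Penrose tetrad in terms of the orthonormal tetrad, eqs (39.1).
M = np.array([[1, 0, 0, 1],
              [1, 0, 0, -1],
              [0, 1, 1j, 0],
              [0, 1, -1j, 0]]) / np.sqrt(2)

# The same boost expressed in the Newman-Penrose basis.
L_np = M @ L @ np.linalg.inv(M)

# Each axis is multiplied by exp(n theta); read off the boost weight n.
boost_weights = np.log(np.diag(L_np).real) / theta

v = np.tanh(theta)
doppler = np.sqrt((1 + v) / (1 - v))
print(np.round(boost_weights, 12), doppler, np.exp(theta))
```

In the Newman-Penrose basis the boost is diagonal, which is why boost weight is a well-defined label for each axis.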


The operation of boosting along the 3-axis, which is the same as a rotation in the $\gamma_0$-$\gamma_3$ plane, commutes
with the operation of rotating in the $\gamma_1$-$\gamma_2$ plane. The concept of spin weight presented in §38.2 holds
unchanged. The outgoing and ingoing basis vectors $\gamma_v$ and $\gamma_u$ have spin weight zero, while $\gamma_+$ and $\gamma_-$ have
spin weight $+1$ and $-1$. The general rule is that the spin weight $s$ of any tensor component equals the number
of $+$ covariant indices minus the number of $-$ covariant indices (this repeats rule (38.14)):

spin weight $s$ = number of $+$ minus $-$ covariant indices . (39.10)


The boost and spin properties of the components of a tensor are thus manifest in a Newman-Penrose 
tetrad. 


39.2 Chiral representation of $\gamma$-matrices


The chiral representation of the Dirac $\gamma$-matrices provides the natural extension of the Newman-Penrose
tetrad to spin-$\tfrac{1}{2}$ particles. The chiral representation may be obtained from the Dirac representation (14.102)
by the transformation (Dirac → chiral)

\[ X: \ \gamma_m \rightarrow X \gamma_m X^{-1} , \tag{39.11} \]

where $X$ is the symmetric ($X = X^\top$), unitary ($X^{-1} = X^\dagger$) matrix

\[ X = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & i & 0 \\ 0 & 1 & 0 & i \\ i & 0 & 1 & 0 \\ 0 & i & 0 & 1 \end{pmatrix} . \tag{39.12} \]


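As in the Dirac representation, all the $\gamma$-matrices in the chiral representation are traceless; the only basis
matrix of the algebra with finite trace is the unit matrix. The $\gamma$-matrices in the chiral representation are the
unitary matrices

\[ \gamma_0 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} , \quad \gamma_a = \begin{pmatrix} 0 & \sigma_a \\ \sigma_a & 0 \end{pmatrix} . \tag{39.13} \]

The bivectors $\sigma_a$ and $I\sigma_a$, and the pseudoscalar $I$ are

\[ \sigma_a \equiv \gamma_0 \gamma_a = \begin{pmatrix} \sigma_a & 0 \\ 0 & -\sigma_a \end{pmatrix} , \quad I\sigma_a = i \begin{pmatrix} \sigma_a & 0 \\ 0 & \sigma_a \end{pmatrix} , \quad I = i \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{39.14} \]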


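The Newman-Penrose basis vectors in the chiral representation are the real matrices

\[ \gamma_v = \begin{pmatrix} 0 & \sigma_v \\ -\sigma_u & 0 \end{pmatrix} , \quad \gamma_u = \begin{pmatrix} 0 & \sigma_u \\ -\sigma_v & 0 \end{pmatrix} , \quad \gamma_\pm = \begin{pmatrix} 0 & \sigma_\pm \\ \sigma_\pm & 0 \end{pmatrix} , \tag{39.15} \]

where $\sigma_m$ are the Newman-Penrose Pauli matrices

\begin{align}
\sigma_v &\equiv \tfrac{1}{\sqrt{2}} (1 + \sigma_3) = \sqrt{2} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} , \quad
\sigma_u \equiv \tfrac{1}{\sqrt{2}} (1 - \sigma_3) = \sqrt{2} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} , \tag{39.16a} \\
\sigma_+ &\equiv \tfrac{1}{\sqrt{2}} (\sigma_1 + i \sigma_2) = \sqrt{2} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \quad
\sigma_- \equiv \tfrac{1}{\sqrt{2}} (\sigma_1 - i \sigma_2) = \sqrt{2} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} . \tag{39.16b}
\end{align}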
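The Newman-Penrose bivectors form 6 real matrices that group into three right-handed bivectors (notation
$\gamma_{mn} \equiv \gamma_m \wedge \gamma_n$),

\[ \gamma_{v+} = \sqrt{2} \begin{pmatrix} \sigma_+ & 0 \\ 0 & 0 \end{pmatrix} , \quad \tfrac{1}{2} (\gamma_{vu} - \gamma_{+-}) = \begin{pmatrix} -\sigma_3 & 0 \\ 0 & 0 \end{pmatrix} , \quad \gamma_{u-} = \sqrt{2} \begin{pmatrix} \sigma_- & 0 \\ 0 & 0 \end{pmatrix} , \tag{39.17} \]

and three left-handed bivectors,

\[ \gamma_{u+} = - \sqrt{2} \begin{pmatrix} 0 & 0 \\ 0 & \sigma_+ \end{pmatrix} , \quad \tfrac{1}{2} (\gamma_{vu} + \gamma_{+-}) = \begin{pmatrix} 0 & 0 \\ 0 & \sigma_3 \end{pmatrix} , \quad \gamma_{v-} = - \sqrt{2} \begin{pmatrix} 0 & 0 \\ 0 & \sigma_- \end{pmatrix} . \tag{39.18} \]

The chiral matrix $\gamma_5$ is

\[ \gamma_5 \equiv -i I = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{39.19} \]

By construction, the chiral matrix $\gamma_5$ is diagonal in the chiral representation.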
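The properties claimed for the chiral representation can be checked with a few lines of linear algebra. The sketch below builds the chiral γ-matrices of (39.13) and verifies the Clifford relations for signature −+++. Two ingredients are assumptions, not taken from this section: the Dirac representation is assumed to be γ_0 = i diag(1, −1) with the same off-diagonal γ_a, and X is taken to be (1/√2)(1 + i τ_1), one symmetric unitary matrix that maps that Dirac representation to the chiral one.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return np.block([[a, b], [c, d]])

# Chiral representation, equation (39.13).
gamma_chiral = [blocks(zero, one, -one, zero)] + \
               [blocks(zero, s, s, zero) for s in pauli]

# ASSUMPTION: Dirac representation with gamma_0 = i diag(1, -1).
gamma_dirac = [1j * blocks(one, zero, zero, -one)] + \
              [blocks(zero, s, s, zero) for s in pauli]

# ASSUMPTION: candidate for the matrix X of equation (39.12).
X = blocks(one, 1j * one, 1j * one, one) / np.sqrt(2)
X_inv = np.linalg.inv(X)

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def satisfies_clifford(gamma):
    """True if {gamma_m, gamma_n} = 2 eta_mn for all m, n."""
    return all(
        np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m],
                    2 * eta[m, n] * np.eye(4))
        for m in range(4) for n in range(4))

clifford_ok = satisfies_clifford(gamma_chiral)
x_symmetric_unitary = (np.allclose(X, X.T)
                       and np.allclose(X_inv, X.conj().T))
dirac_to_chiral_ok = all(np.allclose(X @ gd @ X_inv, gc)
                         for gd, gc in zip(gamma_dirac, gamma_chiral))

pseudoscalar = (gamma_chiral[0] @ gamma_chiral[1]
                @ gamma_chiral[2] @ gamma_chiral[3])
gamma5 = -1j * pseudoscalar   # chiral matrix, taken here as -i I
print(clifford_ok, x_symmetric_unitary, dirac_to_chiral_ok)
print(np.round(gamma5.real, 12))
```

With these assumptions γ_5 comes out as diag(1, 1, −1, −1), so the first two components of a column spinor are right-handed and the last two left-handed.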


39.3 Basis spinors 
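Introduce a tetrad of basis spinors $\epsilon_a$,

\[ \epsilon_a \equiv \{ \epsilon_{V\uparrow} , \epsilon_{U\downarrow} , \epsilon_{U\uparrow} , \epsilon_{V\downarrow} \} . \tag{39.20} \]

The indices $\{V\uparrow, U\downarrow, U\uparrow, V\downarrow\}$ signify the transformation properties of the basis spinors: $V$ and $U$ signify boost
weight $+\tfrac{1}{2}$ and $-\tfrac{1}{2}$, while $\uparrow$ and $\downarrow$ signify spin weight $+\tfrac{1}{2}$ and $-\tfrac{1}{2}$. The index notation, while non-standard,
fits naturally with the Newman-Penrose $\{v, u, +, -\}$ index notation. Under a Lorentz transformation, the
basis spinors $\epsilon_a$ are defined to transform in the same way as rotors,

\[ R: \ \epsilon_a \rightarrow R \epsilon_a . \tag{39.21} \]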


In the chiral representation (39.13) the basis spinors $\epsilon_a$ are the column spinors

\[ \epsilon_{V\uparrow} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad \epsilon_{U\downarrow} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad \epsilon_{U\uparrow} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad \epsilon_{V\downarrow} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} , \tag{39.22} \]

which are Lorentz-transformed by pre-multiplying by rotors expressed in the chiral representation. The
basis spinors $\epsilon_a$ in the chiral representation may be obtained from those in the Dirac representation by the
transformation (Dirac → chiral)

\[ X: \ \epsilon_a \rightarrow X \epsilon_a , \tag{39.23} \]

where the matrix $X$ is defined by equation (39.12).

The basis spinors $\epsilon_a$ are eigenvectors of the chirality operator $\gamma_5$, equation (39.19), with eigenvalues $\pm 1$.
Positive chirality spinors are called right-handed, while negative chirality spinors are called left-handed. The
first two basis spinors are right-handed, while the last two are left-handed,

\[ \gamma_5 \epsilon_{V\uparrow} = \epsilon_{V\uparrow} , \quad \gamma_5 \epsilon_{U\downarrow} = \epsilon_{U\downarrow} , \quad \gamma_5 \epsilon_{U\uparrow} = - \epsilon_{U\uparrow} , \quad \gamma_5 \epsilon_{V\downarrow} = - \epsilon_{V\downarrow} . \tag{39.24} \]

Lorentz transformation preserves chirality, as is evident from the block-diagonal form of the even elements
of the spacetime algebra in the chiral representation, equations (39.14). The right-handed basis spinors $\epsilon_{V\uparrow}$
and $\epsilon_{U\downarrow}$ are called right-handed because the boost axis and the spin axis point in the same direction (along
the 3-direction for $\epsilon_{V\uparrow}$, and along the negative 3-direction for $\epsilon_{U\downarrow}$). Conversely, the left-handed basis spinors
$\epsilon_{U\uparrow}$ and $\epsilon_{V\downarrow}$ are called left-handed because the boost axis and the spin axis point in opposite directions.

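A Lorentz boost $R = e^{\sigma_3 \theta / 2} = \cosh(\theta/2) + \sigma_3 \sinh(\theta/2)$ by rapidity $\theta$ along the spin axis (3-axis) multiplies
the basis spinors $\epsilon_a$ by $e^{\pm\theta/2}$ according to

\[ \epsilon_{V\uparrow} \rightarrow e^{\theta/2} \epsilon_{V\uparrow} , \quad \epsilon_{U\downarrow} \rightarrow e^{-\theta/2} \epsilon_{U\downarrow} , \quad \epsilon_{U\uparrow} \rightarrow e^{-\theta/2} \epsilon_{U\uparrow} , \quad \epsilon_{V\downarrow} \rightarrow e^{\theta/2} \epsilon_{V\downarrow} . \tag{39.25} \]

The transformations (39.25) confirm that the basis spinors with a $V$ index have boost weight $+\tfrac{1}{2}$, while
the basis spinors with a $U$ index have boost weight $-\tfrac{1}{2}$. A right-handed spatial rotation $R = e^{-I\sigma_3 \theta / 2} =
\cos(\theta/2) - I\sigma_3 \sin(\theta/2)$ by rotation angle $\theta$ about the spin axis (3-axis) multiplies the basis spinors $\epsilon_a$ by
$e^{\mp i\theta/2}$ according to

\[ \epsilon_{V\uparrow} \rightarrow e^{-i\theta/2} \epsilon_{V\uparrow} , \quad \epsilon_{U\downarrow} \rightarrow e^{i\theta/2} \epsilon_{U\downarrow} , \quad \epsilon_{U\uparrow} \rightarrow e^{-i\theta/2} \epsilon_{U\uparrow} , \quad \epsilon_{V\downarrow} \rightarrow e^{i\theta/2} \epsilon_{V\downarrow} . \tag{39.26} \]

The transformations (39.26) confirm that the basis spinors with a $\uparrow$ index have spin weight $+\tfrac{1}{2}$, while the
basis spinors with a $\downarrow$ index have spin weight $-\tfrac{1}{2}$. This justifies the choice of indices on the basis spinors.

Spinor tensors inherit their boost and spin weights from those of the basis spinors. The rules are

boost weight $n$ = $\tfrac{1}{2}$ (number of $V$ minus $U$ covariant indices) , (39.27a)
spin weight $s$ = $\tfrac{1}{2}$ (number of $\uparrow$ minus $\downarrow$ covariant indices) , (39.27b)

which generalize the rules (39.9) and (39.10). The rules (39.27) hold not only for column spinors $\epsilon_a$, but also
for row spinors $\epsilon_a\cdot$, §39.5.2, and for inner and outer products of spinors, §39.5.3 and §39.6.1.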
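The boost and spin weights of the four basis spinors can be read off numerically. The sketch below builds the boost and rotation rotors about the 3-axis in the chiral representation and extracts the weights from the factors they apply to each basis spinor. It assumes the convention that an object of spin weight s varies by exp(−i s θ) under a right-handed rotation by angle θ, and that σ_3 = γ_0 γ_3; the labels such as `V_up` are illustrative stand-ins for the indices V↑, U↓, U↑, V↓.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# Chiral representation of the orthonormal gamma-matrices.
g0 = np.block([[zero, one], [-one, zero]])
g1, g2, g3 = (np.block([[zero, s], [s, zero]]) for s in (s1, s2, s3))

sigma3 = g0 @ g3                          # boost bivector along the 3-axis
I_sigma3 = (g0 @ g1 @ g2 @ g3) @ sigma3   # rotation bivector about the 3-axis

theta = 0.6
identity = np.eye(4)
R_boost = np.cosh(theta / 2) * identity + np.sinh(theta / 2) * sigma3
R_rot = np.cos(theta / 2) * identity - np.sin(theta / 2) * I_sigma3

names = ["V_up", "U_down", "U_up", "V_down"]
weights = {}
for k, name in enumerate(names):
    e = identity[:, k]                     # k'th column basis spinor
    boost_factor = (R_boost @ e)[k]        # factor exp(n theta)
    spin_factor = (R_rot @ e)[k]           # factor exp(-i s theta)
    n = np.log(boost_factor.real) / theta
    s = -np.angle(spin_factor) / theta
    weights[name] = (round(float(n), 6), round(float(s), 6))

print(weights)
```

Both rotors are diagonal in this representation, so each basis spinor is simply rescaled and the weight is the exponent of that factor.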


39.4 Dirac and Weyl spinors 


A Dirac spinor $\psi$ is a complex (with respect to $i$) linear combination of the 4 basis spinors $\epsilon_a$,

\[ \psi = \psi^a \epsilon_a . \tag{39.28} \]

A Dirac spinor has 4 complex components, making 8 degrees of freedom in all. Just as a multivector $a^m \gamma_m$
is a vector in the spacetime algebra, so also $\psi^a \epsilon_a$ is a spinor in the super spacetime algebra.

A Dirac spinor $\psi$ Lorentz transforms as

\[ R: \ \psi \rightarrow R \psi . \tag{39.29} \]

A Dirac spinor $\psi$ is a spin-$\tfrac{1}{2}$ object, in the sense that a rotation by $2\pi$ changes the sign of the spinor, and a
rotation by $4\pi$ is required to return the spinor to its original value.


Concept question 39.1. Lorentz transformation of the phase of a spinor. Should not a Lorentz
transformation also change the phase of a spinor $\psi$ as a function of position? For example, if the phase is
$\psi \sim e^{-imt}$ in the spinor rest frame, would not the phase be $\psi \sim e^{-i\omega t + i \boldsymbol{k} \cdot \boldsymbol{x}}$ in the Lorentz-transformed frame?

Answer. No. A Lorentz transformation is a tetrad transformation, not a coordinate transformation. That
being said, in flat (Minkowski) space it is possible to choose inertial coordinates $\{t, \boldsymbol{x}\}$ aligned everywhere
with the tetrad frame. It is true that $\psi \sim e^{-i\omega t + i \boldsymbol{k} \cdot \boldsymbol{x}}$ with respect to Lorentz-transformed inertial coordinates.


39.4.1 Weyl decomposition of a Dirac spinor 


A Dirac spinor $\psi$ can be decomposed into a sum of right- and left-handed chiral Weyl spinors $\psi_R$ and $\psi_L$,

\[ \psi = \psi_R + \psi_L , \tag{39.30} \]

that are right- and left-handed eigenvectors of the chiral operator $\gamma_5$,

\[ \gamma_5 \psi_{\substack{R \\ L}} = \pm \psi_{\substack{R \\ L}} . \tag{39.31} \]

The right- and left-handed chiral spinors can be projected out by applying the chiral projection operators
$\tfrac{1}{2} (1 \pm \gamma_5)$ (which are projection operators because their squares are themselves) to the Dirac spinor $\psi$,

\[ \psi_{\substack{R \\ L}} = \tfrac{1}{2} (1 \pm \gamma_5) \psi . \tag{39.32} \]


Since chirality is Lorentz invariant, the chiral decomposition of a Dirac spinor is unique. The right- and 
left-handed components of a Dirac spinor each contain 2 complex components. The right- and left-handed 
components of a Dirac spinor cannot be rotated into each other by any Lorentz transformation. 
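A minimal numerical sketch of the Weyl decomposition follows. It assumes the chiral representation in which γ_5 = diag(1, 1, −1, −1), uses an arbitrary spinor with made-up components, and takes a boost along the 1-axis, with the block-diagonal bivector diag(σ_1, −σ_1), as a sample Lorentz rotor.

```python
import numpy as np

gamma5 = np.diag([1.0, 1.0, -1.0, -1.0]).astype(complex)
identity = np.eye(4, dtype=complex)
P_R = (identity + gamma5) / 2   # right-handed projector
P_L = (identity - gamma5) / 2   # left-handed projector

psi = np.array([1 + 2j, 0.5, -1j, 3.0])   # arbitrary Dirac spinor components
psi_R = P_R @ psi
psi_L = P_L @ psi

# A sample even element: boost rotor along the 1-axis, block diagonal
# in the chiral representation (sigma_1 -> diag(pauli_1, -pauli_1)).
pauli1 = np.array([[0, 1], [1, 0]], dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
sigma1 = np.block([[pauli1, zero], [zero, -pauli1]])
theta = 0.8
R = np.cosh(theta / 2) * identity + np.sinh(theta / 2) * sigma1

checks = {
    "idempotent": np.allclose(P_R @ P_R, P_R) and np.allclose(P_L @ P_L, P_L),
    "orthogonal": np.allclose(P_R @ P_L, 0),
    "complete": np.allclose(psi_R + psi_L, psi),
    "eigenvectors": (np.allclose(gamma5 @ psi_R, psi_R)
                     and np.allclose(gamma5 @ psi_L, -psi_L)),
    "chirality_preserved": np.allclose(P_R @ (R @ psi), R @ psi_R),
}
print(checks)
```

The last check illustrates why the decomposition is Lorentz invariant: projecting after the transformation gives the same result as transforming the projected spinor.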


39.5 Spinor scalar product 


39.5.1 Spinor metric tensor 


In a matrix representation, the tensor product of Dirac basis spinors $\epsilon_a$ and $\epsilon_b$ can be represented as the
matrix $\epsilon_a \epsilon_b^\top$, a matrix product of the column spinor $\epsilon_a$ with the row spinor $\epsilon_b^\top$. In accordance with the
transformation rule (39.21), the tensor product of basis spinors Lorentz transforms as

\[ R: \ \epsilon_a \epsilon_b^\top \rightarrow R \epsilon_a \epsilon_b^\top R^\top . \tag{39.33} \]

Consider the spinor metric tensor $\varepsilon$ with the defining property that for any Lorentz rotor $R$

\[ R^\top \varepsilon = \varepsilon \overline{R} . \tag{39.34} \]

That $\varepsilon$ defines a Lorentz-invariant spinor metric will be seen in §39.5.3. A Lorentz rotor is a real (with
respect to $i$) linear combination of even elements 1, $I$, $\sigma_a$, and $I\sigma_a$ of the spacetime algebra. Consequently,
in the Dirac representation (14.103) a necessary and sufficient condition for (39.34) to hold is that $\varepsilon$ anticommutes
with $I$ and $\sigma_2$, and commutes with $\sigma_1$ and $\sigma_3$. This requires that $\varepsilon$ be proportional to $\gamma_2$ in the
Dirac representation, with a proportionality factor that could be some arbitrary complex (with respect to
$i$ and/or $I$) number. To be consistent with standard Dirac theory, the spinor metric tensor $\varepsilon$ in the Dirac
representation (14.103) is taken to be the real unitary matrix

\[ \varepsilon \equiv \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} . \tag{39.35} \]

Despite the equality of $\varepsilon$ and $i\gamma_2$ in the Dirac representation, $\varepsilon$ is defined to transform as a spinor tensor
under Lorentz transformations, not as an element of the spacetime algebra. The spinor metric (39.35) in the
Dirac representation translates into the chiral representation (39.13) as $\varepsilon_{\rm chiral} = X^{-1\top} \varepsilon_{\rm Dirac} X^{-1} = -i I\sigma_2$.
However, the resulting chiral spinor metric $\varepsilon_{\rm chiral}$ is imaginary. The chiral spinor metric can be made real by
scaling it by a factor of $i$,

\[ \varepsilon_{\rm chiral} \equiv i X^{-1\top} \varepsilon_{\rm Dirac} X^{-1} = I\sigma_2 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix} . \tag{39.36} \]

The normalization is chosen such that $\varepsilon$ in either the Dirac or chiral representations is real and orthogonal.
Its square is minus the unit matrix,

\[ \varepsilon^{-1} = \varepsilon^\top , \quad \varepsilon^2 = -1 . \tag{39.37} \]

In both Dirac and chiral representations, commuting the spinor metric $\varepsilon$ through the orthonormal basis
vectors $\gamma_m$ converts them to minus their transposes,

\[ \gamma_m \varepsilon = - \varepsilon \gamma_m^\top . \tag{39.38} \]

The condition (39.34) implies that the spinor tensor $\varepsilon$ is invariant under Lorentz transformations,

\[ R: \ \varepsilon \rightarrow R^\top \varepsilon R = \varepsilon \overline{R} R = \varepsilon . \tag{39.39} \]

The components of the spinor tensor $\varepsilon$ define the antisymmetric spinor metric $\varepsilon_{ab}$,

\[ \epsilon_a^\top \varepsilon \epsilon_b = \varepsilon_{ab} . \tag{39.40} \]

Notice that the spinor metric tensor $\varepsilon_{ab}$ is non-vanishing only between like-chiral indices $ab$.
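The defining properties of the spinor metric can be verified numerically in the chiral representation. The sketch below rests on two assumptions that are not spelled out in this section: the Dirac-representation γ_2 is taken to be the off-diagonal block matrix with Pauli σ_2, so that the Dirac spinor metric is iγ_2, and X is taken to be (1/√2)(1 + i τ_1). The sample rotor, a boost along the 1-axis followed by a rotation about the 2-axis, is an arbitrary choice.

```python
import numpy as np

one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
s1, s2, s3 = (np.array(m, dtype=complex) for m in
              ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]))

# Chiral representation (39.13), pseudoscalar and boost bivectors.
g = [np.block([[zero, one], [-one, zero]])] + \
    [np.block([[zero, s], [s, zero]]) for s in (s1, s2, s3)]
I = g[0] @ g[1] @ g[2] @ g[3]
sigma = [g[0] @ g[a] for a in (1, 2, 3)]

# ASSUMPTIONS: Dirac spinor metric i gamma_2, and the matrix X.
eps_dirac = 1j * np.block([[zero, s2], [s2, zero]])
X = np.block([[one, 1j * one], [1j * one, one]]) / np.sqrt(2)
X_inv = np.linalg.inv(X)
eps_unscaled = X_inv.T @ eps_dirac @ X_inv   # imaginary
eps = 1j * eps_unscaled                      # real chiral metric

# Sample rotor: boost along 1, then rotation about 2, and its reverse.
theta, phi = 0.4, 1.1
unit = np.eye(4)
boost = np.cosh(theta / 2) * unit + np.sinh(theta / 2) * sigma[0]
boost_rev = np.cosh(theta / 2) * unit - np.sinh(theta / 2) * sigma[0]
rot = np.cos(phi / 2) * unit - np.sin(phi / 2) * (I @ sigma[1])
rot_rev = np.cos(phi / 2) * unit + np.sin(phi / 2) * (I @ sigma[1])
R = rot @ boost
R_rev = boost_rev @ rot_rev

expected = np.array([[0, 1, 0, 0], [-1, 0, 0, 0],
                     [0, 0, 0, 1], [0, 0, -1, 0]])
checks = {
    "unscaled_is_minus_i_I_sigma2": np.allclose(eps_unscaled, -1j * (I @ sigma[1])),
    "explicit_matrix": np.allclose(eps, expected),
    "square_is_minus_one": np.allclose(eps @ eps, -unit),
    "orthogonal": np.allclose(np.linalg.inv(eps), eps.T),
    "gamma_rule": all(np.allclose(gm @ eps, -eps @ gm.T) for gm in g),
    "rotor_rule": np.allclose(R.T @ eps, eps @ R_rev),
    "unit_rotor": np.allclose(R_rev @ R, unit),
}
print(checks)
```

The `rotor_rule` entry is condition (39.34), and together with `unit_rotor` it gives the Lorentz invariance of the metric.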
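39.5.2 Row basis spinors


It is convenient to use the symbol $\epsilon_a\cdot$ with a trailing dot, symbolic of the trailing $\varepsilon$, to denote the row spinor
$\epsilon_a^\top \varepsilon$,

\[ \epsilon_a \cdot \equiv \epsilon_a^\top \varepsilon . \tag{39.41} \]

The motivation for the trailing dot notation is equation (39.45) below. The four row spinors

\[ \epsilon_a \cdot \equiv \{ \epsilon_{V\uparrow}\cdot , \ \epsilon_{U\downarrow}\cdot , \ \epsilon_{U\uparrow}\cdot , \ \epsilon_{V\downarrow}\cdot \} \tag{39.42} \]

provide a basis for row spinors. The boost and spin weights of the row basis spinors are in accord with their
covariant indices: basis spinors with a $V$ index have boost weight $+\tfrac{1}{2}$, while basis spinors with a $U$ index
have boost weight $-\tfrac{1}{2}$. Likewise basis spinors with a $\uparrow$ index have spin weight $+\tfrac{1}{2}$, while basis spinors with
a $\downarrow$ index have spin weight $-\tfrac{1}{2}$. The row spinors $\epsilon_a\cdot$ Lorentz transform as

\[ R: \ \epsilon_a \cdot = \epsilon_a^\top \varepsilon \rightarrow \epsilon_a^\top R^\top \varepsilon = \epsilon_a^\top \varepsilon \overline{R} = \epsilon_a \cdot \overline{R} . \tag{39.43} \]

In the chiral representation (39.13) the row basis spinors $\epsilon_a\cdot$ are the row spinors

\[ \epsilon_{V\uparrow}\cdot = ( 0 \ 1 \ 0 \ 0 ) , \quad \epsilon_{U\downarrow}\cdot = ( -1 \ 0 \ 0 \ 0 ) , \quad \epsilon_{U\uparrow}\cdot = ( 0 \ 0 \ 0 \ 1 ) , \quad \epsilon_{V\downarrow}\cdot = ( 0 \ 0 \ -1 \ 0 ) . \tag{39.44} \]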


39.5.3 Inner products of basis spinors 


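The product of the row spinor $\epsilon_a\cdot$ with the column spinor $\epsilon_b$ defines their inner product, or scalar product,
which equals the spinor metric $\varepsilon_{ab}$ in accordance with equation (39.40),

\[ \epsilon_a \cdot \epsilon_b = \varepsilon_{ab} . \tag{39.45} \]

Equation (39.45) motivates the trailing dot notation for the row spinor. The scalar product is antisymmetric,

\[ \epsilon_a \cdot \epsilon_b = - \epsilon_b \cdot \epsilon_a . \tag{39.46} \]

In the chiral representation, the non-zero components of the scalar product are explicitly, equation (39.36),

\[ \epsilon_{V\uparrow} \cdot \epsilon_{U\downarrow} = - \epsilon_{U\downarrow} \cdot \epsilon_{V\uparrow} = 1 , \quad \epsilon_{U\uparrow} \cdot \epsilon_{V\downarrow} = - \epsilon_{V\downarrow} \cdot \epsilon_{U\uparrow} = 1 . \tag{39.47} \]

The scalar product (39.45) is a Lorentz scalar,

\[ R: \ \epsilon_a \cdot \epsilon_b \rightarrow \epsilon_a \cdot \overline{R} R \epsilon_b = \epsilon_a \cdot \epsilon_b . \tag{39.48} \]

Thus the spinor metric $\varepsilon_{ab}$ is Lorentz invariant, just like the Minkowski metric $\eta_{mn}$.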
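The row spinors and the table of inner products follow directly from the chiral spinor metric. The sketch below takes that metric to be the real antisymmetric matrix of equation (39.36), with rows and columns ordered V↑, U↓, U↑, V↓; the string labels are illustrative.

```python
import numpy as np

# Chiral spinor metric, ordered {V_up, U_down, U_up, V_down}.
eps = np.array([[0, 1, 0, 0],
                [-1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, -1, 0]], dtype=float)
names = ["V_up", "U_down", "U_up", "V_down"]

columns = np.eye(4)       # column k is the k'th column basis spinor
rows = columns.T @ eps    # row k is the row spinor eps_a^T eps
inner = rows @ columns    # inner[a, b] = eps_a . eps_b

nonzero = {(names[a], names[b]): inner[a, b]
           for a in range(4) for b in range(4) if inner[a, b] != 0}
print(rows)
print(nonzero)
```

Only pairs of like chirality (both among the first two, or both among the last two) have a non-zero scalar product.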


39.5.4 Lowering and raising spinor indices 


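The antisymmetric spinor metric $\varepsilon_{ab}$ is given in the chiral representation by equation (39.36). The inverse
metric $\varepsilon^{ab}$ is defined by $\varepsilon^{ab} \varepsilon_{bc} = \delta^a_c$. The spinor metric and its inverse satisfy

\[ \varepsilon_{ab} = - \varepsilon_{ba} = - \varepsilon^{ab} = \varepsilon^{ba} . \tag{39.49} \]

Indices on a spinor tensor are lowered and raised by pre-multiplying by the metric $\varepsilon_{ab}$ and its inverse $\varepsilon^{ab}$.
The contravariant components $\epsilon^a$ of the column basis spinors, satisfying $\epsilon^a = \varepsilon^{ab} \epsilon_b$, are

\[ \epsilon^{V\uparrow} = - \epsilon_{U\downarrow} , \quad \epsilon^{U\downarrow} = \epsilon_{V\uparrow} , \quad \epsilon^{U\uparrow} = - \epsilon_{V\downarrow} , \quad \epsilon^{V\downarrow} = \epsilon_{U\uparrow} . \tag{39.50} \]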


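For example, $\epsilon^{V\uparrow} = \varepsilon^{V\uparrow U\downarrow} \epsilon_{U\downarrow} = - \epsilon_{U\downarrow}$. Post-multiplying by the metric or its inverse yields a result of
opposite sign, $\epsilon^a = \varepsilon^{ab} \epsilon_b = - \epsilon_b \varepsilon^{ba}$. The contravariant components $\epsilon^a\cdot$ of the row basis spinors satisfy the
same relations (39.50) with a trailing dot appended on left and right hand sides. The scalar products of
contravariant with covariant basis spinors form the unit matrix,

\[ \epsilon^a \cdot \epsilon_b = - \epsilon_b \cdot \epsilon^a = \delta^a_b . \tag{39.51} \]

The scalar products of contravariant basis spinors are

\[ \epsilon^a \cdot \epsilon^b = \varepsilon^{ba} = - \varepsilon^{ab} . \tag{39.52} \]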


39.5.5 Scalar products of Dirac spinors 


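A general row Dirac spinor $\psi\cdot$ is a complex (with respect to $i$) linear combination of the 4 row basis spinors,

\[ \psi \cdot \equiv \psi^\top \varepsilon = \psi^a \epsilon_a \cdot . \tag{39.53} \]

It Lorentz transforms as

\[ R: \ \psi \cdot \rightarrow \psi \cdot \overline{R} . \tag{39.54} \]

A row spinor $\psi\cdot$ transforms like a reverse rotor.

The scalar product of a row spinor $\psi\cdot = \psi^a \epsilon_a\cdot$ with a column spinor $\chi = \chi^a \epsilon_a$ may be written variously

\[ \psi \cdot \chi = \psi^\top \varepsilon \chi = \psi^a \epsilon_a \cdot \chi^b \epsilon_b = \varepsilon_{ab} \psi^a \chi^b = \psi^a \chi_a = - \psi_a \chi^a = \varepsilon^{ba} \psi_a \chi_b . \tag{39.55} \]

Notice that when the scalar product $\psi \cdot \chi$ is written in the contracted form $\psi^a \chi_a$, the first index is raised
and the second is lowered. An additional minus sign appears if the first index is lowered and the second is
raised. Flipping the indices on the expansion $\psi^a \epsilon_a$ of a spinor in components similarly changes the sign,

\[ \psi = \psi^a \epsilon_a = \psi^a \varepsilon_{ab} \epsilon^b = - \psi_a \epsilon^a . \tag{39.56} \]

The components $\psi^a$ of a column spinor $\psi$ can be projected out by pre-multiplying by the row basis spinor
$\epsilon^a\cdot$,

\[ \epsilon^a \cdot \psi = \epsilon^a \cdot \psi^b \epsilon_b = \delta^a_b \psi^b = \psi^a . \tag{39.57} \]

The components $\psi^a$ of a row spinor $\psi\cdot$ can be projected out by post-multiplying by minus the column basis
spinor $\epsilon^a$,

\[ - \psi \cdot \epsilon^a = - \psi^b \epsilon_b \cdot \epsilon^a = \delta^a_b \psi^b = \psi^a . \tag{39.58} \]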
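The sign conventions for raising and lowering spinor indices are easy to get wrong, so a numerical cross-check may help. The sketch below assumes the chiral spinor metric of equation (39.36), lowers indices by pre-multiplying with the metric as described above, and compares several equivalent forms of the scalar product for two random spinors. The random components and variable names are illustrative only.

```python
import numpy as np

eps_lo = np.array([[0, 1, 0, 0],
                   [-1, 0, 0, 0],
                   [0, 0, 0, 1],
                   [0, 0, -1, 0]], dtype=complex)   # eps_{ab}
eps_up = np.linalg.inv(eps_lo)   # eps^{ab}, from eps^{ab} eps_{bc} = delta^a_c

rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)   # components psi^a
chi = rng.normal(size=4) + 1j * rng.normal(size=4)   # components chi^a

psi_lo = eps_lo @ psi   # psi_a = eps_{ab} psi^b
chi_lo = eps_lo @ chi   # chi_a = eps_{ab} chi^b

dot = psi @ eps_lo @ chi   # psi . chi = eps_{ab} psi^a chi^b
forms = {
    "psi^a chi_a": psi @ chi_lo,
    "-psi_a chi^a": -(psi_lo @ chi),
    "eps^{ba} psi_a chi_b": np.einsum("ba,a,b->", eps_up, psi_lo, chi_lo),
}
inverse_is_minus_metric = np.allclose(eps_up, -eps_lo)
antisymmetric = np.isclose(dot, -(chi @ eps_lo @ psi))
print({name: bool(np.isclose(value, dot)) for name, value in forms.items()})
```

Because the metric is antisymmetric, swapping which index is raised flips the sign, which is why the second form carries an explicit minus.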
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39.6.1 Outer products of basis spinors 


A row spinor €a: multiplied by a column spinor €, yields their scalar product. In the opposite order, a 
column spinor €a multiplied by a row spinor €,- yields their outer product. The outer product €,€,- Lorentz 
transforms like a multivector in the spacetime algebra, 


R: €&€-= eee = Re,e, R'e = Re,eleR = Ree R. (39.59) 


The trailing dot on the outer product €,€,- is symbolic of the trailing £, necessary to convert the spinor 
tensor Ene, into an object that transforms like a multivector. 

The products of the 4 column basis spinors €a with the 4 row basis spinors €p: form 16 outer products. 
All 16 outer products are non-vanishing, and their algebra is isomorphic to the 4D spacetime algebra of 
multivectors. Unlike the spacetime algebra, the outer product contains both antisymmetric and symmetric 
products. 

The 16 outer products divide into 8 outer products of spinors of like chirality (two right, or two left), 
and 8 outer products of spinors of opposite chirality (one right, one left). The outer products of spinors of 
like chirality yield the 8 even-grade elements of the spacetime algebra, while outer products of spinors of 
opposite chirality yield the 8 odd-grade elements of the spacetime algebra. The 8 even elements preserve 
chirality (they transform a spinor of given chirality to another of like chirality), while the 8 odd elements 
flip chirality (they transform a spinor of given chirality to another of opposite chirality). 

In the chiral representation (39.13), the 8 outer products of basis spinors of like chirality map to even 
multivectors of the spacetime algebra as follows. The boost and spin weights of the left and right hand 
sides of each of equations (39.60)—(39.63) below match, as they should. The antisymmetric outer products 
of right-handed spinors form a right-handed scalar singlet, 


levy, evt]: = 3(1 +5) - (39.60) 


The trailing dot on the commutator indicates that the right partner of each product is a row spinor, 
levy, evr]: = EV LEV: EVtEt, - Similarly the antisymmetric outer products of left-handed spinors form a 
left-handed scalar singlet, 


levy, evt]: = 3(1— 7s) - (39.61) 
The symmetric outer products of right-handed spinors form a triplet of right-handed bivectors, 
{evp evt} = w+, {evp ev, = 3(Yu-V4+-), fev, eu} = Mu- - (39.62) 
The symmetric outer products of left-handed spinors form a triplet of left-handed bivectors, 
{eup eut} = Yat, deun ev) =- 3l +74); {evo evii = W- - (39.63) 


In the chiral representation (39.13), the 8 outer products of basis spinors of opposite chirality map to odd 
multivectors of the spacetime algebra as follows. Again, the boost and spin weights of the left and right hand 
sides of each of equations (39.64)-(39.65) below match, as they should. The 4 symmetric outer products of 
right- with left-handed spinors yield the 4 Newman-Penrose basis vectors, 


{évt, ev} = -zW , {eu eut} = Bru » €vt,eut}: = Y+ » {ev ev} = == . (39.64) 


The 4 antisymmetric outer products of right- with left-handed spinors yield the 4 Newman-Penrose basis 
pseudovectors, 


levt evi] = =w, levi ev = Zw, levn eud = Few» levy, evs] = -57 - 
(39.65) 
The trace of the outer product of a pair of basis spinors gives their scalar product (note that the 1 on the 
right hand sides of equations (39.60) and (39.61) is the unit matrix, whose trace is 4), 

    \mathrm{Tr}\, \epsilon_a \epsilon_b \cdot = \epsilon_b \cdot \epsilon_a = \varepsilon_{ba} .    (39.66)


The expansion of the 16 outer products €,€,- of spinors in terms of the 16 basis elements ym of the 
spacetime algebra, and vice versa, define the matrix of coefficients yM and its inverse y%, 


Ea€b = yab YM > MS YM EnEn" - (39.67) 
The coefficients yf and 7%? are 
Yas = Fen yea, =E yM. (39.68) 


The coefficients in the chiral representation in terms of Newman-Penrose basis multivectors can be read off 
from equations (39.60)—(39.65), and are all real. 


Exercise 39.2. Consistency of spinor and multivector scalar products. Confirm that the spinor and 
multivector scalar products are consistent. This exercise is similar to Exercise 38.1. 

Solution. The basis vectors γ_m are equivalent to outer products of Dirac spinors in accordance with 
equation (39.64). For example, the scalar product of the multivectors γ_v and γ_u is 


=Y ` Yu = — F (You + Yur) 
= {evp evi}: {eu Eut}: + {eup Eut}: {Evt evi} 
= eyņ (ey, : €ut)Eu,: + evi (evr euyeut: + eu, (Eur : ev evr: + eur (Evy: evt)evy: 
= — €yr€uy: + Evyeut: + EulEevt: — EUTEV} : 
= lev, evt]: + levy, ev]: 
= (1+ ys) + 4(1-— 15) 
, (39.69) 


= 


the fourth step of which invokes the spinor scalar product (39.47), and the penultimate step is from the 
equivalences (39.60) and (39.61). The result agrees with the multivector scalar product (39.4). 
Concept question 39.3. Chiral scalar. A scalar field has no spin. How then can a scalar field have 
chirality? Answer. A chiral scalar is a sum of a scalar and a pseudoscalar. For example, a right-handed 
chiral scalar is 


pr =l +y), (39.70) 


where φ is a complex scalar. The chiral operator γ_5 is not a scalar, but rather a totally antisymmetric tensor 
of rank 4. The Newman-Penrose expression (39.114) for γ_5 shows that it has zero boost and spin weight. 


39.6.2 The 4D super spacetime algebra 


The super spacetime algebra comprises 4 distinct species of objects: true scalars, column spinors, row spinors, 
and multivectors. In a matrix representation, they are complex (with respect to i) matrices with dimensions 
(rows × columns) 1×1, 4×1, 1×4, and 4×4 respectively. The super spacetime algebra consists of arbitrary sums 
and products of all 4 species. 

The true scalars are just complex numbers. A column spinor ψ is a complex linear combination of column 
basis spinors ϵ_a, 

    \psi = \psi^a \epsilon_a ,    (39.71)

while a row spinor ψ· is a complex linear combination of row basis spinors ϵ_a·, 

    \psi \cdot = \psi^a \epsilon_a \cdot .    (39.72)

A multivector a is a complex linear combination of outer products of the column and row basis spinors, 

    a = a^{ab} \epsilon_a \epsilon_b \cdot .    (39.73)


Linearity and the transformation law (39.59) imply that the algebra of sums and products of outer products 
of spinors is isomorphic to the spacetime algebra. 

As seen in §39.5.3 and §39.6.1, a column spinor ψ and a row spinor χ· can be multiplied in either order, 
yielding an inner product which is a true scalar, and an outer product which is a multivector. However, 
a column spinor cannot be multiplied by a column spinor, and likewise a row spinor cannot be multiplied 
by a row spinor, as is manifestly true in a matrix representation. Rather than prohibit multiplication, it is 
advantageous (because it facilitates interpretation of the column and row spinors as creation and annihilation 
operators) to assert that the product of a column spinor with a column spinor is zero, and the product of a 
row spinor with a row spinor is zero, 

    \psi \chi = 0 , \quad \psi \cdot \chi \cdot = 0 .    (39.74)

Similarly, a multivector a can only pre-multiply a column spinor ψ, and can only post-multiply a row spinor 
ψ·, as is again manifestly true in a matrix representation. Thus a multivector a post-multiplying a column 
spinor or pre-multiplying a row spinor yields zero, 

    \psi a = 0 , \quad a ( \psi \cdot ) = 0 .    (39.75)
In general, a sequence of products of spinors yields a non-zero result provided that they alternate between 
column spinor and row spinor, 


Px- or Pp xg. (39.76) 

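Both product sequences are associative, 

    \psi \chi \cdot \varphi = ( \psi \chi \cdot ) \varphi = \psi ( \chi \cdot \varphi ) ,    (39.77a)
    \psi \cdot \chi \varphi \cdot = ( \psi \cdot \chi ) \varphi \cdot = \psi \cdot ( \chi \varphi \cdot ) .    (39.77b)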


One of the advantages of the trailing dot notation is that it makes the directionality of spinor multiplication, 
and the corresponding associative law, transparent. A multivector a is equivalent to an outer product of 
spinors, so products such as 

    \psi \cdot a \chi    (39.78)

are admissible, and in general non-vanishing. 

The trace of an outer product of spinors is a true scalar, 

    \mathrm{Tr}\, \psi \chi \cdot = \psi^a \chi^b \varepsilon_{ba} = - \psi \cdot \chi = \chi \cdot \psi ,    (39.79)

the last step of which assumes that the coefficients ψ^a and χ^b are ordinary commuting complex numbers, 
equation (39.120). 
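
The trace identity and the associative law can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming (as in equation (39.99)) that a row spinor is the transposed column spinor times the spinor metric, χ· = χ^⊤ ε. The matrix EPS below is only a hypothetical antisymmetric stand-in for the spinor metric, not the matrix used in the text, and the helper names are illustrative.

```python
# Hypothetical stand-in for the antisymmetric spinor metric (an assumption,
# chosen only so that EPS^T = -EPS; it is not the book's matrix).
EPS = [
    [0, 1, 0, 0],
    [-1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, -1, 0],
]


def row(chi):
    """Row spinor chi. = chi^T EPS, as a list of 4 components."""
    return [sum(chi[a] * EPS[a][j] for a in range(4)) for j in range(4)]


def inner(chi, psi):
    """Scalar product chi . psi = chi^T EPS psi (a true scalar)."""
    return sum(r * p for r, p in zip(row(chi), psi))


def outer(psi, chi):
    """Outer product psi chi. (a 4x4 matrix, i.e. a multivector)."""
    r = row(chi)
    return [[psi[i] * r[j] for j in range(4)] for i in range(4)]


def apply(m, phi):
    """Matrix m acting on the column spinor phi."""
    return [sum(m[i][j] * phi[j] for j in range(4)) for i in range(4)]


def trace(m):
    return sum(m[i][i] for i in range(4))


# Arbitrary spinors with commuting (here integer) components.
psi = [1, 2, 3, 4]
chi = [2, -1, 0, 5]
phi = [3, 1, -2, 2]

# Tr(psi chi.) = chi . psi
trace_ok = trace(outer(psi, chi)) == inner(chi, psi)
# (psi chi.) phi = psi (chi . phi)
assoc_ok = apply(outer(psi, chi), phi) == [x * inner(chi, phi) for x in psi]
# psi . chi = - chi . psi for commuting components
antisym_ok = inner(psi, chi) == -inner(chi, psi)
```

Integer components keep every comparison exact. The antisymmetry check holds only because the components commute, which is the assumption flagged in the text.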


39.6.3 Fierz identities 


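The associative law and the scalar product make it straightforward to simplify long strings of products 
of spinors and multivectors, a process known in quantum field theory as Fierz rearrangement. 

Let a = a^{ab} ϵ_a ϵ_b· and b = b^{cd} ϵ_c ϵ_d· be two multivectors expressed as a sum of outer products of spinors. 
Their product is the multivector 

    a b = a^{ab} \epsilon_a \epsilon_b \cdot b^{cd} \epsilon_c \epsilon_d \cdot = \epsilon_a a^{ab} \varepsilon_{bc} b^{cd} \epsilon_d \cdot = \epsilon_a a^{ab} b_b{}^d \epsilon_d \cdot .    (39.80)

A sequence of multivectors sandwiched by spinors simplifies as 

    \psi \cdot a b \chi = \psi^e \epsilon_e \cdot a^{ab} \epsilon_a \epsilon_b \cdot b^{cd} \epsilon_c \epsilon_d \cdot \chi^f \epsilon_f = \psi^e \varepsilon_{ea} a^{ab} \varepsilon_{bc} b^{cd} \varepsilon_{df} \chi^f = \psi_a a^{ab} b_b{}^c \chi_c .    (39.81)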


39.7 Charge conjugation 


The super spacetime algebra possesses a discrete transformation, called charge conjugation (or simply conjugation, 
when there is no ambiguity), denoted C, that transforms a particle into its antiparticle (Bjorken and 
Drell, 1964, §5.2). The charge-conjugate Dirac spinor ψ̄ is defined by equation (39.91) below (Bjorken and 
Drell (1964) denote the charge conjugate by ψ_c). The conjugate spinor ψ̄ has the defining properties that 
(a) its components are complex conjugates of those of the parent spinor ψ, and (b) it Lorentz transforms in 
the same way as ψ. 




39.7.1 Conjugation operator C 


Consider the conjugation operator C with the defining property that commutation with it converts any 
Lorentz rotor R in the chiral or Dirac representations to its complex conjugate (with respect to i), 


CR*=RC. (39.82) 


Note that the complex conjugate R* of a Lorentz rotor R is also a Lorentz rotor, since a Lorentz rotor 
R is a real (with respect to i) linear combination of even orthonormal basis multivectors of the spacetime 
algebra. In the Dirac representation (14.103), a necessary and sufficient condition for (39.82) to hold is that C 
commutes with I and σ_2, and anticommutes with σ_1 and σ_3. This requires that in the Dirac representation 
C is proportional to σ_2 with a proportionality factor that could be some arbitrary complex (with respect to 
i and/or I) number. To be consistent with standard Dirac theory, the conjugation operator C is taken to be 
the real unitary matrix 

    C = -\sigma_2 = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} .    (39.83)


Notwithstanding the equality of C and −σ_2 in the Dirac representation, the conjugation operator C is defined 
to transform not as an element of the geometric algebra, but rather as 

    R : C \to R C R^{-*} ,    (39.84)

in accordance with the defining condition (39.82), where R^{−*} denotes the inverse of the complex conjugate R^*. 
Note that if the Lorentz rotor R were unitary, then R C R^{−*} would equal R C R^⊤; but although spatial 
rotations are unitary, Lorentz boosts are not. The spinor tensor that Lorentz transforms as (39.84) and 
remains invariant under that transformation is precisely the conjugation operator C. 

The conjugation matrix (39.83) in the Dirac representation translates into the chiral representation (39.13) 
as C_chiral = X C_Dirac X^{−*}, which happens to be the same matrix as in the Dirac representation, C_chiral = 
C_Dirac. However, to compensate for the extra factor of i introduced into the definition (39.36) of the chiral 
spinor metric ε_chiral, it is necessary to introduce an extra factor of −i in the definition of the chiral conjugation 
matrix C_chiral, 

    C_{\rm chiral} \equiv -i X C_{\rm Dirac} X^{-*} = -i C_{\rm Dirac} = \begin{pmatrix} 0 & 0 & 0 & i \\ 0 & 0 & -i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix} .    (39.85)

The compatibility of the normalizations of ε and C is necessary to ensure that the scalar product ψ̄·ψ of a 
spinor with its conjugate is real, equation (39.148). Note that the conjugation matrix (39.85) in the chiral 
representation (39.13) is C_chiral = i I γ_2, not −σ_2. 

In both Dirac and chiral representations (39.83) and (39.85), the conjugation operator is symmetric and 
unitary, 

    C^\top = C , \quad C C^\dagger = C C^* = 1 .    (39.86)

In both Dirac and chiral representations, commuting the conjugation operator C through the orthonormal 
basis vectors γ_m converts them to their complex (with respect to i) conjugates, 

    C \gamma_m^* = \gamma_m C .    (39.87)
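
The stated properties of C can be verified by direct matrix arithmetic. The sketch below assumes a Dirac representation with signature −+++ in which γ_0 = i diag(1, 1, −1, −1) and γ_a has the Pauli matrix σ_a in both off-diagonal blocks; this is an assumption about (14.103), which is not reproduced in this section. The matrix C is the real symmetric matrix of equation (39.83) as coded below, and all helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def conj(a):
    """Entrywise complex conjugate (with respect to i)."""
    return [[complex(x).conjugate() for x in r] for r in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def off_diagonal(s):
    """4x4 matrix with the 2x2 block s in both off-diagonal positions."""
    return [[0, 0] + s[0], [0, 0] + s[1], s[0] + [0, 0], s[1] + [0, 0]]


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Assumed Dirac representation: gamma_0^2 = -1, gamma_a^2 = +1.
GAMMA0 = [[1j, 0, 0, 0], [0, 1j, 0, 0], [0, 0, -1j, 0], [0, 0, 0, -1j]]
GAMMA = [GAMMA0] + [off_diagonal(s) for s in PAULI]

# Conjugation operator in the Dirac representation, equation (39.83).
C = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

symmetric = transpose(C) == C
unitary = matmul(C, transpose(conj(C))) == IDENTITY

# C gamma_m^* = gamma_m C for each orthonormal basis vector, equation (39.87).
vectors_ok = all(matmul(C, conj(g)) == matmul(g, C) for g in GAMMA)

# The same relation for every bivector gamma_m gamma_n; a Lorentz rotor is a
# real combination of even elements, so this underlies C R^* = R C.
bivectors = [matmul(GAMMA[m], GAMMA[n])
             for m in range(4) for n in range(m + 1, 4)]
bivectors_ok = all(matmul(C, conj(b)) == matmul(b, C) for b in bivectors)
```

All entries are 0, ±1 or ±i, so the comparisons are exact. A different but equivalent choice of representation would need a correspondingly transformed C.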


In both Dirac and chiral representations, commuting C through the spinor metric ε converts the former to 
minus its complex conjugate, 

    C \varepsilon = - \varepsilon C^* .    (39.88)


39.7.2 Conjugate spinor 


The complex conjugate ψ^* of a Dirac spinor ψ = ψ^a ϵ_a is defined to be the spinor with complex-conjugated 
(with respect to i) coefficients in the Dirac or chiral matrix representation of the spinor, 

    \psi^* \equiv \psi^{a*} \epsilon_a .    (39.89)

In effect, the basis spinors ϵ_a are taken to be real in the Dirac or chiral representations. The operation (39.89) 
of complex conjugation of a spinor is representation-dependent, as is evident from the fact that the unitary 
matrix X, equation (39.12), that transforms between Dirac and chiral representations is complex. By contrast, 
the conjugation operation (39.91) below is representation-independent. Complex conjugation leaves the boost 
and spin of a spinor unchanged. Since a spinor ψ Lorentz transforms under a Lorentz rotor R as ψ → Rψ, 
its complex conjugate ψ^* transforms according to the complex-conjugate representation of the γ-matrices, 

    R : \psi^* \to ( R \psi )^* = R^* \psi^* .    (39.90)


The conjugate Dirac spinor ψ̄ is defined by 

    \bar\psi \equiv C \psi^* ,    (39.91)

where C is the conjugation operator defined in the Dirac or chiral representations by equations (39.83) 
and (39.85). The conjugation operator C is by construction Lorentz invariant, so the conjugate spinor ψ̄ 
Lorentz transforms as 

    R : \bar\psi = C \psi^* \to C R^* \psi^* = R C \psi^* = R \bar\psi ,    (39.92)

that is, the conjugate spinor ψ̄ Lorentz transforms in the same way as the spinor ψ. The middle expression 
of equation (39.92) is C R^* ψ^* = C (Rψ)^*, which is the conjugate of Rψ, so 

    \bar{( R \psi )} = R \bar\psi ,    (39.93)

that is, the operations of conjugation and Lorentz transformation commute. 
The symmetry of the conjugation operator, C = C^⊤, implies that conjugating a Dirac spinor ψ twice 
recovers the original spinor, 

    \bar{\bar\psi} = C ( C \psi^* )^* = C C^* \psi = C C^\dagger \psi = \psi .    (39.94)

If ψ = ψ^a ϵ_a, then the conjugate spinor ψ̄ is 

    \bar\psi = \psi^{a*} \bar\epsilon_a ,    (39.95)

where the conjugate basis spinors ϵ̄_a are defined by 

    \bar\epsilon_a \equiv C \epsilon_a .    (39.96)


In the Dirac representation the conjugate basis spinors €, are, from the expression (39.83) for C, 


{Ents Ents Eut Evy} = {Eup Eut Em, Ent - (39.97 


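In the chiral representation the conjugate basis spinors ϵ̄_a are, from the expression (39.85) for C, 

    \{ \bar\epsilon_{V\uparrow} , \bar\epsilon_{U\downarrow} , \bar\epsilon_{U\uparrow} , \bar\epsilon_{V\downarrow} \} = \{ i \epsilon_{V\downarrow} , -i \epsilon_{U\uparrow} , -i \epsilon_{U\downarrow} , i \epsilon_{V\uparrow} \} .    (39.98)

Equation (39.98) shows that conjugation flips spin, but leaves boost unchanged. 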


39.7.3 Row conjugate spinor 


In both Dirac (14.102) and chiral (39.13) representations, the row conjugate spinor ψ̄· corresponding to the 
column conjugate spinor ψ̄ is 

    \bar\psi \cdot = \bar\psi^\top \varepsilon = \psi^\dagger C^\top \varepsilon = -i \psi^\dagger \gamma_0 .    (39.99)

The row conjugate spinor ψ̄· is commonly called the adjoint spinor. The row conjugate spinor ψ̄· equals the 
reverse spinor defined by equation (14.119). Note that the column conjugate spinor ψ̄ is not the same as 
the conventional adjoint spinor; rather the conventional adjoint spinor is the row conjugate spinor ψ̄·. 
Equation (39.99) implies that 

    C^\top \varepsilon = -i \gamma_0 .    (39.100)

Equation (39.100) holds in both Dirac and chiral representations, but in fact it must be true in any satisfactory 
representation of Dirac spinors governed by the Dirac Lagrangian (41.29), in order that the Dirac number 
current density n^0 = i ψ̄·γ^0 ψ, equation (41.19), equal a positive number ψ^† ψ. 

The spinor metric ε and conjugation operator C may be regarded as being defined by their actions (39.38) 
and (39.87) on the Minkowski basis vectors γ_m, namely γ_m^⊤ = −ε γ_m ε^{−1} and γ_m^* = C γ_m C^{−1}. If 
equation (39.100) holds, as it must do, and if in addition C is symmetric, C^⊤ = C, as it must be if C is unitary 
and the double conjugate of a spinor is itself, as it is in the Dirac representation, then the Hermitian conjugates 
of the basis vectors γ_m satisfy 

    \gamma_m^\dagger = ( \gamma_m^\top )^* = - C \varepsilon \gamma_m \varepsilon^{-1} C^{-1} = - ( C^\top \varepsilon ) \gamma_m ( C^\top \varepsilon )^{-1} = - \gamma_0 \gamma_m \gamma_0^{-1} = \gamma_m^{-1} .    (39.101)

Equation (39.101) shows that the γ_m are unitary, which is the condition (14.100) originally adopted for the Dirac 
γ-matrices. 
39.7.4 Conjugate multivector 


The complex conjugate (with respect to i) a^* of a multivector a = a^M γ_M is defined to be, in either the 
Dirac or chiral representation (and in either an orthonormal or Newman-Penrose basis), 

    a^* \equiv a^{M*} \gamma_M^* .    (39.102)

Since a Lorentz transforms under a Lorentz rotor R as a → R a R̄, where R̄ is the reverse of R, its complex 
conjugate a^* transforms according to the complex conjugate representation of the γ-matrices, 

    R : a^* \to ( R a \overline{R} )^* = R^* a^* \overline{R}^* .    (39.103)

So defined, complex conjugation is multiplicative over multivectors and spinors, 

    ( a \psi )^* = a^* \psi^* ,    (39.104)

and consistent with the spacetime algebra in the sense that the complex conjugate of a multivector that is 
an outer product of spinors is the outer product of the complex conjugate spinors, 

    ( \psi \chi \cdot )^* = \psi^* \chi^* \cdot .    (39.105)

Complex conjugation leaves the boost and spin of a multivector unchanged. 


The conjugate multivector ā (not to be confused with the reverse multivector, which is also written with an 
overbar) of a multivector a is defined to be 

    \bar{a} \equiv C a^* C^{-1} .    (39.106)

The conjugate multivector ā Lorentz transforms in the same way as the parent multivector a (here the 
overbar on R denotes the reverse), 

    R : \bar{a} \to C R^* a^* \overline{R}^* C^{-1} = R C a^* C^{-1} \overline{R} = R \bar{a} \overline{R} .    (39.107)

Conjugation is multiplicative over multivectors and spinors: the conjugate of aψ is 

    \overline{a \psi} = \bar{a} \bar\psi .    (39.108)

The conjugate of a multivector that is an outer product of spinors is minus the outer product of the conjugate 
spinors, 

    \overline{\psi \chi \cdot} = - \bar\psi \bar\chi \cdot .    (39.109)

The sign comes from the anticommutation of the conjugation operator with the spinor metric tensor, 
equation (39.88). 
If a = a^M γ_M, then the conjugate multivector ā is 

    \bar{a} = a^{M*} \bar\gamma_M ,    (39.110)

where the conjugate basis elements γ̄_M are, in either an orthonormal or Newman-Penrose basis, and in either 
the Dirac or chiral representations, 

    \bar\gamma_M \equiv C \gamma_M^* C^{-1} .    (39.111)
The conjugates of orthonormal basis elements are equal to themselves, 

    \bar\gamma_M = \gamma_M \quad \mbox{(orthonormal basis multivectors)} .    (39.112)

But the conjugates of Newman-Penrose basis vectors are not equal to themselves. Just as conjugation flips 
the spin but not boost of a spinor ψ, so also conjugation flips the spin but not boost of a multivector a. 
Conjugation flips the spin indices + ↔ − of the Newman-Penrose basis vectors γ_m, while leaving the boost 
indices v and u unchanged, 

    \bar\gamma_v = \gamma_v , \quad \bar\gamma_u = \gamma_u , \quad \bar\gamma_+ = \gamma_- , \quad \bar\gamma_- = \gamma_+ ,    (39.113)

as can be verified by direct calculation from the matrices (39.15) and (39.85). This is true in general: the 
conjugate of any Newman-Penrose basis multivector γ_M is obtained by flipping its spin indices + ↔ −. The 
chiral matrix γ_5 expressed in the Newman-Penrose tetrad is 


; T pui 
pail = E At I = — e Au NAT (39.114) 


where the imaginary factor i in the definition of γ_5 cancels against the imaginary determinant of the 
transformation from Minkowski to Newman-Penrose tetrad, leaving a real factor in the rightmost expression of 
equation (39.114). Conjugation flips the sign of the chiral operator γ_5, 

    \bar\gamma_5 = - \gamma_5 .    (39.115)


39.7.5 Real multivector 


Conventionally, a multivector a = a^M γ_M is said to be real if its conjugate is itself (the overbar here denotes 
the conjugate, not the reverse), 

    \bar{a} = a .    (39.116)

In an orthonormal basis, the conjugates of the basis elements are themselves, γ̄_M = γ_M, and a multivector 
a is then real if and only if the coefficients a^M of its expansion a = a^M γ_M in the orthonormal basis are real. 

Most classical multivectors are real. For example, derivatives are real, Lorentz rotors are real, and the classical 
electromagnetic field is real. 


39.8 Anticommutation of Dirac spinors 


The Dirac spinor Lagrangian (41.2) involves a mass term m ψ̄·ψ. The complex conjugate (with respect to i) 
of the Dirac mass term is, Exercise 39.4, 

    ( m \bar\psi \cdot \psi )^* = - m \psi \cdot \bar\psi .    (39.117)

Requiring that the Dirac mass term be real, as a real Lagrangian demands, then imposes the condition 
that the scalar product of the Dirac spinors ψ̄ and ψ be antisymmetric, 

    \bar\psi \cdot \psi = - \psi \cdot \bar\psi .    (39.118)
More generally, in the traditional Dirac theory, the scalar product of any two Dirac spinors is antisymmetric, 

    \bar\psi \cdot \chi = - \chi \cdot \bar\psi .    (39.119)

Since the scalar products of the basis Dirac spinors ϵ_a are already antisymmetric, the antisymmetric 
condition (39.119) in turn imposes the condition that the coefficients ψ^a and χ^b must be ordinary commuting 
complex numbers, 

    \psi^a \chi^b = \chi^b \psi^a .    (39.120)


The spinor scalar product is non-vanishing only between like-chiral components. Since the conjugate of a 
right-handed chiral spinor is left-handed, and vice versa, the scalar product of a pure right- or left-handed 
spinor (a Weyl spinor) with its conjugate is necessarily zero, 

    \bar\psi \cdot \psi = \bar\psi \cdot I \psi = 0 \quad \mbox{(Weyl)} .    (39.121)

Thus Weyl spinors are necessarily massless. 

If a Dirac spinor ψ is decomposed into its right- and left-handed chiral parts ψ_R and ψ_L, equation (39.30), 
then since conjugation flips chirality, the scalar product is non-vanishing only between like-chiral spinors. 
The scalar and pseudoscalar products of ψ̄ and ψ are 

    \bar\psi \cdot \psi = \bar\psi_L \cdot \psi_R + \bar\psi_R \cdot \psi_L , \quad \bar\psi \cdot I \psi = i ( \bar\psi_L \cdot \psi_R - \bar\psi_R \cdot \psi_L ) .    (39.122)

Note that ψ̄_L is right-handed, and ψ̄_R is left-handed. 


Exercise 39.4. Complex conjugate of a product of spinors and multivectors. 
1. What is the complex conjugate (with respect to i) of a product χ·aψ of a row spinor χ·, a multivector 
a of grade p, and a column spinor ψ? 
2. If a is a real multivector of grade p, is the product ψ̄·aψ real or imaginary? 
Solution. 
1. The complex conjugate of χ·aψ is 

    ( \chi \cdot a \psi )^* = \chi^* \cdot a^* \psi^* = \bar\chi^\top C^* \varepsilon a^* C^{-1} \bar\psi = - \bar\chi^\top \varepsilon C a^* C^{-1} \bar\psi = - \bar\chi \cdot \bar{a} \bar\psi .    (39.123)


The sign flip in the penultimate expression occurs because commuting the conjugation operator C 
through the spinor metric tensor ε converts C to minus its complex conjugate, equation (39.88). An 
alternative, equivalent expression follows from the antisymmetry of the spinor metric, 

    ( \chi \cdot a \psi )^* = - \bar\chi \cdot \bar{a} \bar\psi = ( \bar{a} \bar\psi ) \cdot \bar\chi = \bar\psi^\top \bar{a}^\top \varepsilon \bar\chi = (-)^p (-)^{[p/2]} \bar\psi \cdot \bar{a} \bar\chi .    (39.124)

The first equality is equation (39.123), while the second equality is from the anticommutation of Dirac 
spinors, equation (39.119). The (−)^p sign in the final expression comes from commuting a grade-p 
multivector through the spinor metric, equation (39.38), while the (−)^{[p/2]} sign comes from the reversion 
needed to undo the transposition of a grade-p multivector. The overall (−)^{p+[p/2]} sign is positive for 
scalars, trivectors, and pseudoscalars, negative for vectors and bivectors. 

2. If the multivector a is real, ā = a, then the complex conjugate of ψ̄·aψ is, from equation (39.124), 

    ( \bar\psi \cdot a \psi )^* = (-)^p (-)^{[p/2]} \bar\psi \cdot a \psi .    (39.125)

Thus ψ̄·aψ is real for scalars, trivectors, and pseudoscalars, imaginary for vectors and bivectors. 
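
The grade dependence of the sign can be tabulated directly. This short sketch evaluates (−)^{p+[p/2]} for the five grades of the 4D spacetime algebra; the function and variable names are illustrative only.

```python
def conjugation_sign(p):
    """Sign (-1)^(p + [p/2]) for a real multivector of grade p."""
    return (-1) ** (p + p // 2)


GRADES = ["scalar", "vector", "bivector", "trivector", "pseudoscalar"]

# +1 means the product is real, -1 means it is imaginary.
signs = {name: conjugation_sign(p) for p, name in enumerate(GRADES)}

for name, sign in signs.items():
    print(f"{name:13s} {'real' if sign > 0 else 'imaginary'}")
```

The exponents are 0, 1, 3, 4, 6 for grades 0 to 4, which reproduces the pattern stated in the solution.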


39.9 Discrete transformations P, T 


Besides conjugation C, the super spacetime algebra contains two other discrete transformations, parity 
inversion P, and time reversal T. Parity and time reversal are improper Lorentz transformations, which 
preserve the Minkowski metric, but which cannot be obtained by any continuous Lorentz transformation 
starting from the identity. Parity and time-reversal are examples of the geometric algebra transformation of 
reflection through an axis, §13.6. 


39.9.1 Parity inversion P 


The parity inversion operation P reverses all the spatial axes, while keeping the time axis unchanged¹, 

    P : \gamma_m \to P \gamma_m P^{-1} = \begin{cases} \gamma_m & m = 0 , \\ -\gamma_m & m = 1, 2, 3 . \end{cases}    (39.126)

Parity reversal transforms a Dirac spinor ψ as 

    P : \psi \to P \psi .    (39.127)

In any representation, the transformation (39.126) requires P to commute with the time axis γ_0 and 
anticommute with the spatial axes γ_a, a = 1, 2, 3. The only basis element of the spacetime algebra with the 
required (anti)commutation properties is the time vector γ_0, so P must equal γ_0 up to a possible scalar 
normalization: 

    P = \gamma_0 .    (39.128)

If desired, a scalar factor of i could be inserted, P = iγ_0, so that P² = 1, but the choice of phase factor 
is not essential. Parity flips boost V ↔ U while leaving spin unchanged. This makes some physical sense: 
flipping boost flips the direction of the momentum of the spinor; while spin is a form of angular momentum, 
which is unchanged by parity inversion. Parity flips chirality, the projection of spin along the direction of 
momentum. 

1 Defining parity inversion as a reversal of all spatial axes is convenient when the number of spatial dimensions is odd, as 
here. In general the spatial rotation group splits into two disjoint parts, a proper group connected continuously to the 
identity, and an improper group obtained by a reflection through any one spatial axis and a continuous rotation. Parity 
inversion can be achieved by reflecting through any odd number of spatial axes. 
39.9.2 Time reversal T 


The time-reversal operation T reverses the time axis, while keeping all the spatial axes unchanged, 

    T : \gamma_m \to T \gamma_m T^{-1} = \begin{cases} -\gamma_m & m = 0 , \\ \gamma_m & m = 1, 2, 3 . \end{cases}    (39.129)

Time reversal transforms a Dirac spinor ψ as 

    T : \psi \to T \psi .    (39.130)

In any representation, the transformation (39.129) requires T to anticommute with the time axis γ_0 and 
commute with the spatial axes γ_a, a = 1, 2, 3. The only basis element of the spacetime algebra with the 
required (anti)commutation properties is the time pseudovector Iγ_0, so T must equal that pseudovector up 
to a possible scalar normalization: 

    T = I \gamma_0 .    (39.131)

If desired, a scalar factor of −i could be inserted, T = −iIγ_0, to ensure that T² = 1 and PT = I (with 
P = iγ_0), but again the choice of phase factor is not essential. 


39.9.3 PT 


The product PT of the parity and time inversion operators, 

    PT = I ,    (39.132)

reverses all 4 spacetime axes γ_m, 

    PT : \gamma_m \to I \gamma_m I^{-1} = - \gamma_m ,    (39.133)

and transforms a Dirac spinor ψ as 

    PT : \psi \to I \psi .    (39.134)

The fact that the PT operator equals the pseudoscalar I makes physical sense. The operation of reversing 
all axes, both space and time, is Lorentz invariant. The only Lorentz-invariant basis multivectors of the 
spacetime algebra are the unit matrix and the pseudoscalar. The pseudoscalar is related to the chiral matrix 
by I = iγ_5, so the basis spinors ϵ_a in the chiral representation are PT-eigenstates. 
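
The (anti)commutation properties of P, T and PT can be confirmed by explicit matrix multiplication. The sketch below is a check under assumptions: it uses a Dirac representation with signature −+++ in which γ_0 = i diag(1, 1, −1, −1) and γ_a has σ_a in both off-diagonal blocks (the representation (14.103) is not reproduced in this section), and it takes the pseudoscalar to be I = γ_0 γ_1 γ_2 γ_3. The results do not depend on the overall sign chosen for I. Helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    return [[c * x for x in r] for r in a]


def off_diagonal(s):
    """4x4 matrix with the 2x2 block s in both off-diagonal positions."""
    return [[0, 0] + s[0], [0, 0] + s[1], s[0] + [0, 0], s[1] + [0, 0]]


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Assumed Dirac representation: gamma_0^2 = -1, gamma_a^2 = +1.
GAMMA0 = [[1j, 0, 0, 0], [0, 1j, 0, 0], [0, 0, -1j, 0], [0, 0, 0, -1j]]
GAMMA = [GAMMA0] + [off_diagonal(s) for s in PAULI]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Pseudoscalar, assumed to be the ordered product of the four basis vectors.
PSEUDO = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))

P = GAMMA[0]                   # parity, equation (39.128)
T = matmul(PSEUDO, GAMMA[0])   # time reversal, equation (39.131)


def axis_signs(op):
    """Signs s_m with op gamma_m = s_m gamma_m op, i.e. op gamma_m op^-1 = s_m gamma_m."""
    signs = []
    for g in GAMMA:
        left, right = matmul(op, g), matmul(g, op)
        if left == right:
            signs.append(1)
        elif left == scale(-1, right):
            signs.append(-1)
        else:
            signs.append(None)
    return signs


parity_signs = axis_signs(P)
time_signs = axis_signs(T)
pt_signs = axis_signs(PSEUDO)
pt_is_pseudoscalar = matmul(P, T) == PSEUDO

# Optional phases: P = i gamma_0 and T = -i I gamma_0 square to one.
P_phased = scale(1j, P)
T_phased = scale(-1j, T)
phases_ok = (matmul(P_phased, P_phased) == IDENTITY
             and matmul(T_phased, T_phased) == IDENTITY
             and matmul(P_phased, T_phased) == PSEUDO)
```

Testing `op γ_m = ± γ_m op` avoids computing matrix inverses while expressing the same condition as the conjugations (39.126), (39.129) and (39.133).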


39.10 The super geometric algebra in arbitrarily many space and time 
dimensions 
Exercise 39.5. Generalize the super spacetime algebra to an arbitrary number of space and 
time dimensions. Generalize the super spacetime algebra to an arbitrary number of dimensions, with K 
spatial dimensions, and M timelike dimensions, and a total of K+M = N dimensions. This is a generalization 
of Exercise 38.3. 

Solution. The construction described in Exercise 38.3, in which all dimensions are spatial, carries through 
unchanged for parts 1-13. After the construction is completed, modify the matrix representing any 
timelike orthonormal basis vector γ_m by multiplying the matrix by i (or −i, if preferred), 

    \gamma_m \to i \gamma_m \quad \mbox{for timelike orthonormal basis vectors } \gamma_m .    (39.135)

Propagate that modification through the basis orthonormal multivectors of the spacetime algebra. The spinor 
metric ε can be left unchanged, so that it remains real. 

As an example of this algorithm, the spin basis vectors γ_{±i} for i = 1, ..., [N/2] continue to be defined in 
terms of orthonormal vectors γ_m by the unchanged equations (38.82), γ_{±i} ≡ (γ_{2i−1} ± iγ_{2i})/√2. The chiral 
construction in part 5 of Exercise 38.3 yields unchanged real matrix representations of all spin basis vectors 
γ_{±i}. If in fact γ_{2i} (say) is timelike, then replacing γ_{2i} → −iγ_{2i} (after the construction is completed) means 
that γ_{±i} = (γ_{2i−1} ± γ_{2i})/√2 is really a sum of spacelike and timelike vectors, like the null vectors γ_v and γ_u 
in the Newman-Penrose formalism. 

A super spacetime algebra with both space and time dimensions differs from an algebra with only space (or 
only time) dimensions in that rotations in a time-space plane are non-compact, whereas rotations in a 
space-space (or time-time) plane are compact. Rotations in a time-space plane are called (Lorentz) boosts. For 
example, if one of γ_{2i−1} and γ_{2i} is timelike and the other spacelike, then a rotation by boost angle (rapidity) 
θ in the γ_{2i−1}-γ_{2i} plane transforms the i'th pair of spin basis vectors γ_{±i} as, in place of (38.83), 

    \gamma_{\pm i} \to e^{\pm\theta} \gamma_{\pm i} ,    (39.136)

and a basis spinor transforms as, in place of (38.86), 

    \epsilon_{\uparrow i} \to e^{\theta/2} \epsilon_{\uparrow i} , \quad \epsilon_{\downarrow i} \to e^{-\theta/2} \epsilon_{\downarrow i} .    (39.137)

The chiral representation (39.13) of Dirac γ-matrices is equivalent to the chiral construction in part 5 of 
Exercise 38.3 with the following rearrangement of indices: 

    \{ \gamma_1 , \gamma_2 , \gamma_3 , \gamma_0 \}_{\rm Dirac} = \{ \gamma_3 , \gamma_4 , \gamma_1 , i \gamma_2 \} .    (39.138)


9. Parity and time reversal. Parity reversal is the operation of reflecting an odd number of spatial axes. 
Time reversal is the operation of reflecting an odd number of time axes. A reflection of an even number 
of spatial axes can be accomplished by a continuous rotation in spatial dimensions, while a reflection of 
an even number of time axes can be accomplished by a continuous rotation in time dimensions. 

If the total number N = K+M of spacetime dimensions is even, then parity reversal may be accomplished 
by setting the parity operator P equal to one of the space dimensions if K is even, or to one of 
the time dimensions if K is odd, and transforming spinors and multivectors by 

    P : \psi \to P \psi , \quad a \to P a P^{-1} .    (39.139)

If desired, a phase factor can be inserted into P to ensure that P² = 1, but the choice of phase factor 
is not essential. Again, if the total number N = K+M of spacetime dimensions is even, then the 
combined operation of parity and time reversal may be accomplished by setting the PT operator equal 
to the pseudoscalar I_N, 

    PT : \psi \to I_N \psi , \quad a \to I_N a I_N^{-1} .    (39.140)

Time reversal is accomplished by the operator T = P(PT) = P I_N. 

As in part 9 of Exercise 38.3, if the total number N = K+M of spacetime dimensions is odd, then 
there is no element of the geometric algebra that accomplishes parity or time reversal by operations 
like (39.139) and (39.140). The implementation of parity and time reversal in odd N dimensions is 
described in the next part. 

Super spacetime algebra in odd dimensions, version 2. As described in parts 7 and 10 of Exercise 38.3, 
there are two ways to construct the super geometric algebra in odd N = K+M dimensions, 
the first being to project the algebra into one dimension lower, the second to embed the algebra in one 
dimension higher, and to treat either the final (odd) dimension γ_N or the extra (even) dimension γ_{N+1} 
as a scalar. The vectors γ_N or γ_{N+1} have the usual property of anticommuting with all orthonormal 
vectors γ_m other than themselves. If the number M of time dimensions is odd, then the scalar axis γ_N 
or γ_{N+1} serves as a time-reversal operator T, while if the number of time dimensions is even, then the 
scalar axis serves as a parity-reversal operator P. If the number of time dimensions is odd, a suitable 
parity operator is P = γ_a T, where γ_a is any spatial vector; while if the number of time dimensions is 
even, a suitable time-reversal operator is T = γ_t P, where γ_t is any time vector. 

Conjugation. Part 14 of Exercise 38.3 mostly carries through, but the condition that the Lorentz-invariant 
conjugation operator C commute with all real orthonormal bivectors, and anticommute with 
all imaginary orthonormal bivectors, translates into the condition that, in place of expression (38.148), 
C equals, modulo a normalization factor, the product of the spinor metric tensor ε (or the alternative 
spinor metric tensor ε_alt) with the product of all timelike orthonormal basis vectors, 

    C = \varepsilon \Gamma^\top , \quad \Gamma \equiv \prod ( -i \gamma_m ) \quad \mbox{(timelike)} .    (39.141)

The normalization of Γ is such that the eigenvalues of Γ are real, which ensures that ψ̄·ψ is real, 
equation (39.146). The square of Γ is one, Γ² = 1. The eigenvalues of Γ are ±1, and there are equal 
numbers of +1 and −1 eigenvalues, since the trace of Γ is zero. For example, if there is just one time 
dimension γ_0, as in the 4D spacetime algebra considered in this Chapter, then Γ is, equation (39.100), 

    \Gamma = -i \gamma_0 .    (39.142)

Notwithstanding equation (39.141), the conjugation operator C is defined to transform not as an element 
of the geometric algebra, but rather as a spinor tensor that is invariant under Lorentz transformations. 
Table 39.1: Symmetry of the conjugation operator C 

Conjugation flips all space-space and time-time bits of a spinor, while keeping all space-time bits 
unflipped. The chirality of a spinor is its sign under the chiral operator κ_N. For even K−M, conjugation 
flips the chirality of a spinor if (K−M)/2 is odd, and leaves the chirality unchanged if (K−M)/2 is 
even. For odd K−M, if the path proposed in part 7 is followed, where the odd-N algebra is projected 
into one lower dimension, which requires identifying κ_N with unity, then chirality is not a rotationally 
invariant property of spinors. If on the other hand the path proposed in part 10 is followed, where the 
odd-N algebra is embedded in one higher dimension, then chirality is the sign under κ_{N+1}. 


The double conjugate of a spinor is 

    \bar{\bar\psi} = C C^* \psi = \pm \psi ,    (39.143)

where the sign is + or − depending on whether the conjugation operator is symmetric, C = C^⊤, or 
antisymmetric (the symmetry condition C = C^⊤ is equivalent to C C^* = 1 in view of the unitarity of 
C, equation (39.86)). Table 39.1 shows the symmetry of conjugation operator C for the standard and 
alternative spinor metrics, including the tilde'd versions (38.92) for odd K−M. Table 39.1 is essentially 
identical to the earlier Table 38.1, except that the number N of spatial dimensions is changed to the 
difference K−M of numbers of space and time dimensions. For Dirac spinors in 3+1 dimensions, the 
conventional choice is the standard spinor metric (38.90), which ensures that the conjugation operator 
is symmetric, hence that the double conjugate of a spinor ψ is ψ itself. 


The scalar product of a conjugate spinor $\bar\psi$ with a spinor $\chi$ is (compare equation (39.99))

$$ \bar\psi \cdot \chi = \psi^\dagger C^\top \varepsilon \chi = \psi^\dagger \Gamma \chi , \qquad (39.144) $$

which is a complex (with respect to $i$) number. In particular, in a basis with respect to which $\Gamma$ is diagonal, the scalar product of a conjugate basis spinor $\bar{\boldsymbol\epsilon}_a$ with a basis spinor $\boldsymbol\epsilon_b$ is plus or minus the Kronecker delta,

$$ \bar{\boldsymbol\epsilon}_a \cdot \boldsymbol\epsilon_b = \pm \delta_{ab} , \qquad (39.145) $$

the sign being that of the eigenvalue of $\Gamma$. The scalar product of a spinor $\psi$ with its conjugate is

$$ \bar\psi \cdot \psi = \psi^\dagger \Gamma \psi , \qquad (39.146) $$

which is real given that the eigenvalues of $\Gamma$ are real. In zero time dimensions the scalar product of a spinor with its conjugate was always positive, equation (38.70), but with one or more time dimensions the scalar product of a spinor with its conjugate can be either positive or negative.

The scalar product $\bar\psi \cdot \Gamma \chi$ is

$$ \bar\psi \cdot \Gamma \chi = \psi^\dagger \chi . \qquad (39.147) $$

In particular, $\bar\psi \cdot \Gamma \psi$ is real and positive,

$$ \bar\psi \cdot \Gamma \psi = \psi^\dagger \psi . \qquad (39.148) $$


The conjugate $\bar\gamma_A$ of a basis multivector is defined by equation (38.163). The conjugate of an orthonormal basis vector $\gamma_m$ is, in place of equation (38.164),

$$ \bar\gamma_m = \pm (-)^M \gamma_m , \qquad (39.149) $$

where the $\pm$ sign is as given in Table 38.2. For the (3+1)-dimensional Dirac algebra, the $\pm$ sign is $-$, and $M = 1$, so $\bar\gamma_m = \gamma_m$, in agreement with equation (39.112).

Real subalgebra. As in part 15 of Exercise 38.3, a real subalgebra of the complex geometric algebra 
may be obtained by restricting to multivectors satisfying the reality condition that they are their own 
conjugates, 


ā=a. (39.150) 


Conjugates of orthonormal basis vectors are plus or minus themselves per equation (39.149). If the overall sign $\pm(-)^M$ in equation (39.149) is $+$, as it is for example in the (3+1)-dimensional Dirac algebra, then the real subalgebra consists of real linear combinations of orthonormal basis multivectors. If the sign $\pm(-)^M$ in equation (39.149) is $-$, then the real subalgebra consists of linear combinations of odd-grade orthonormal multivectors with pure imaginary coefficients and even-grade orthonormal multivectors with pure real coefficients.

Part 15 of Exercise 38.3 showed that a real super spacetime subalgebra could be obtained as the algebra of outer products of self-conjugate spinors,

$$ \bar\psi = \psi , \qquad (39.151) $$


which worked provided that the conjugation operator is symmetric. If there are time as well as space dimensions, then the algebra of outer products of self-conjugate spinors is real, satisfying condition (39.150), only if both the spinor metric $\varepsilon$ and the conjugation operator $C$ are symmetric, that is, the sign is $+$ in both Tables 38.1 and 39.1. This is not true for example in the (3+1)-dimensional Dirac algebra, where the spinor metric is antisymmetric. Suppose that $\bar\psi = \psi$ and $\bar\chi = \chi$ are self-conjugate spinors. The conjugate of their multivector outer product $a = \psi \chi\cdot$ satisfies

$$ \bar a = \overline{\psi \chi\cdot} = C (\psi \chi\cdot)^* C^{-1} = \pm (C \psi^*) \chi^\dagger C^\top \varepsilon = \pm (C \psi^*) (C \chi^*)\cdot = \pm \bar\psi \bar\chi\cdot = \pm \psi \chi\cdot = \pm a , \qquad (39.152) $$

the $\pm$ sign at the third step coming from commuting the conjugation operator through the spinor metric. Given that the conjugation operator must be symmetric for spinors to be self-conjugate, so $C = C^\top = \Gamma \varepsilon^\top$, it follows that

$$ C^\top \varepsilon C = \Gamma \varepsilon^\top \varepsilon \Gamma \varepsilon^\top = \Gamma^2 \varepsilon^\top = \varepsilon^\top = \pm \varepsilon , \qquad (39.153) $$

the $\pm$ sign being the symmetry of the spinor metric. The sign is positive, yielding a real geometric algebra satisfying equation (39.150), only if the spinor metric is symmetric.
Transformations that leave the spinor scalar product unchanged. The first half of part 16 of 
Exercise 38.3 carries through. The list (38.171) of grades of multivectors that generate transformations 
that preserve the spinor scalar product remains unchanged. 

But the condition for the scalar product of spinors and conjugate spinors to be preserved under a transformation $R = e^{-\theta \gamma_A/2}$ generated by a grade-$p$ multivector $\gamma_A$ is modified. The commutation rule (38.173) is modified to

$$ C R^* = C e^{-\theta^* \gamma_A^*/2} = e^{-\theta^* \bar\gamma_A/2} C = e^{-(\pm)^p (-)^{Mp} \theta^* \gamma_A/2} C , \qquad (39.154) $$

where the $\pm$ sign in $(\pm)^p$, from equation (39.149), is as given in Table 38.2. A scalar product $\bar\psi \cdot \chi$ of a conjugate spinor with a spinor transforms under $R$ to, in place of equation (38.174),

$$ (C (R\psi)^*) \cdot (R\chi) = ( e^{-(\pm)^p (-)^{Mp} \theta^* \gamma_A/2} \bar\psi ) \cdot R\chi = \bar\psi \cdot e^{-(-)^{[p/2]} (-)^{Mp} \theta^* \gamma_A/2} R\chi , \qquad (39.155) $$

where the sign $(-)^{[p/2]} (-)^{Mp}$ in the final expression is the product of $(\pm)^p (-)^{Mp}$ and the sign $(\pm)^p (-)^{[p/2]}$ in the commutation rule (38.100) of a multivector $\gamma_A$ through the spinor metric $\varepsilon$. The spinor product is preserved provided that $e^{-(-)^{[p/2]} (-)^{Mp} \theta^* \gamma_A/2} = R^{-1}$, which is to say provided that

$$ - (-)^{[p/2]} (-)^{Mp} \theta^* = \theta . \qquad (39.156) $$


If the number of time dimensions is $M = 1$, or more generally if the number $M$ of time dimensions is odd, then the scalar product of spinors and conjugate spinors is preserved under transformations generated by multivectors of grade $p$ provided that the coefficient $\theta$ satisfies

$$ \theta \ \text{real} \quad \text{grades (1 or 2) mod 4 (thus 1, 2, 5, 6, ...)} , \qquad (39.157a) $$
$$ \theta \ \text{imaginary} \quad \text{grades (0 or 3) mod 4 (thus 0, 3, 4, 7, ...)} . \qquad (39.157b) $$

If the number $M$ of time dimensions is even, then $(-)^{Mp} = 1$, and the earlier condition (38.176) holds.
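The reality condition on the coefficient can be tabulated mechanically. The following is a minimal sketch, assuming the condition takes the form $-(-)^{[p/2]}(-)^{Mp}\theta^* = \theta$ of equation (39.156); the function name `theta_condition` is hypothetical, introduced only for illustration.

```python
def theta_condition(p, M):
    """Return 'real' or 'imaginary' for a generator of grade p, given M time dimensions.

    Evaluates the overall sign s in  s * conj(theta) = theta,  with
    s = -(-1)^[p/2] * (-1)^(M p)  as in equation (39.156).
    s = +1 means theta is real; s = -1 means theta is pure imaginary.
    """
    sign = -((-1) ** (p // 2)) * ((-1) ** (M * p))
    return "real" if sign == 1 else "imaginary"

# One time dimension (M odd): grades 0..7
one_time = [theta_condition(p, 1) for p in range(8)]
print(one_time)
```

For $M = 1$ the list alternates in the pattern imaginary, real, real, imaginary, repeating with period 4, in agreement with conditions (39.157).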
Rotor group. The rotor group is generated by the basis of orthonormal bivectors. Bivectors that are the wedge product of a timelike vector and a spacelike vector are multiplied by $i$, so that rotations in a time-space plane take the exponential form $e^{\theta/2}$ rather than being rotations by a phase, $e^{-i\theta/2}$. The orthonormal basis bivectors remain traceless and unitary, but whereas time-time and space-space bivectors remain skew-Hermitian, the time-space bivectors become Hermitian. The rotor group in $K$ spatial dimensions and $M$ time dimensions is called Spin($K, M$). The construction (38.109) described in Exercise 38.3 embeds Spin($K, M$) as a subgroup of the group SL($2^{[N/2]}, \mathbb{C}$), where $N = K+M$, of complex $2^{[N/2]} \times 2^{[N/2]}$ matrices of unit determinant,

$$ {\rm Spin}(K, M) \subset {\rm SL}(2^{[N/2]}, \mathbb{C}) , \quad N \equiv K + M . \qquad (39.158) $$

The group is not unitary, since bivector generators that are wedge products of a timelike vector and a
spacelike vector are Hermitian, whereas unitarity requires all generators to be skew-Hermitian (compare 
Exercise 14.17). Switching time and space dimensions leaves the group unchanged, so Spin(K, M) is 
isomorphic to Spin(M, K). 

Grade-preserving subgroup of Spin(K, M). As in part 18 of Exercise 38.3, there exists a subgroup of 
Spin(K, M) that preserves the grade (number of up bits) of the spinor. The construction (38.180) runs 
into an obstacle because mixed space-time bivectors cannot be combined in real linear combinations with 
space-space or time-time bivectors to form bivectors of zero spin (complex linear combinations yes, but 
not real linear combinations). The best that can be done is to minimize the number of mixed space-time 
bivectors, by grouping spatial dimensions into pairs, and time dimensions into pairs, leaving at most 
one pair of dimensions a mixed combination of a space and a time dimension. The mixed pair is needed 
only if both space and time dimensions $K$ and $M$ are odd. The construction (38.180) then yields $[K/2]^2$ skew-Hermitian space-space generators, $[M/2]^2$ skew-Hermitian time-time generators, and $2[K/2][M/2]$ Hermitian space-time generators. If there is a mixed space-time pair of dimensions, then there is 1 extra Hermitian space-time generator. Altogether the grade-preserving subgroup of Spin($K, M$) has dimension $[(K+M)/2]^2$ if at most one of $K$ or $M$ is odd, or $([K/2] + [M/2])^2 + 1$ if $K$ and $M$ are both odd. The largest unitary subgroup of Spin($K, M$) is the direct product U($[K/2]$) $\times$ U($[M/2]$) of unitary groups generated by the $[K/2]^2$ skew-Hermitian space-space generators and the $[M/2]^2$ skew-Hermitian time-time generators.
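The generator counts quoted above can be summed and compared against the closed-form dimension. The following is a small sketch; the function names are hypothetical, and $[x]$ is taken to mean the integer part, implemented here with integer division.

```python
def grade_preserving_dimension(K, M):
    """Sum the generator counts of the grade-preserving subgroup of Spin(K, M)."""
    k, m = K // 2, M // 2                 # [K/2] and [M/2]
    space_space = k * k                   # skew-Hermitian
    time_time = m * m                     # skew-Hermitian
    space_time = 2 * k * m                # Hermitian
    # One extra Hermitian generator from the mixed pair, needed only if K and M are both odd.
    mixed_pair = 1 if (K % 2 == 1 and M % 2 == 1) else 0
    return space_space + time_time + space_time + mixed_pair

def closed_form(K, M):
    """Closed-form dimension quoted in the text."""
    if K % 2 == 1 and M % 2 == 1:
        return (K // 2 + M // 2) ** 2 + 1
    return ((K + M) // 2) ** 2

print(grade_preserving_dimension(3, 1), closed_form(3, 1))
```

The two functions agree because $[K/2] + [M/2] = [(K+M)/2]$ whenever at most one of $K$ and $M$ is odd.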
Geometric Differentiation and Integration of Spinors


40.1 Covariant derivative of a spinor 


A Lorentz transformation of a Dirac spinor $\psi$ by rotor $R$ transforms the spinor by $\psi \to R\psi$. An infinitesimal Lorentz transformation $R = 1 + \epsilon\Gamma/2$ generated by a bivector $\Gamma$ transforms $\psi \to \psi + \frac{1}{2}\epsilon\Gamma\psi$. Consequently the action of the connection operator $\hat\Gamma_n$ on a spinor $\psi$ is

$$ \hat\Gamma_n \psi = \tfrac{1}{2} \Gamma_n \psi , \qquad (40.1) $$

where $\Gamma_n$ is the $N$-tuple of bivectors (15.9). The covariant derivative of a spinor $\psi$ is thus

$$ D_n \psi = \partial_n \psi + \tfrac{1}{2} \Gamma_n \psi . \qquad (40.2) $$

In equation (40.2), as previously in equations (15.6) and (15.15), for a spinor $\psi = \psi^a \boldsymbol\epsilon_a$, the directed derivative $\partial_n$ is to be interpreted as acting only on the components $\psi^a$ of the spinor, $\partial_n \psi = \boldsymbol\epsilon_a \partial_n \psi^a$. In the convention (39.75) that multivectors acting to the right of a column spinor yield zero, the connection term in equation (40.2) can be written as a commutator, in the same form as (15.15),

$$ D_n \psi = \partial_n \psi + \tfrac{1}{2} [\Gamma_n, \psi] . \qquad (40.3) $$

Acting on a spinor $\psi$, the Riemann curvature operator $\hat R_{kl}$, equation (15.21), yields another spinor,

$$ \hat R_{kl} \psi = \tfrac{1}{2} R_{kl} \psi . \qquad (40.4) $$

Again in the convention (39.75) that multivectors acting to the right of a column spinor yield zero, equation (40.4) can be written in the same form as equation (15.23),

$$ \hat R_{kl} \psi = \tfrac{1}{2} [R_{kl}, \psi] . \qquad (40.5) $$

40.1.1 Covariant derivative of a row spinor 


A row Dirac spinor $\psi\cdot$ Lorentz transforms as $\psi\cdot \to \psi\cdot\bar R$, so an infinitesimal Lorentz transformation $\bar R = 1 - \epsilon\Gamma/2$ generated by a bivector $\Gamma$ transforms $\psi\cdot \to \psi\cdot - \frac{1}{2}\epsilon\,\psi\cdot\Gamma$. Consequently the action of the connection operator $\hat\Gamma_n$ on a row spinor $\psi\cdot$ is

$$ \hat\Gamma_n \psi\cdot = - \tfrac{1}{2} \psi\cdot\Gamma_n , \qquad (40.6) $$

and the covariant derivative of a row spinor $\psi\cdot$ is then

$$ D_n \psi\cdot = \partial_n \psi\cdot - \tfrac{1}{2} \psi\cdot\Gamma_n . \qquad (40.7) $$

Again in the convention (39.75) that multivectors acting to the left of a row spinor yield zero, the connection term in equation (40.7) can be written as a commutator, in the same form as (15.15),

$$ D_n \psi\cdot = \partial_n \psi\cdot + \tfrac{1}{2} [\Gamma_n, \psi\cdot] . \qquad (40.8) $$

Acting on a row spinor $\psi\cdot$, the Riemann curvature operator $\hat R_{kl}$, equation (15.21), yields another row spinor,

$$ \hat R_{kl} \psi\cdot = - \tfrac{1}{2} \psi\cdot R_{kl} . \qquad (40.9) $$

Again in the convention (39.75) that multivectors acting to the left of a row spinor yield zero, equation (40.9) can be written in the same form as equation (15.23),

$$ \hat R_{kl} \psi\cdot = \tfrac{1}{2} [R_{kl}, \psi\cdot] . \qquad (40.10) $$

Equations (15.15), (40.3), and (40.8) show that if $a$ is any element of the super geometric algebra, either a multivector or a column or row spinor, or a true scalar, its covariant derivative $D_n a$ can be written in the same form

$$ D_n a = \partial_n a + \tfrac{1}{2} [\Gamma_n, a] . \qquad (40.11) $$

Likewise the action of the Riemann curvature operator $\hat R_{kl}$, equation (15.21), on any element $a$ of the super geometric algebra takes the same form

$$ \hat R_{kl} a = \tfrac{1}{2} [R_{kl}, a] . \qquad (40.12) $$

40.2 Covariant derivative in a spinor basis 


The covariant derivative $D_n$ can also be expressed in a spinor basis.
The spinor tetrad connections $\Gamma^b_{an}$ are defined, analogously to the definition (11.37) of the tetrad connections $\Gamma^k_{mn}$, to be the coefficients of the change of the spinor axes $\boldsymbol\epsilon_a$ parallel-transported along the direction $\gamma_n$,

$$ \Gamma^b_{an} \boldsymbol\epsilon_b \equiv \partial_n \boldsymbol\epsilon_a . \qquad (40.13) $$

The same equation (40.13) with a trailing dot appended on both sides holds for row spinors. The constancy of the spinor metric,

$$ 0 = \partial_n \varepsilon_{ab} = \partial_n ( \boldsymbol\epsilon_a \cdot \boldsymbol\epsilon_b ) = \Gamma^c_{bn} \boldsymbol\epsilon_a \cdot \boldsymbol\epsilon_c + \Gamma^c_{an} \boldsymbol\epsilon_c \cdot \boldsymbol\epsilon_b = \Gamma^c_{bn} \varepsilon_{ac} + \Gamma^c_{an} \varepsilon_{cb} = \Gamma_{abn} - \Gamma_{ban} , \qquad (40.14) $$

along with the antisymmetry of the spinor metric, implies that the spinor tetrad connection $\Gamma_{abn}$ is symmetric in its first two indices,

$$ \Gamma_{abn} = \Gamma_{ban} . \qquad (40.15) $$

The symmetry of the spinor tetrad connection is analogous to the antisymmetry of the tetrad connection, equation (11.47). The preservation of chirality under parallel transport implies that the spinor connections $\Gamma_{abn}$ are non-vanishing only when $a$ and $b$ have the same chirality,

$$ \Gamma_{abn} = 0 \quad \text{for } a, b \text{ of opposite chirality} . \qquad (40.16) $$

In the 4D super spacetime algebra, the non-vanishing spinor connection coefficients comprise 12 right-handed spinor connections, and 12 left-handed spinor connections. The 24 spinor connection coefficients $\Gamma_{abn}$ are related to the 24 tetrad connection coefficients $\Gamma_{kmn}$ by


Tabn = EPT kmn ; (40.17) 


where y*?" is the matrix defined by equation (39.68) with km running over the 6 bivector indices, and km 
are implicitly summed over distinct bivector indices. In the chiral representation, the matrix coefficients are 
given by equations (39.62) and (39.63). 

The connection $\Gamma_n$ defined by equation (15.9) is, in terms of the spinor basis,


In = Tane., (40.18) 


implicitly summed over distinct symmetric self-chiral pairs $ab$ of spinor indices. Expressions (15.15), (40.3), and (40.8) for the covariant derivatives of multivectors and spinors remain valid with the connection $\Gamma_n$ given by equation (40.18).


40.3 Covariant spacetime derivative of a spinor 


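Acting on a Dirac column spinor $\psi$, the covariant spacetime derivative $D \equiv \gamma^n D_n$ yields another Dirac spinor

$$ D\psi \quad \text{column spinor} . \qquad (40.19) $$

This derivative is a fundamental ingredient in the Lagrangian for a Dirac field, and in the resulting Dirac equations of motion.

The covariant spacetime derivative of a row spinor $\psi\cdot$ is defined to equal the row spinor corresponding to the covariant spacetime derivative of the column spinor $\psi$, that is, $D\psi\cdot \equiv (D\psi)\cdot$. The following manipulation shows that the spacetime derivative of the row spinor is minus the spacetime derivative acting on the row spinor to the left:

$$ D\psi\cdot \equiv (D\psi)\cdot = (D\psi)^\top \varepsilon = \psi^\top D^\top \varepsilon = - \psi^\top \varepsilon D = - \psi\cdot D . \qquad (40.20) $$

The penultimate step is true because $(\gamma^n)^\top \varepsilon = - \varepsilon \gamma^n$, equation (39.38).

40.4 Gauss’ theorem for spinors


In practical applications to spinor Lagrangians, Gauss’ theorem occurs in the form

$$ \int ( \chi \cdot D\psi + \psi \cdot D\chi ) \, d^4x = \int \mathring D_n ( \chi \cdot \gamma^n \psi ) \, d^4x = \oint \chi \cdot \gamma_n \psi \, d^3x^n , \qquad (40.21) $$

where $\psi$ and $\chi$ are spinors, $D$ is the torsion-full covariant spacetime derivative, and $\mathring D_n$ is the torsion-free covariant derivative.
Equation (40.21) is proved as follows:

$$ \begin{aligned}
\chi \cdot D\psi + \psi \cdot D\chi &= \chi \cdot D\psi - (D\chi) \cdot \psi \\
&= \chi \cdot \gamma^n D_n \psi - ( \gamma^n D_n \chi ) \cdot \psi \\
&= \chi \cdot \gamma^n D_n \psi + ( D_n \chi ) \cdot \gamma^n \psi \\
&= \chi \cdot \gamma^n ( \mathring D_n + \tfrac{1}{2} K_n ) \psi + \bigl( ( \mathring D_n + \tfrac{1}{2} K_n ) \chi \bigr) \cdot \gamma^n \psi \\
&= \chi \cdot \gamma^n \mathring D_n \psi + ( \mathring D_n \chi ) \cdot \gamma^n \psi + \tfrac{1}{2} \chi \cdot \gamma^n K_n \psi - \tfrac{1}{2} \chi \cdot K_n \gamma^n \psi \\
&= \mathring D_n ( \chi \cdot \gamma^n \psi ) - \chi \cdot ( \mathring D_n \gamma^n + \tfrac{1}{2} [ K_n , \gamma^n ] ) \psi \\
&= \mathring D_n ( \chi \cdot \gamma^n \psi ) ,
\end{aligned} \qquad (40.22) $$

where $K_n \equiv \frac{1}{2} K_{kln} \gamma^k \wedge \gamma^l$ is the contortion, equation (15.47). The sign flip on the first line comes from the anticommutation of Dirac spinors, equation (39.119). The sign flip is cancelled on the third line from commuting the basis vectors $\gamma^n$ through the spinor metric, equation (39.38). The last term on the penultimate line of equation (40.22) vanishes because the torsion-full covariant derivative of the basis vectors $\gamma^n$ vanishes, $\mathring D_n \gamma^n + \frac{1}{2} [ K_n , \gamma^n ] = D_n \gamma^n = 0$.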


As expounded in Chapter 15, $d^4x$ denotes the invariant scalar 4-volume, equation (15.102), not the pseudoscalar 4-volume. The units are $c = \hbar = 1$.

The relation between energy-momenta $p_n$ and spacetime derivatives $\partial_n$ adopted here is the standard quantum mechanics convention,

$$ p_n = - i \hbar \partial_n . \qquad (41.1) $$


Beware that this convention is opposite to the standard cosmological convention, §26.8.2, adopted in Chap- 
ters 26-37. 


41.1 Dirac spinor field 


41.1.1 Dirac Lagrangian 


The general relativistic scalar Lagrangian $L$ of a free Dirac spinor field $\psi$ of mass $m$ is

$$ L = \bar\psi \cdot ( D + m ) \psi . \qquad (41.2) $$

Here $D \equiv \gamma^n D_n$ is the (torsion-full, in general¹) covariant spacetime derivative, equation (15.31), and $\bar\psi$ is the conjugate field defined by equation (39.91). In flat (Minkowski) space the justification for the Dirac Lagrangian (41.2) is that it leads to equations whose predictions are amply confirmed by experiment. Equation (41.2) is the covariant generalization of the flat space Lagrangian of a Dirac field. If units are restored, then the mass is $m/(\hbar c)$. The spinor field has units of length$^{-3/2}$.

As it stands, the Lagrangian (41.2) is strangely asymmetric in the fields, as it depends only on the velocity $D\psi$ of the field, not on the velocity $D\bar\psi$ of the conjugate field. Moreover the Lagrangian (41.2) is complex, not real. Symmetry and reality can be restored by symmetrizing the Lagrangian (41.2) with its complex conjugate. The covariant spacetime derivative $D = \gamma_n D^n$ has real coefficients $D^n$ in an orthonormal basis $\gamma_n$. For any multivector $a$ whose coefficients are real in an orthonormal basis, the complex conjugate of $\bar\psi \cdot a\psi$ is, equation (39.123),

$$ ( \bar\psi \cdot a \psi )^* = - \psi \cdot a \bar\psi . \qquad (41.3) $$

The symmetrized, real Lagrangian is thus

$$ L = \tfrac{1}{2} \bar\psi \cdot ( D + m ) \psi - \tfrac{1}{2} \psi \cdot ( D + m ) \bar\psi . \qquad (41.4) $$

Despite being asymmetric and complex, the original Dirac Lagrangian (41.2) does yield the correct Dirac equations because the imaginary part of the Lagrangian integrates to a surface term, by Gauss’ theorem (40.21),

$$ \int {\rm Im} ( \bar\psi \cdot D\psi ) \, d^4x = \frac{1}{2i} \int ( \bar\psi \cdot D\psi + \psi \cdot D\bar\psi ) \, d^4x = \frac{1}{2i} \oint \bar\psi \cdot \gamma_n \psi \, d^3x^n , \qquad (41.5) $$

and therefore has no effect on the equations of motion.

¹ Gauge fields such as electromagnetism are necessarily defined in terms of torsion-free derivatives, §16.5, but spinor fields contribute to, and experience, torsion, Exercises 16.5 and 16.7.
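The original complex Dirac Lagrangian (41.2) is in (super-)Hamiltonian form $p \cdot Dq - H$ with coordinates $q = \psi$, momenta $p = \bar\psi$, and (super-)Hamiltonian

$$ H = - m \bar\psi \cdot \psi . \qquad (41.6) $$

Varying the action with complex Dirac Lagrangian (41.2) with respect to the field $\psi$ and its conjugate momentum $\bar\psi$ yields, with the help of Gauss’ theorem (40.21) to integrate $\delta(D\psi) = D(\delta\psi)$ by parts,

$$ \delta S = \oint \bar\psi \cdot \gamma_n \delta\psi \, d^3x^n + \int \left[ \delta\bar\psi \cdot ( D\psi + m\psi ) + ( D\bar\psi + m\bar\psi ) \cdot \delta\psi \right] d^4x . \qquad (41.7) $$

The resulting Hamilton equations of motion are the Dirac equations

$$ ( D + m ) \psi = 0 , \qquad (41.8a) $$
$$ ( D + m ) \bar\psi = 0 . \qquad (41.8b) $$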


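In flat (Minkowski) space, the solutions of the free Dirac equations (41.8) are plane waves. The solutions are most straightforward to obtain in the rest frame, where the spinor $\psi$ is one of the Dirac basis spinors $\psi_\Uparrow$ or $\psi_\Downarrow$ (with spin either up $\uparrow$ or down $\downarrow$), equations (14.108), and the covariant derivative reduces to the time derivative $D \to \gamma^0 \partial_0$. The conjugate of a spinor is $\bar\psi \equiv C\psi^*$, equation (39.91), and the expression (39.83) for $C$ says that conjugation flips the $\psi_\Uparrow$ and $\psi_\Downarrow$ states. The Dirac equations (41.8) in the rest frame become

$$ ( - i \partial_0 + m ) \psi_\Uparrow = 0 , \quad ( i \partial_0 + m ) \psi_\Downarrow = 0 , \qquad (41.9a) $$
$$ ( - i \partial_0 + m ) \psi_\Downarrow^* = 0 , \quad ( i \partial_0 + m ) \psi_\Uparrow^* = 0 , \qquad (41.9b) $$

whose solutions are

$$ \psi_\Uparrow \propto e^{-imt} , \quad \psi_\Downarrow \propto e^{imt} , \qquad (41.10a) $$
$$ \psi_\Downarrow^* \propto e^{-imt} , \quad \psi_\Uparrow^* \propto e^{imt} . \qquad (41.10b) $$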
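The rest-frame solutions can be checked numerically by substituting the trial exponentials into the first-order equations. The following is a minimal sketch, assuming the rest-frame equations take the form $(\mp i\partial_0 + m)\psi = 0$ with solutions $\psi \propto e^{\mp imt}$; the function name `residual`, the step size `h`, and the sample values of $m$ and $t$ are arbitrary choices made for illustration.

```python
import cmath

def residual(sign, m, t, h=1e-6):
    """Residual of (sign * i d/dt + m) psi for the trial solution psi(t) = exp(sign * i m t).

    sign = -1 corresponds to (-i d/dt + m) psi = 0 with psi ~ exp(-i m t);
    sign = +1 corresponds to (+i d/dt + m) psi = 0 with psi ~ exp(+i m t).
    The time derivative is approximated by a central finite difference.
    """
    psi = lambda s: cmath.exp(sign * 1j * m * s)
    dpsi = (psi(t + h) - psi(t - h)) / (2 * h)
    return sign * 1j * dpsi + m * psi(t)

up_residual = residual(-1, 2.0, 0.7)     # exp(-i m t) in (-i d/dt + m)
down_residual = residual(+1, 2.0, 0.7)   # exp(+i m t) in (+i d/dt + m)
print(abs(up_residual), abs(down_residual))
```

Both residuals vanish to within finite-difference error, whereas inserting $e^{+imt}$ into $(-i\partial_0 + m)$ would leave a residual of magnitude $2m$.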


While the solution for $\psi_\Uparrow$ has positive mass $m$, the solution for $\psi_\Downarrow$ appears to have negative mass $-m$. This is Dirac’s celebrated problem of negative mass states (Bjorken and Drell, 1964). On the other hand, the complex conjugate $\psi_\Downarrow^*$ of the negative mass state $\psi_\Downarrow$ has positive mass.

Figure 41.1 Feynman diagram illustrating the Stueckelberg-Feynman interpretation of antiparticles as negative mass particles moving backwards in time (Stueckelberg, 1941; Feynman, 1949). The diagram shows an electron $e$ and positron $\bar e$ annihilating into two photons (conservation of energy-momentum prohibits annihilation into one photon). The arrows represent the direction of charge.

The idea that the negative mass states are antiparticles dates to Stueckelberg (1941), who proposed that 
an antiparticle is a negative mass particle moving backwards in time, as illustrated in Figure 41.1. 

The problem of negative mass states ultimately finds its solution in quantum field theory, Chapter ??, 
which allows particles to be created and destroyed. FIX: CHECK Positive-energy solutions are associated 
with operators that destroy particles, while negative-energy solutions are associated with operators that 
create particles. 


41.1.2 Dirac (super-)Hamiltonian 


Although the Dirac Lagrangian (41.2) yields the correct equations of motion (41.8) (and the symmetrized Lagrangian (41.4) yields the same equations), it is not altogether satisfactory. The problem is that the Lagrangians (41.2) or (41.4) assume a priori that the momentum conjugate to $\psi$ is its conjugate $\bar\psi$. In a “correct” Hamiltonian approach, the coordinates and momenta are independent fields, and any relation between them should emerge as an equation of motion.

The solution to the problem is to introduce a momentum $\pi$ conjugate to the field $\psi$, with no a priori relation between $\pi$ and $\bar\psi$, and to treat the fields $\psi$ and $\pi$ and their conjugates $\bar\psi$ and $\bar\pi$ as 4 independent fields. In terms of the 4 fields, the Dirac Lagrangian, symmetrized with its complex conjugate so as to make it real, is

$$ L = \tfrac{1}{2} \pi \cdot D\psi - \tfrac{1}{2} \bar\pi \cdot D\bar\psi - H , \qquad (41.11) $$

with a (super-)Hamiltonian $H$ that resembles the Hamiltonian of a simple harmonic oscillator,

$$ H = - \tfrac{1}{2} m ( \pi \cdot \bar\pi + \bar\psi \cdot \psi ) . \qquad (41.12) $$

The momentum conjugate to $\psi$ is $\frac{1}{2}\pi$, while the momentum conjugate to $\bar\psi$ is $-\frac{1}{2}\bar\pi$. The Dirac Hamiltonian (41.12) is consistent with, though does not follow uniquely from, the original Hamiltonian (41.6). The justification for the Hamiltonian (41.12) is that it yields the correct Dirac equations of motion, along with $\pi = \bar\psi$ and $\bar\pi = \psi$ as constraint equations, equations (41.14).

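Varying the Dirac action with Lagrangian (41.11) with respect to the coordinates $\psi$ and $\bar\psi$ and their conjugate momenta $\pi$ and $-\bar\pi$ yields

$$ \begin{aligned}
\delta S = {} & \tfrac{1}{2} \oint ( \pi \cdot \gamma_n \delta\psi - \bar\pi \cdot \gamma_n \delta\bar\psi ) \, d^3x^n \\
& + \tfrac{1}{2} \int \left[ \delta\pi \cdot ( D\psi + m\bar\pi ) - \delta\bar\pi \cdot ( D\bar\psi + m\pi ) + ( D\pi + m\bar\psi ) \cdot \delta\psi - ( D\bar\pi + m\psi ) \cdot \delta\bar\psi \right] d^4x .
\end{aligned} \qquad (41.13) $$

The resulting Hamilton’s equations can be written

$$ ( D + m ) ( \bar\pi + \psi ) = 0 , \quad ( D + m ) ( \pi + \bar\psi ) = 0 , \qquad (41.14a) $$
$$ ( D - m ) ( \bar\pi - \psi ) = 0 , \quad ( D - m ) ( \pi - \bar\psi ) = 0 . \qquad (41.14b) $$

Hamilton’s equations (41.14) appear to describe solutions with both signs of mass $m$. If the standard choices $\pi = \bar\psi$ and $\bar\pi = \psi$ are imposed initially, then the $-m$ Dirac equations (41.14b) ensure that $\pi = \bar\psi$ and $\bar\pi = \psi$ thereafter. The $+m$ Dirac equations (41.14a) then reproduce the usual Dirac equations (41.8). The conditions $\pi = \bar\psi$ and $\bar\pi = \psi$ thus emerge as constraint equations. The original Hamiltonian (41.6) can be interpreted as an effective Hamiltonian, valid after the solution $\pi = \bar\psi$ and $\bar\pi = \psi$ to the equation of motion is imposed.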


41.1.3 Conserved Dirac number current 


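The Dirac Lagrangians (41.2), (41.4), or (41.11) are unchanged if the field and its conjugate are changed by opposing complex phases, $\psi \to e^{-i\epsilon}\psi$ and $\bar\psi \to e^{i\epsilon}\bar\psi$, and likewise the conjugate momenta are changed as $\pi \to e^{i\epsilon}\pi$ and $\bar\pi \to e^{-i\epsilon}\bar\pi$. In infinitesimal form, this transformation is

$$ \psi \to \psi - i\epsilon\psi , \quad \bar\psi \to \bar\psi + i\epsilon\bar\psi , \quad \pi \to \pi + i\epsilon\pi , \quad \bar\pi \to \bar\pi - i\epsilon\bar\pi . \qquad (41.15) $$

The corresponding conserved Noether current, equation (16.17), is

$$ n^m = \tfrac{1}{2} i ( \pi \cdot \gamma^m \psi + \bar\pi \cdot \gamma^m \bar\psi ) . \qquad (41.16) $$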


The relative sign of the two terms on the right hand side of equation (41.16) is positive because the fields 
vary with opposite sign under the transformation (41.15), 5a = —dw. Imposing the positive mass conditions 
$\pi = \bar\psi$ and $\bar\pi = \psi$ brings the Noether current to

$$ n^m = \tfrac{1}{2} i ( \bar\psi \cdot \gamma^m \psi + \psi \cdot \gamma^m \bar\psi ) . \qquad (41.17) $$

The two terms on the right hand side of equation (41.17) are the same since

$$ \psi \cdot \gamma^m \bar\psi = \bar\psi \cdot \gamma^m \psi , \qquad (41.18) $$

so equation (41.17) simplifies to

$$ n^m = i \bar\psi \cdot \gamma^m \psi . \qquad (41.19) $$

The Dirac current (41.19) is covariantly conserved in accordance with Noether’s theorem, equation (16.18),

$$ \mathring D_m n^m = 0 . \qquad (41.20) $$

The factor $i$ in the Dirac current (41.19) is introduced so that the time component $n^0$ is a positive number,

$$ n^0 = \psi^\dagger \psi , \qquad (41.21) $$

where, in accordance with equation (39.99), $\psi^\dagger \equiv \psi^{*\top}$ is the Hermitian conjugate of $\psi$. The Dirac current $n^m$ is interpreted as a conserved probability number current with a positive density $n^0$.
If the current (41.19) is written

$$ n \equiv \gamma_m n^m = i \gamma_m \, \bar\psi \cdot \gamma^m \psi , \qquad (41.22) $$

then the probability conservation equation (41.20) is

$$ \mathring D \cdot n = 0 . \qquad (41.23) $$


If the Dirac spinor is null, 7). = w- Iy = 0, then the free Dirac equations preserve chirality. In this case 
the right- and left-handed components of the current n are separately conserved. It follows that, for a free 
null spinor in the absence of interactions, the pseudovector current 


ng Sid-ysyy (41.24) 


is also conserved. 


41.2 Dirac field with electromagnetism 


Electromagnetism emerges from the hypothesis that the Lagrangian is invariant under a symmetry that rotates the Dirac field $\psi$ by a complex phase proportional to the electric charge $e$ of the field. This kind of transformation is called a gauge transformation. The three forces of the Standard Model, §42.1, the electromagnetic, weak, and strong forces, all emerge from gauge transformations. Electromagnetism is the simplest gauge field, based on the 1-dimensional unitary group $U_{\rm em}(1)$ of rotations about a circle.

Under an electromagnetic gauge transformation, a Dirac field $\psi$ of charge $e$, and its conjugate field $\bar\psi$, which is proportional to the complex conjugate of the field, equation (39.91), and likewise their conjugate momenta $\pi$ and $\bar\pi$, transform as

$$ \psi \to e^{-ie\theta} \psi , \quad \bar\psi \to e^{ie\theta} \bar\psi , \quad \pi \to e^{ie\theta} \pi , \quad \bar\pi \to e^{-ie\theta} \bar\pi , \qquad (41.25) $$

where the phase $\theta(x)$ is some arbitrary function of spacetime. The charge $e$ is dimensionless, and the charge $-e$ of the conjugate field $\bar\psi$ must be minus that of the field. To ensure that the Dirac Lagrangian (41.4) remains invariant under the gauge transformation, the derivative $D$ must be replaced by a gauge-covariant derivative $D + ieA$ which, when acting on the field and its conjugate, transforms under the gauge transformation (41.25) as

$$ ( D + ieA ) \psi \to e^{-ie\theta} ( D + ieA ) \psi , \quad ( D - ieA ) \bar\psi \to e^{ie\theta} ( D - ieA ) \bar\psi . \qquad (41.26) $$

The conjugate momenta $\pi$ and $\bar\pi$ transform respectively as $\bar\psi$ and $\psi$. The gauge-covariant derivative transforms correctly provided that the gauge field $A$ transforms under the electromagnetic gauge transformation (41.25) as

$$ A \to A + D\theta . \qquad (41.27) $$

The gauge field $A$ is the electromagnetic potential.

The general relativistic scalar Lagrangian $L$ of a Dirac spinor field $\psi$ of mass $m$ and charge $e$ is obtained from the uncharged Dirac Lagrangian (41.2) by changing the (torsion-full, in general) covariant derivative $D$ to the gauge covariant derivative $D + ieA$,

$$ L = \bar\psi \cdot ( D + ieA + m ) \psi . \qquad (41.28) $$

Symmetrized with its complex conjugate, the charged Dirac Lagrangian (41.28) is

$$ L = \tfrac{1}{2} \bar\psi \cdot ( D + ieA + m ) \psi - \tfrac{1}{2} \psi \cdot ( D - ieA + m ) \bar\psi . \qquad (41.29) $$

If the momentum $\pi$ conjugate to $\psi$ is treated as a distinct field as in §41.1.2, then the charged Dirac Lagrangian is

$$ L = \tfrac{1}{2} \pi \cdot ( D + ieA ) \psi - \tfrac{1}{2} \bar\pi \cdot ( D - ieA ) \bar\psi + \tfrac{1}{2} m ( - \bar\pi \cdot \pi + \bar\psi \cdot \psi ) . \qquad (41.30) $$

Varying the action with Lagrangian (41.30) yields Hamilton’s equations for a charged Dirac field,

$$ ( D + ieA + m ) ( \bar\pi + \psi ) = 0 , \quad ( D - ieA + m ) ( \pi + \bar\psi ) = 0 , \qquad (41.31a) $$
$$ ( D + ieA - m ) ( \bar\pi - \psi ) = 0 , \quad ( D - ieA - m ) ( \pi - \bar\psi ) = 0 , \qquad (41.31b) $$

generalizing the earlier uncharged equations (41.14). Once again, the $+m$ conditions $\pi = \bar\psi$ and $\bar\pi = \psi$ emerge as constraint equations if the conjugate momenta $\pi$ and $\bar\pi$ are treated as fields independent from $\psi$ and $\bar\psi$. Under the $+m$ conditions, the Dirac equations (41.31) reduce to

$$ ( D + ieA + m ) \psi = 0 , \qquad (41.32a) $$
$$ ( D - ieA + m ) \bar\psi = 0 . \qquad (41.32b) $$

The Dirac equation (41.32b) for the conjugate field $\bar\psi$ looks like that (41.32a) for the field $\psi$ but with the opposite sign of the charge $e$.

The charged Dirac field has an electric current $j$ given by the product of the charge $e$ and the conserved number current $n \equiv \gamma_m n^m$, equation (41.19),

$$ j \equiv e n . \qquad (41.33) $$

Like the number current, equation (41.20), the electric current $j$ is covariantly conserved,

$$ \mathring D \cdot j = 0 . \qquad (41.34) $$

Current conservation (41.34) is a consequence of the invariance of the Lagrangian (41.29) under an electromagnetic gauge transformation (41.25). The electromagnetic contribution to the Dirac Lagrangian (41.29) can be interpreted as describing the interaction between the electromagnetic field $A$ and the Dirac electric current $j$,

$$ L_{\rm int} \equiv ie \, \bar\psi \cdot A \psi = A \cdot j . \qquad (41.35) $$

Resolved into components $\psi_\Uparrow$ and $\psi_\Downarrow$ in the Dirac representation, §14.8, the Dirac equations (41.32) become, generalizing equations (41.9),

$$ ( - i D_0 + e A_0 + m ) \psi_\Uparrow = - \sigma_a ( D_a + ie A_a ) \psi_\Downarrow , \quad ( i D_0 - e A_0 + m ) \psi_\Downarrow = - \sigma_a ( D_a + ie A_a ) \psi_\Uparrow , \qquad (41.36a) $$
$$ ( - i D_0 - e A_0 + m ) \psi_\Downarrow^* = - \sigma_a^* ( D_a - ie A_a ) \psi_\Uparrow^* , \quad ( i D_0 + e A_0 + m ) \psi_\Uparrow^* = - \sigma_a^* ( D_a - ie A_a ) \psi_\Downarrow^* . \qquad (41.36b) $$

The charge-conjugate Dirac equations (41.36b) are complex conjugates (with respect to $i$) of the parent equations (41.36a). As discussed in §14.8, a Dirac spinor $\psi$ contains two components, which in the rest frame are $\psi_\Uparrow$ and $\psi_\Downarrow$, that cannot be transformed into each other by any proper Lorentz transformation. The two components describe particles and antiparticles. Lorentz-transformed out of the rest frame, particles and antiparticles are each linear combinations of both $\psi_\Uparrow$ and $\psi_\Downarrow$, but still those combinations cannot be transformed into each other: for particles, $\psi_\Uparrow$ dominates, while for antiparticles $\psi_\Downarrow$ dominates. The first pair (41.36a) of Dirac equations describes the evolution of particles, where $\psi_\Uparrow$ dominates. The pair of equations are coupled first-order differential equations for $\psi_\Uparrow$ and $\psi_\Downarrow$, which combine to yield a second-order equation for $\psi_\Uparrow$. Likewise the second pair (41.36b) describes the evolution of antiparticles, where the negative-mass component $\psi_\Downarrow$, or physically its positive-mass complex conjugate $\psi_\Downarrow^*$, dominates. The second pair (41.36b) combine to yield a second-order equation for $\psi_\Downarrow^*$. The charged Dirac equations (41.36) confirm the earlier inference from equations (41.32) that particles and antiparticles have opposite electric charges.
Resolved instead into chiral components $\psi_R$ and $\psi_L$, §39.2, the Dirac equations (41.32) are


|- Do — iedo + Ga(Da + tAa)] YL = -myr , [— Do — iedo — Ga(Da + iAa)| UR = mv , (41.37a) 
|- Do + iedo — ož (Da — iAa)| UR = MYT , |- Do + iedo + ož (Da — ia) Y = —mvR.  (41.37b) 


Again, the charge-conjugate Dirac equations (41.37b) are complex conjugates (with respect to i) of the parent 
equations (41.37a). 


41.3 Particles and antiparticles 


The question of whether a Dirac spinor $\psi$ describes a particle or antiparticle can be decided from the sign of the effective Dirac Hamiltonian, equation (41.6),

$$ H = - m \bar\psi \cdot \psi = i m \psi^\dagger \gamma_0 \psi = - m ( \psi_\Uparrow^\dagger \psi_\Uparrow - \psi_\Downarrow^\dagger \psi_\Downarrow ) . \qquad (41.38) $$

Whereas the number density $n^0 = i \bar\psi \cdot \gamma^0 \psi = \psi^\dagger \psi$ of the Dirac field is always positive, equation (41.21), the scalar product $\bar\psi \cdot \psi$ can be either positive or negative. If the spinor is a particle ($\psi_\Uparrow$ dominates), then $\bar\psi \cdot \psi$ is positive, and the Hamiltonian (41.38) with positive $m$ describes a timelike field. If on the other hand the spinor is an antiparticle ($\psi_\Downarrow$ dominates), then $\bar\psi \cdot \psi$ is negative, and the Hamiltonian would appear to describe a spacelike field. (As usual in this book, do not confuse the scalar super-Hamiltonian (41.38) with the conventional Hamiltonian, which is the time component of a 4-vector.)
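The sign criterion can be illustrated with explicit components. The following is a minimal sketch, assuming a 4-component spinor in the Dirac representation whose first two components are $\psi_\Uparrow$ and last two are $\psi_\Downarrow$, so that $\bar\psi\cdot\psi = \psi_\Uparrow^\dagger\psi_\Uparrow - \psi_\Downarrow^\dagger\psi_\Downarrow$ as in equation (41.38). The component ordering, the function names, and the sample spinors are assumptions made for illustration.

```python
def scalar_product_with_conjugate(psi):
    """psi-bar . psi, assuming components ordered (up, up, down, down) in boost weight."""
    up = sum(abs(c) ** 2 for c in psi[:2])
    down = sum(abs(c) ** 2 for c in psi[2:])
    return up - down

def number_density(psi):
    """n^0 = psi^dagger psi, always positive for a non-zero spinor."""
    return sum(abs(c) ** 2 for c in psi)

def super_hamiltonian(psi, m):
    """Effective scalar super-Hamiltonian H = -m psi-bar . psi."""
    return -m * scalar_product_with_conjugate(psi)

particle = [1.0, 0.0, 0.1j, 0.0]        # up component dominates
antiparticle = [0.1, 0.0, 0.0, 1.0j]    # down component dominates

for name, psi in (("particle", particle), ("antiparticle", antiparticle)):
    H = super_hamiltonian(psi, m=1.0)
    kind = "timelike" if H < 0 else "spacelike"
    print(name, number_density(psi), H, kind)
```

Both sample spinors have positive number density, but only the particle-dominated one yields a negative (timelike) super-Hamiltonian for positive $m$.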

The antisymmetry of the Dirac spinor scalar product means that the Hamiltonian (41.38) can be rewritten

$$ H = m \psi \cdot \bar\psi . \qquad (41.39) $$

This does not resolve the problem that, if the antiparticle component dominates, the Hamiltonian (41.39) is still positive, for positive $m$, hence spacelike. A timelike Hamiltonian with positive mass $m$ when the antiparticle field $\bar\psi$ dominates can be obtained by taking its PT conjugate, yielding the CPT-conjugate field. Let fields with an underbar denote the PT-conjugate fields obtained by pre-multiplying by the pseudoscalar $I$, equation (39.134),

$$ PT: \ \underline{\psi} \equiv I \psi , \quad CPT: \ \underline{\bar\psi} \equiv I \bar\psi . \qquad (41.40) $$

The fields $\underline{\psi}$ and $\underline{\bar\psi}$ are charge conjugates of each other, equation (39.91), since $C I^* = I C$. Since the pseudoscalar satisfies $I^2 = -1$ and $I$ commutes with the spinor metric $\varepsilon$, the Hamiltonian of the CPT-conjugate field $\underline{\bar\psi}$ is

$$ H = - m \underline{\psi} \cdot \underline{\bar\psi} , \qquad (41.41) $$

which is timelike when $\underline{\psi} \cdot \underline{\bar\psi}$ is positive, that is, when the CPT-conjugate field $\underline{\bar\psi}$ dominates. Note that introducing an additional phase factor, such as $i$, into the definition of the PT-conjugate fields makes no difference to the Hamiltonian, because the opposing phase factors in $\underline{\psi}$ and $\underline{\bar\psi}$ cancel each other.


41.3.1 C, P, and T symmetries 


The collection of Dirac equations for a spinor 7, its charge conjugate 7, and their PT conjugates y and Y 
are 


(D+ieA+m)w=0, (41.42a) 

PT: (D+ieA—m)yp=0, (41.42b) 
C: (D-ieA+m))=0, (41.42c) 
CPT: (D-ieA—m)p=0. (41.42d) 


The PT-conjugate equations (41.42b) and (41.42d) are obtained by commuting the PT operator $I$ through their parent equations (41.42a) and (41.42c), and noting that $I$ anticommutes with the basis vectors $\gamma_m$. The PT-conjugate Dirac equations (41.42b) and (41.42d) appear to be flipped in mass $m$ compared to their parent Dirac equations (41.42a) and (41.42c), but if the equations are expanded in terms of components $\psi_\Uparrow$ and $\psi_\Downarrow$, as in equations (41.36), the PT-conjugate equations are identical to their parent counterparts.
Despite the apparently differing signs of charge $e$ and mass $m$, the four sets of Dirac equations (41.42) are equivalent to each other, an equivalence that is manifest when the equations are expanded in components, equations (41.36). The equivalences express symmetry of the charged Dirac equations with respect to the discrete operations of spacetime reversal PT and charge conjugation C. The PT symmetry says that the transformation $\gamma_m \to -\gamma_m$ and $\psi \to I\psi = \underline{\psi}$ leaves the Dirac equation unchanged. The C symmetry says that the transformation $e \to -e$ and $\psi \to C\psi^* = \bar\psi$ leaves the Dirac equation unchanged.
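
As a worked illustration of how the PT-conjugate equation arises, the following sketch spells out the commutation step. It assumes that the boldface operators are vectors expanded in the tetrad basis, $\boldsymbol{D} = \gamma^m D_m$ and $\boldsymbol{A} = \gamma^m A_m$, and that the pseudoscalar $I$ commutes with the scalars $m$, $e$, and the quantum-mechanical imaginary $i$; these expansions are assumptions made here for illustration.

\begin{align}
 0 &= I \, (\boldsymbol{D} + i e \boldsymbol{A} + m) \, \psi \\
   &= (- \boldsymbol{D} - i e \boldsymbol{A} + m) \, I \psi
      \qquad \text{(since } I \gamma^m = - \gamma^m I \text{)} \\
   &= - (\boldsymbol{D} + i e \boldsymbol{A} - m) \, \underline{\psi} .
\end{align}

The vector terms change sign while the scalar mass term does not, which is why the mass appears flipped in the PT-conjugate equation while the charge does not.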

The Dirac equations are also symmetric with respect to the parity operation P. The P symmetry says 
that flipping the spatial axes $\gamma_a \to -\gamma_a$ and transforming $\psi \to \gamma_0 \psi$ leaves the Dirac equation unchanged.

Electromagnetic, colour, and gravitational interactions all respect C, P, and T symmetries, but weak inter- 
actions violate them. Weak interactions act only on left-handed particles (and right-handed antiparticles), not 
their opposite-chiral counterparts. A parity transformation flips chirality (it flips momentum while leaving 
spin unchanged), so weak interactions violate parity symmetry maximally. The excess of matter (baryons and 
leptons) over antimatter (antibaryons and antileptons) in the Universe suggests that T-violating processes 
took place during the early Universe. 

Although C, P, and T may be individually violated, the combination CPT appears to be a general 
symmetry of Nature. There is a CPT theorem premised on the proposition that Lorentz transformations in 
(3+1)-dimensional spacetime can be analytically continued to spatial rotations in 4 spatial dimensions. A 
spatial rotation by $\pi$ in the Euclideanized $t$-$z$ plane sends $t \to -t$ and $z \to -z$, equivalent to a combination
of time reversal and parity reversal in 3+1 spacetime dimensions. A spatial rotation preserves scalars, in 
particular the scalar (super-)Hamiltonian (41.38); for the Hamiltonian to remain a scalar in 3+1 dimensions, 
the transformation must be CPT, equation (41.41). 


42


The Standard Model of Physics and beyond 


A fundamental piece of the philosophy behind this Chapter is that, at its most fundamental level, spacetime 
is somehow built out of spinors. As found in Exercises 38.3 and 39.5, the algebra of outer products of spinors 
is isomorphic to the geometric algebra. The geometric algebra in K+M space+time dimensions contains 
not only the bivector generators of Spin(K, M), but a complete set of multivectors that together generate 
the complete Lie group of transformations of spinors. The potential importance of multivectors other than 
bivectors is evidenced by Dirac’s (1928) discovery that vectors (multivectors of grade 1) generate spatial 
translations of spinors. 


42.1 Fermion content of the Standard Model of Physics 


This section reviews the fermion content of the Standard Model of Physics (SM), which is based on the gauge group $U_Y(1) \times SU_L(2) \times SU(3)$, the product of the electroweak group $U_Y(1) \times SU_L(2)$ (which breaks down to the electromagnetic group $U_{\rm em}(1)$ at energies below the electroweak unification scale ~ 100 GeV) and the colour group $SU(3)$. An excursion into Grand Unification is irresistible, in part because it helps to make sense of the seemingly bizarre pattern of fermion charges, and in part because it presents a practical application of super geometric algebras. See Baez and Huerta (2010) for an expository review.

The SM has 4 conserved charges consisting of hypercharge $Y$, weak isospin $I_L$ (commonly abbreviated isospin [1]), and 2 colours. Colour conservation is commonly described in terms of 3 colours, suggestively called red, green, and blue, which satisfy the condition that the sum of the 3 colours is colourless, or white, $r + g + b = 0$. The fermions of the SM have charges listed in Table 42.1. Table 42.1 omits antifermions, which have charges opposite to their fermion partners. Antifermions are conventionally denoted with a bar; for example, an antineutrino is $\bar\nu$ (the bar here signifies a fermion with all opposite charges; in §42.4.4 it will be seen that the bar also signifies the charge conjugate). Each quark has a colour of $r$ or $g$ or $b$. Antiquarks have opposing colours; for example antired is $-r = g + b$. Actually, Table 42.1 lists only the fermions of the first generation, the electron generation. Altogether there are three generations, electron, muon, and tauon, whose charges duplicate those in Table 42.1. The fermions of the three generations are distinguished by having very different masses, §42.3.

[1] Weak isospin, or isospin, is often denoted $T_3$, the 3 signifying the 3rd of the 3 Pauli matrices that generate $SU_L(2)$; but I prefer the designation $I_L$, to emphasize that isospin is non-zero only for left-handed fermions (and right-handed antifermions).

Table 42.1: Conserved charges in the Standard Model

                                      U_em(1)                U_Y(1)          SU_L(2)       SU(3)
Species                     symbol    charge Q = Y/2 + I_L   hypercharge Y   isospin I_L   colour c
Left-handed leptons         ν_L        0                     −1              +1/2          white
                            e_L       −1                     −1              −1/2          white
Left-handed quarks          u_L       +2/3                   +1/3            +1/2          r,g,b
                            d_L       −1/3                   +1/3            −1/2          r,g,b
Right-handed neutrino       ν_R        0                      0               0            white
Right-handed electron       e_R       −1                     −2               0            white
Right-handed up quark       u_R       +2/3                   +4/3             0            r,g,b
Right-handed down quark     d_R       −1/3                   −2/3             0            r,g,b

The charges in Table 42.1 show some intriguing patterns that suggest that the SM group is a broken 
remnant of some larger group. The three kinds of charge — hypercharge, isospin, and colour — each add to 
zero when summed over all right-handed particles (or all left-handed antiparticles), or over all left-handed 
particles (or all right-handed antiparticles). 
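
The entries of Table 42.1 can be checked mechanically. The following is a minimal sketch, not part of the original text: the tuple layout and the helper names `charge_formula_holds` and `total` are hypothetical, and the colour multiplicity (1 for white leptons, 3 for quarks) is used to weight the sums over each chirality.

```python
from fractions import Fraction as F

# Rows of Table 42.1: (name, chirality, Q, Y, I_L, number of colour states).
TABLE_42_1 = [
    ("nu_L", "L", F(0),     F(-1),    F(1, 2),  1),
    ("e_L",  "L", F(-1),    F(-1),    F(-1, 2), 1),
    ("u_L",  "L", F(2, 3),  F(1, 3),  F(1, 2),  3),
    ("d_L",  "L", F(-1, 3), F(1, 3),  F(-1, 2), 3),
    ("nu_R", "R", F(0),     F(0),     F(0),     1),
    ("e_R",  "R", F(-1),    F(-2),    F(0),     1),
    ("u_R",  "R", F(2, 3),  F(4, 3),  F(0),     3),
    ("d_R",  "R", F(-1, 3), F(-2, 3), F(0),     3),
]

def charge_formula_holds(table=TABLE_42_1):
    """Check Q = Y/2 + I_L for every row of the table."""
    return all(q == y / 2 + i_l for _, _, q, y, i_l, _ in table)

def total(chirality, column, table=TABLE_42_1):
    """Sum one charge over all particles of a chirality, counting colours."""
    idx = {"Q": 2, "Y": 3, "I_L": 4}[column]
    return sum(row[idx] * row[5] for row in table if row[1] == chirality)

if __name__ == "__main__":
    print("Q = Y/2 + I_L holds:", charge_formula_holds())
    for chirality in ("L", "R"):
        print(chirality, {c: total(chirality, c) for c in ("Q", "Y", "I_L")})
```

Exact rational arithmetic is used so that the vanishing of the sums is not obscured by rounding.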

The values of the hypercharge $Y$ in Table 42.1 seem random, but they satisfy

$$ 3Y - 6I_L + 2(r + g + b) = 6N , \tag{42.1} $$

where $N$ is an integer. As prettily described by Baez and Huerta (2010), the relation (42.1) is precisely such as to allow the SM group $U_Y(1) \times SU_L(2) \times SU(3)$, modulo the discrete group $\mathbb{Z}_6$, to be embedded as a subgroup of $SU(5)$,

$$ U_Y(1) \times SU_L(2) \times SU(3) \, / \, \mathbb{Z}_6 \cong S(U_L(2) \times U(3)) \subset SU(5) , \tag{42.2} $$

suggesting that the SM could be a broken remnant of a larger Grand Unified Theory (GUT) group $SU(5)$, a possibility first pointed out by Georgi and Glashow (1974). The embedding is

$$ U_Y(1) \times SU_L(2) \times SU(3) \, / \, \mathbb{Z}_6 \to S(U_L(2) \times U(3)) \subset SU(5) , \qquad
 \{a, g, h\} \mapsto \begin{pmatrix} a^{3} g & 0 \\ 0 & a^{-2} h \end{pmatrix} , \tag{42.3} $$

in which the hypercharge phase $a$ arises as a relative phase between elements of $U_L(2)$ and $U(3)$. The choice of powers of $a$ in the mapping (42.3) into $S(U_L(2) \times U(3))$ is consistent with the requirement that the determinant on the right hand side be one, $(a^3)^2 (a^{-2})^3 = 1$ (don't forget that $g$ and $h$ are respectively $2 \times 2$ and $3 \times 3$ matrices each of unit determinant, so the determinants of $a^3 g$ and $a^{-2} h$ are $a^6$ and $a^{-6}$). The map (42.3) is modded by $\mathbb{Z}_6$ because if $z$ is any sixth root of unity, then the element $\{z, {\rm diag}\, z^{-3}, {\rm diag}\, z^{2}\} \in U_Y(1) \times SU_L(2) \times SU(3)$ (both ${\rm diag}\, z^{-3} \in SU_L(2)$ and ${\rm diag}\, z^{2} \in SU(3)$ have unit determinant) maps to the same unit element $\{1, 1\}$ of $S(U_L(2) \times U(3))$. The mapping (42.3) is viable only if the kernel $\mathbb{Z}_6$ acts trivially on all fermions of the SM. But the relation (42.1) ensures precisely this. The action of the sixth root of unity $z$ on a fermion $\psi$ of hypercharge $Y$, isospin $I_L$, and colour $r$ or $g$ or $b$ is


$$ \{z, z^{-3}, z^{2}\} : \ \psi \to (z)^{3Y} (z^{-3})^{2 I_L} (z^{2})^{r+g+b} \, \psi = z^{3Y - 6I_L + 2(r+g+b)} \, \psi = \psi . \tag{42.4} $$


The factors of 3 in $3Y$ and 2 in $2I_L$ in the exponents arise because hypercharge and isospin are quantized in units of respectively $\tfrac{1}{3}$ and $\tfrac{1}{2}$; the choice of exponents ensures that a unit phase factor $z = e^{2\pi i}$ acts trivially on all fermions for each of the $U_Y(1) \times SU_L(2) \times SU(3)$ factors individually.

But there are other patterns among SM particles that $SU(5)$ does not explain: right-handed particles look like they should group into $SU_R(2)$ doublets like their left-handed counterparts; and neutrinos and electrons look like they could be another species of up and down quark with a 4th colour. As it happens, as first pointed out by Pati and Salam (1974), the SM group, modulo the discrete group $\mathbb{Z}_3$, extends as a subgroup along precisely these lines,


$$ U_Y(1) \times SU_L(2) \times SU(3) \, / \, \mathbb{Z}_3 \subset SU_R(2) \times SU_L(2) \times SU(4) . \tag{42.5} $$


Consider treating the right-handed leptons and quarks as $SU_R(2)$ doublets labelled by right-handed isospin $I_R$, similar to their left-handed counterparts. Consider also treating white as a 4th colour $w$. The SM particles in Table 42.1 satisfy


$$ 3Y - 6I_R + 3w - (r + g + b) = 0 . \tag{42.6} $$

The pattern suggests an embedding

$$ U_Y(1) \times SU(3) \, / \, \mathbb{Z}_3 \to SU_R(2) \times SU(4) , \qquad
 \{a, h\} \mapsto \left\{ \begin{pmatrix} a^{3} & 0 \\ 0 & a^{-3} \end{pmatrix} , \begin{pmatrix} a^{-3} & 0 \\ 0 & a \, h \end{pmatrix} \right\} . \tag{42.7} $$


The map (42.7) implies that for example left-handed leptons and quarks (which transform trivially under $SU_R(2)$) transform under $U_Y(1)$ respectively as $a^{-3}$ and $a$, implying hypercharges $-1$ and $\tfrac{1}{3}$, in agreement with Table 42.1; similarly, right-handed up leptons and quarks transform as $a^{0}$ and $a^{4}$, while right-handed down leptons and quarks transform as $a^{-6}$ and $a^{-2}$, implying hypercharges $0$, $\tfrac{4}{3}$, $-2$, and $-\tfrac{2}{3}$, again in agreement with Table 42.1. The map (42.7) is one-to-one only if $U_Y(1) \times SU(3)$ is modded by $\mathbb{Z}_3$, because if $z$ is any third root of unity then $\{z, {\rm diag}\, z^{-1}\} \in U_Y(1) \times SU(3)$ maps to the same element $\{1, 1\}$ of $SU_R(2) \times SU(4)$.
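
The two integrality relations (42.1) and (42.6) can be checked directly against Table 42.1. The sketch below is illustrative only: the dictionary `FERMIONS` and the function names are hypothetical, right-handed isospin is taken to be $+\tfrac12$ for $\nu_R$, $u_R$ and $-\tfrac12$ for $e_R$, $d_R$ (zero for left-handed fermions), a quark is assumed to carry one unit of colour ($r + g + b = 1$), and a lepton is assumed to carry no colour and one unit of white.

```python
from fractions import Fraction as F

# name: (Y, I_L, I_R, is_lepton)
FERMIONS = {
    "nu_L": (F(-1),    F(1, 2),  F(0),     True),
    "e_L":  (F(-1),    F(-1, 2), F(0),     True),
    "u_L":  (F(1, 3),  F(1, 2),  F(0),     False),
    "d_L":  (F(1, 3),  F(-1, 2), F(0),     False),
    "nu_R": (F(0),     F(0),     F(1, 2),  True),
    "e_R":  (F(-2),    F(0),     F(-1, 2), True),
    "u_R":  (F(4, 3),  F(0),     F(1, 2),  False),
    "d_R":  (F(-2, 3), F(0),     F(-1, 2), False),
}

def su5_exponent(y, i_l, is_lepton):
    """Exponent 3Y - 6 I_L + 2(r+g+b) of z in equation (42.4)."""
    colour = 0 if is_lepton else 1
    return 3 * y - 6 * i_l + 2 * colour

def pati_salam_combination(y, i_r, is_lepton):
    """Left hand side 3Y - 6 I_R + 3w - (r+g+b) of equation (42.6)."""
    white = 1 if is_lepton else 0
    colour = 0 if is_lepton else 1
    return 3 * y - 6 * i_r + 3 * white - colour

if __name__ == "__main__":
    for name, (y, i_l, i_r, lep) in FERMIONS.items():
        print(name, su5_exponent(y, i_l, lep), pati_salam_combination(y, i_r, lep))
```

A multiple of 6 in the first column means that a sixth root of unity acts trivially, which is the condition for the kernel $\mathbb{Z}_6$ in (42.3) to be harmless.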

Exercises 42.2 and 42.3 show that SU(2) x SU(2) is isomorphic to Spin(4), while SU(4) is isomorphic to 
Spin(6). Consequently the Pati-Salam group on the right hand side of the embedding (42.5) is isomorphic 
to Spin(4) x Spin(6), 


$$ SU_L(2) \times SU_R(2) \times SU(4) \cong {\rm Spin}(4) \times {\rm Spin}(6) . \tag{42.8} $$


As discussed in Exercise 38.3, spinors in $2N$ dimensions are linear combinations of $2^N$ basis spinors $\epsilon_a$ labelled by an $N$-component bitcode $a = a_1 \ldots a_N$ with each of $a_i$ being up $\uparrow$ or down $\downarrow$, equation (38.85). As discussed in part 18 of Exercise 38.3, $SU(N)$ is a subgroup of ${\rm Spin}(2N)$, and the spinor bitcode also encodes the indices of $SU(N)$ multivectors. In the Pati-Salam model, the Spin(4) factor is associated with isospin, and particles can be labelled with spinor bitcodes $-$ (blank), $d$, $u$, and $du$. The same spinor bitcodes encode the transformation of spinors under the $SU_L(2)$ subgroup: $-$ (blank) is an $SU_L(2)$ scalar, $d$ and $u$ are $SU_L(2)$ vectors, and $du$ is an $SU_L(2)$ pseudoscalar. Similarly, under the $SU_R(2)$ subgroup, $-$ (blank) and $du$ are $SU_R(2)$ vectors, while $d$ and $u$ are respectively an $SU_R(2)$ scalar and pseudoscalar. If each bit is assigned the value $+\tfrac12$ or $-\tfrac12$ according to whether it is up or down, then left-handed isospin is $I_L = \tfrac12 (u - d)$, while right-handed isospin is $I_R = \tfrac12 (u + d)$. Of the fermions listed in Table 42.1, together with their corresponding antifermions, there are 16 that transform under the left $SU_L(2)$ isospin group (but not under $SU_R(2)$), namely the left-handed leptons and quarks and right-handed antileptons and antiquarks, and 16 that do not transform under $SU_L(2)$ (but do under $SU_R(2)$), their partners of opposite chirality. The following chart (42.9) labels the fermions with their Spin(4) spinor $d, u$ bitcodes:


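$$ \begin{array}{lll}
 - & d, u & du \\
 \bar\nu_L , e_R , \bar u_L , d_R & d : \ \bar\nu_R , e_L , \bar u_R , d_L & \nu_R , \bar e_L , u_R , \bar d_L \\
 & u : \ \nu_L , \bar e_R , u_L , \bar d_R &
\end{array} \tag{42.9} $$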


The Spin(6) factor of the Pati-Salam group is associated with colour, and particles can be labelled with a spinor bitcode $r, g, b$. Each quark $d^c$ or $u^c$ of colour $c = r, g, b$ is labelled by a single bit $r$, $g$, or $b$. Each antiquark $\bar d^{\bar c}$ or $\bar u^{\bar c}$ is labelled by the bit-flipped bitcode $\bar c = gb, rb, rg$ (antired, antigreen, antiblue, or cyan, magenta, yellow if you prefer) of the quark colour $c$. The leptons $\nu$ and $e$ are labelled white $rgb$, and the antileptons $\bar\nu$ and $\bar e$ by black $-$ (blank, antiwhite). Again, the same spinor bitcodes encode the transformation of spinors under the $SU(3)$ colour subgroup: $-$ (blank) is an $SU(3)$ scalar, $r$, $g$, and $b$ are $SU(3)$ vectors, $gb$, $rb$, and $rg$ are $SU(3)$ pseudovectors, and $rgb$ is an $SU(3)$ pseudoscalar. The following chart (42.10) labels the fermions with their Spin(6) $r, g, b$ spinor bitcodes:


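$$ \begin{array}{llll}
 - & c = r, g, b & \bar c = gb, rb, rg & rgb \\
 \bar\nu_{L,R} , \bar e_{L,R} & u^c_{L,R} , d^c_{L,R} & \bar u^{\bar c}_{L,R} , \bar d^{\bar c}_{L,R} & \nu_{L,R} , e_{L,R}
\end{array} \tag{42.10} $$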


Both the SU(5) embedding (42.3) and the Pati-Salam embedding (42.5) can be accommodated consistently 
within an even grander group Spin(10), as originally proposed by Georgi (1975) and Fritzsch and Minkowski 
(1975). The group Spin(4) x Spin(6) embeds naturally in Spin(10): 


Spin(4) x Spin(6) / Z2 — Spin(10) . (42.11) 


The mapping is mod $\mathbb{Z}_2$ because flipping the signs of both Spin(4) and Spin(6) rotors leaves the Spin(10) rotor unchanged. Through the mapping (42.3), the multivector $SU_L(2)$ and $SU(3)$ bitcodes map naturally to a multivector $SU(5)$ bitcode $d, u, r, g, b$, which through the natural mapping (42.11) encodes the particles in Spin(10). The two charts (42.9) and (42.10) assemble into the following chart, organized by the grade $p$ (number of up bits) of the Spin(10) spinor bitcode labelling the fermion (compare Table 4 of Baez and Huerta (2010)):


$$ \begin{array}{llllll}
 0 & 1 & 2 & 3 & 4 & 5 \\
 - : \ \bar\nu_L & d : \ \bar\nu_R & \bar c : \ \bar u_L^{\bar c} & d \bar c : \ \bar u_R^{\bar c} & urgb : \ \nu_L & durgb : \ \nu_R \\
 & u : \ \bar e_R & du : \ \bar e_L & rgb : \ e_R & drgb : \ e_L & \\
 & c : \ d_R^{c} & dc : \ d_L^{c} & u \bar c : \ \bar d_R^{\bar c} & du \bar c : \ \bar d_L^{\bar c} & \\
 & & uc : \ u_L^{c} & duc : \ u_R^{c} & &
\end{array} \tag{42.12} $$


As in the Spin(6) chart (42.10), the colour index $c$ on each quark $d^c$ or $u^c$ runs over $c = r, g, b$, while the anticolour index $\bar c$ on each antiquark $\bar d^{\bar c}$ or $\bar u^{\bar c}$ runs over $\bar c = gb, rb, rg$. Each of the 32 fermions and antifermions of
the SM is described uniquely by the Spin(10) $d, u, r, g, b$ code, so Spin(10) provides a complete unification of
the SM fermions within each of the 3 generations. The $i$th column of the chart (42.12) is an $SU(5)$ multivector
of grade $i$, that is, an antisymmetric $SU(5)$ tensor of rank $i$. The dimensions of the columns are 1, 5, 10, 10,
5, 1. SU(5) transforms the components of each column into each other, but does not transform components 
across columns. Thus SU(5) constitutes only a partial unification of the fermions within a generation, in 
contrast to Spin(10) which unifies all 32 fermions within each of the 3 generations. 

There is no experimental evidence for a right-handed neutrino $\nu_R$ or its antiparticle $\bar\nu_L$. SU(5) does not
require those particles, because they transform as SU(5) scalars, and are therefore unrelated to the other 
fermions. By contrast, Spin(10) requires a right-handed neutrino and its antiparticle. 

It might seem that Spin(10) does not quite unify all the spinors of the SM, since rotations in the 10- 
dimensional space leave the Spin(10) handedness of the spinor unchanged. From the perspective of Spin(10), 
the spinor is right-handed if all its five bits are up, or more generally if an odd number of its bits are up. The 
right-handed spinors in the bitcode chart (42.12) are those in the columns with 1, 3, and 5 bits up, while 
the left-handed spinors are those in the columns with 0, 2, and 4 bits up. 

But the chart (42.12) indicates that the separation of the spinors into two sets under Spin(10) is simply the 
separation into particles and antiparticles. Mathematically, antiparticles are CPT conjugates of particles, 
and CPT appears to be an exact symmetry. In conjunction with CPT, Spin(10) unifies all the 32 spinors of 
a generation. 

The presence of 3 generations of fermion — electron, muon, and tauon — suggests that perhaps there 
should be an even larger Grand Unified group than Spin(10). However, the fact that the 3 generations differ 
only in the masses of their particles, and that the 3 generations share the same gauge fields (there are not 
multiple generations of gauge fields), admits the alternative hypothesis that the 3 generations are, somehow, 
just different excitations of the same intrinsic object, similarly, perhaps, to the way that atoms and nuclei
have excited states. 


42.1.1 Spin(10) charges 


Spin(10) reorganizes the charges of the Standard Model in an interestingly different and elegant way. The usual SM charges are hypercharge $Y$ and isospin $I_L$, and colours $r$, $g$, and $b$. Spin(10) reorganizes the 5 charges as a bit code $durgb$ with each bit (charge) taking values either $+\tfrac12$ ($\uparrow$) or $-\tfrac12$ ($\downarrow$) for each of the $2^5 = 32$ fundamental fermions of a generation. The relation between SM charges and Spin(10) charges is


\begin{align}
 Y &= u + d - \tfrac23 (r_{10} + g_{10} + b_{10}) , \tag{42.13a} \\
 I_L &= \tfrac12 (u - d) , \tag{42.13b} \\
 c &= c_{10} + \tfrac12 \quad (c = r, g, b) . \tag{42.13c}
\end{align}

The electromagnetic charge is

$$ Q = \tfrac12 Y + I_L = u - \tfrac13 (r_{10} + g_{10} + b_{10}) . \tag{42.14} $$


The subscripts 10 on the colour charges $c_{10}$ (with $c$ one of $r$, $g$, $b$) on the right hand sides of equations (42.13) distinguish the Spin(10) colour charge from the traditional SM colour charge $c$. The Spin(10) $durgb$ charges on the right hand sides of equations (42.13) are to be interpreted as $+\tfrac12$ if the corresponding bit is up $\uparrow$ and $-\tfrac12$ if down $\downarrow$. For example, equations (42.13) imply that the all-bit-down and all-bit-up fermions $\bar\nu_L$ ($\downarrow\downarrow\downarrow\downarrow\downarrow$) and $\nu_R$ ($\uparrow\uparrow\uparrow\uparrow\uparrow$) have SM electroweak charges $Y = I_L = 0$, and SM colour charges respectively 0 (black) and $rgb$ (white).
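
The bitcode bookkeeping lends itself to a short enumeration. The sketch below is an illustration under stated assumptions, not the author's code: bits are ordered $d, u, r, g, b$, each bit is worth $\pm\tfrac12$, and the helper names `charges` and `grade_counts` are hypothetical. It applies equations (42.13) and (42.14) to all 32 bitcodes and counts how many fall in each grade (number of up bits).

```python
from fractions import Fraction as F
from itertools import product

HALF = F(1, 2)

def charges(bits):
    """Return (Y, I_L, Q) for a bitcode given as 5 booleans in the order d, u, r, g, b."""
    d, u, r, g, b = (HALF if up else -HALF for up in bits)
    colour10 = r + g + b
    y = u + d - F(2, 3) * colour10   # equation (42.13a)
    i_l = (u - d) / 2                # equation (42.13b)
    q = u - colour10 / 3             # equation (42.14)
    return y, i_l, q

ALL_BITCODES = list(product((False, True), repeat=5))

def grade_counts():
    """Number of bitcodes with p bits up, for p = 0..5."""
    return [sum(1 for bits in ALL_BITCODES if sum(bits) == p) for p in range(6)]

if __name__ == "__main__":
    print("dimensions by grade:", grade_counts())
    for bits in ALL_BITCODES:
        label = "".join(n for n, up in zip("durgb", bits) if up) or "-"
        print(f"{label:6s}", *charges(bits))
```

For instance the bitcode $drgb$ returns $Y = -1$, $I_L = -\tfrac12$, $Q = -1$, the charges of the left-handed electron.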

Traditionally a quark has colour charge consisting of one unit of either $r$, $g$, or $b$. Spin(10) on the other hand says that an $r$ quark (for example) has $rgb$ bits $\uparrow\downarrow\downarrow$, meaning that its $r_{10}$ charge is $+\tfrac12$ while its $g_{10}$ and $b_{10}$ charges are $-\tfrac12$. In the Spin(10) picture, when an $r$ quark turns into a $g$ quark, its $rgb$ bits flip from $\uparrow\downarrow\downarrow$ to $\downarrow\uparrow\downarrow$, meaning that its $r_{10}$ charge flips from $+\tfrac12$ to $-\tfrac12$ while its $g_{10}$ charge flips from $-\tfrac12$ to $+\tfrac12$. In so doing, the quark loses one unit of $r$ charge, and gains one unit of $g$ charge, consistent with the traditional picture.

Equations (42.13) invert to yield Spin(10) charges in terms of SM charges, 


\begin{align}
 d &= \tfrac12 Y - I_L + \tfrac13 (r + g + b) - \tfrac12 , \tag{42.15a} \\
 u &= Q + \tfrac13 (r + g + b) - \tfrac12 , \tag{42.15b} \\
 c_{10} &= c - \tfrac12 \quad (c = r, g, b) . \tag{42.15c}
\end{align}


The $d$ charge can also be expressed in terms of the Pati-Salam right-handed isospin $I_R = \tfrac12 (u + d)$ as


$$ d = I_R - I_L . \tag{42.16} $$


The SM also preserves baryon number $B$ and lepton number $L$, quarks being assigned baryon number $\tfrac13$ and zero lepton number, and neutrinos and electrons being assigned lepton number 1 and zero baryon number. Spin(10) does not preserve baryon and lepton number individually, but it does preserve their difference $B - L$,


$$ B - L = - \tfrac23 (r_{10} + g_{10} + b_{10}) . \tag{42.17} $$


The sum of Spin(10) charges defines an $X$-charge (some works normalize $X$ differently)

$$ X = d + u + r_{10} + g_{10} + b_{10} = Y - \tfrac52 (B - L) . \tag{42.18} $$
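
The inversion (42.15) and the identities (42.16) to (42.18) follow from (42.13) by linear algebra, and can be confirmed by brute force over the 32 bitcodes. The following is a hedged sketch: the function names are hypothetical, bits are ordered $d, u, r, g, b$ with values $\pm\tfrac12$, and $B - L$ is computed from equation (42.17).

```python
from fractions import Fraction as F
from itertools import product

HALF = F(1, 2)
ALL_CHARGES = list(product((-HALF, HALF), repeat=5))  # (d, u, r10, g10, b10)

def sm_from_spin10(d, u, r10, g10, b10):
    """Equations (42.13): return (Y, I_L, r, g, b)."""
    y = u + d - F(2, 3) * (r10 + g10 + b10)
    i_l = (u - d) / 2
    return y, i_l, r10 + HALF, g10 + HALF, b10 + HALF

def spin10_from_sm(y, i_l, r, g, b):
    """Equations (42.15): return (d, u, r10, g10, b10)."""
    q = y / 2 + i_l
    shift = (r + g + b) / 3 - HALF
    return y / 2 - i_l + shift, q + shift, r - HALF, g - HALF, b - HALF

def round_trip_ok():
    """(42.15) inverts (42.13) on every bitcode."""
    return all(spin10_from_sm(*sm_from_spin10(*c)) == c for c in ALL_CHARGES)

def x_charge_ok():
    """X = d+u+r10+g10+b10 equals Y - (5/2)(B-L), with B-L from (42.17)."""
    for d, u, r10, g10, b10 in ALL_CHARGES:
        y = sm_from_spin10(d, u, r10, g10, b10)[0]
        b_minus_l = -F(2, 3) * (r10 + g10 + b10)
        if d + u + r10 + g10 + b10 != y - F(5, 2) * b_minus_l:
            return False
    return True

if __name__ == "__main__":
    print(round_trip_ok(), x_charge_ok())
```

With these conventions a quark (one colour bit up) has $B - L = \tfrac13$ and a lepton (all colour bits up) has $B - L = -1$.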


At low energies, the SM gauge group $U_Y(1) \times SU_L(2) \times SU(3)$ breaks down to $U_{\rm em}(1) \times SU(3)$, in which only the electromagnetic charge $Q$ and the colour charges $r$, $g$, $b$ are conserved. In terms of Spin(10) charges, equations (42.15), this means that the $d$ charge ceases to be conserved, while $u$, $r$, $g$, and $b$ charges continue to be conserved. The $u$ charge can be thought of as a fourth colour, but it is not the same as the fourth colour contemplated by Pati and Salam (1974). Treating $u$ as a fourth colour means considering an embedding of $U_{\rm em}(1) \times SU(3)$ in $SU(4)$,


$$ \{a, h\} \mapsto \begin{pmatrix} a^{3} & 0 \\ 0 & a^{-1} h \end{pmatrix} , \tag{42.19} $$


which is similar to but not the same as the Pati-Salam embedding (42.7). The map (42.19) is one-to-one only if $U_{\rm em}(1) \times SU(3)$ is modded by $\mathbb{Z}_3$, because if $z$ is any third root of unity then $\{z, {\rm diag}\, z\} \in U_{\rm em}(1) \times SU(3)$ maps to the same element $\{1\}$ of $SU(4)$.

In accordance with the theorem of Atiyah, Bott, and Shapiro (1964) (see part 18 of Exercise 38.3), and 
similarly to the embeddings of $SU_L(2)$ in Spin(4) based on the $d, u$ bits, chart (42.9), or of $SU(3)$ in Spin(6)
based on the r,g,b bits, chart (42.10), or of SU(5) in Spin(10) based on the d,u,r,g,b bits, chart (42.12), 
there is an embedding of SU(4) in Spin(8) based on the u, r, g, b bits. The following chart labels the fermions 
with their Spin(8) u,r,g, b bitcodes: 


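$$ \begin{array}{lllll}
 0 & 1 & 2 & 3 & 4 \\
 - : \ \bar\nu_{L,R} & u : \ \bar e_{L,R} & \bar c : \ \bar u^{\bar c}_{L,R} & rgb : \ e_{L,R} & urgb : \ \nu_{L,R} \\
 & c : \ d^{c}_{L,R} & uc : \ u^{c}_{L,R} & u \bar c : \ \bar d^{\bar c}_{L,R} &
\end{array} \tag{42.20} $$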


Compared to the Spin(10) chart (42.12), the Spin(8) chart (42.20), having lost the d-bit, lumps left- and 
right-chiral species of fermions into the same box. 


42.1.2 Spin(10) gauge fields 


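The 10 orthonormal basis vectors $\gamma_i^\pm$, $i = d, u, r, g, b$, of the geometric algebra associated with Spin(10) are, in terms of chiral basis vectors $\gamma_i$ and $\gamma_{\bar\imath}$,

$$ \gamma_i^+ = \frac{\gamma_i + \gamma_{\bar\imath}}{\sqrt2} \quad \text{and} \quad \gamma_i^- = \frac{\gamma_i - \gamma_{\bar\imath}}{\sqrt2 \, i} . \tag{42.21} $$

The Spin(10) chiral basis vectors $\gamma_i$ and $\gamma_{\bar\imath}$ are analogous to the vectors $\gamma_+$ and $\gamma_-$ in the Newman-Penrose formalism, equations (39.1). A chiral basis vector $\gamma_i$ and its conjugate $\gamma_{\bar\imath}$ have $i$-spin weight $\pm 1$ (they vary by $e^{\mp i \theta}$ under a right-handed rotation by angle $\theta$ in the $\gamma_i^+$-$\gamma_i^-$ plane), so carry respectively plus and minus one unit of $i$ charge. The chiral basis vectors $\gamma_i$ and $\gamma_{\bar\imath}$ respectively raise and lower the charge of a spinor by one unit of $i$ charge. If $\epsilon_i$ and $\epsilon_{\bar\imath}$ are basis spinors whose $i$-bit is respectively up and down, then (note that $\gamma_i$ and $\gamma_{\bar\imath}$ multiply by $\sqrt2$ while raising and lowering the $i$-bit of their argument, equations (38.111)):

\begin{align}
 \gamma_i \, \epsilon_{\bar\imath} = \sqrt2 \, \epsilon_i , \quad \gamma_{\bar\imath} \, \epsilon_i &= \sqrt2 \, \epsilon_{\bar\imath} , \tag{42.22a} \\
 \gamma_i \, \epsilon_i = \gamma_{\bar\imath} \, \epsilon_{\bar\imath} &= 0 . \tag{42.22b}
\end{align}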




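The gauge fields associated with any gauge group form a multiplet labelled by the generators of the group. The generators of the ${\rm Spin}(2N)$ group with $N = 5$ are its $N(2N-1) = 45$ orthonormal basis bivectors (products of orthonormal vectors) comprising the $2N(N-1) = 40$ bivectors


\begin{align}
 \gamma_i^+ \wedge \gamma_j^+ &= \tfrac12 (\gamma_i + \gamma_{\bar\imath}) \wedge (\gamma_j + \gamma_{\bar\jmath}) , \tag{42.23a} \\
 \gamma_i^+ \wedge \gamma_j^- &= \tfrac{1}{2i} (\gamma_i + \gamma_{\bar\imath}) \wedge (\gamma_j - \gamma_{\bar\jmath}) , \tag{42.23b} \\
 \gamma_i^- \wedge \gamma_j^+ &= \tfrac{1}{2i} (\gamma_i - \gamma_{\bar\imath}) \wedge (\gamma_j + \gamma_{\bar\jmath}) , \tag{42.23c} \\
 \gamma_i^- \wedge \gamma_j^- &= - \tfrac12 (\gamma_i - \gamma_{\bar\imath}) \wedge (\gamma_j - \gamma_{\bar\jmath}) , \tag{42.23d}
\end{align}

with distinct indices $i$ and $j$ each running over $d, u, r, g, b$, together with the $N = 5$ bivectors

$$ \tfrac12 \gamma_i^+ \wedge \gamma_i^- = \tfrac{i}{2} \gamma_i \wedge \gamma_{\bar\imath} , \tag{42.24} $$

with indices $i$ running over $d, u, r, g, b$. The normalization factor of $\tfrac12$ in equation (42.24) is introduced so that the diagonal chiral bivectors $\tfrac12 \gamma_i \wedge \gamma_{\bar\imath}$ measure correctly the charge of the object they act on (see equation (42.49)). Off-diagonal chiral bivectors $\tfrac12 \gamma_i \wedge \gamma_j$ increase the charge of the object they act on by one unit of $i$ charge and one unit of $j$ charge (see again equation (42.49)).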

The generators of a gauge group serve two roles. On the one hand they generate the symmetries that 
rotate fields. On the other hand, the generators are themselves fields that are rotated by the symmetries 
they generate. To appreciate the distinction, consider the diagonal chiral bivector 


$$ \tfrac12 \gamma_i \wedge \gamma_{\bar\imath} . \tag{42.25} $$


On the one hand, the diagonal bivector acts as an operator whose eigenvalues equal the $i$ charge of the objects it acts on. On the other hand, the diagonal bivector is itself an object whose $i$ charge is zero. As an operator, a generator acts on its argument by commutation (equivalent to multiplication, if the argument is a column spinor, since a column spinor times a multivector is zero). As a field, a generator is itself acted on by commutation. These assertions will become clearer in §42.2.

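The Standard Model gauge group $U_Y(1) \times SU_L(2) \times SU(3)$ is a subgroup of the $SU(5)$ subgroup of Spin(10). The gauge fields (generators) of $SU(5)$ comprise the subset of gauge fields of Spin(10) that leave the number of up bits of a spinor unchanged. The gauge bivectors of $SU(N)$ with $N = 5$ constitute (compare equations (38.180)) $(N+1)(N-1) = 24$ bivectors comprising the $N(N-1) = 20$ bivectors


\begin{align}
 \tfrac12 (1 - \kappa_{ij}) \, \gamma_i^+ \wedge \gamma_j^+ &= \tfrac12 (\gamma_i^+ \wedge \gamma_j^+ + \gamma_i^- \wedge \gamma_j^-) = \tfrac12 (\gamma_i \wedge \gamma_{\bar\jmath} + \gamma_{\bar\imath} \wedge \gamma_j) , \tag{42.26a} \\
 \tfrac12 (1 - \kappa_{ij}) \, \gamma_i^+ \wedge \gamma_j^- &= \tfrac12 (\gamma_i^+ \wedge \gamma_j^- - \gamma_i^- \wedge \gamma_j^+) = \tfrac{i}{2} (\gamma_i \wedge \gamma_{\bar\jmath} - \gamma_{\bar\imath} \wedge \gamma_j) , \tag{42.26b}
\end{align}

and the $N - 1 = 4$ bivectors

$$ \tfrac12 \gamma_i^+ \wedge \gamma_i^- = \tfrac{i}{2} \gamma_i \wedge \gamma_{\bar\imath} \quad \text{modulo} \quad \tfrac12 \sum_i \gamma_i^+ \wedge \gamma_i^- = \tfrac{i}{2} \sum_i \gamma_i \wedge \gamma_{\bar\imath} , \tag{42.27} $$


with indices $i$ and $j$ running over $d, u, r, g, b$. The quantity $\kappa_{ij} = - \gamma_i^+ \wedge \gamma_i^- \wedge \gamma_j^+ \wedge \gamma_j^- = \gamma_i \wedge \gamma_{\bar\imath} \wedge \gamma_j \wedge \gamma_{\bar\jmath}$ in equations (42.26) is the $ij$ chiral operator. The factor $\tfrac12 (1 - \kappa_{ij})$ is a projection operator, whose square is itself, which serves to project its argument into the space where the sum of $i$ and $j$ charges is zero. The S in $SU(5)$ restricts to $U(5)$ matrices of unit determinant, effectively removing the bivector $\tfrac12 \sum_i \gamma_i^+ \wedge \gamma_i^-$ that rotates all spinors in an $SU(5)$ multiplet by a common phase.
The bivectors of Spin(10) that are not in $SU(5)$ are the 20 bivectors


\begin{align}
 \tfrac12 (1 + \kappa_{ij}) \, \gamma_i^+ \wedge \gamma_j^+ &= \tfrac12 (\gamma_i^+ \wedge \gamma_j^+ - \gamma_i^- \wedge \gamma_j^-) = \tfrac12 (\gamma_i \wedge \gamma_j + \gamma_{\bar\imath} \wedge \gamma_{\bar\jmath}) , \tag{42.28a} \\
 \tfrac12 (1 + \kappa_{ij}) \, \gamma_i^+ \wedge \gamma_j^- &= \tfrac12 (\gamma_i^+ \wedge \gamma_j^- + \gamma_i^- \wedge \gamma_j^+) = - \tfrac{i}{2} (\gamma_i \wedge \gamma_j - \gamma_{\bar\imath} \wedge \gamma_{\bar\jmath}) , \tag{42.28b}
\end{align}

and the 1 bivector

$$ i X = \tfrac12 \sum_i \gamma_i^+ \wedge \gamma_i^- = \tfrac{i}{2} \sum_i \gamma_i \wedge \gamma_{\bar\imath} , \tag{42.29} $$


with indices $i$ running over $d, u, r, g, b$. The bivector $X$ measures total $durgb$ charge, and $iX$ is the generator of the $U(1)$ factor that would complete $SU(5)$ to $U(5)$.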

The gauge fields of the SM gauge group $U_Y(1) \times SU_L(2) \times SU(3)$ are labelled by $1 + 3 + 8 = 12$ bivectors. The bivectors of the isospin group $SU_L(2)$ comprise the $2 + (2 - 1) = 3$ bivectors (42.26) and (42.27) with $i$ and $j$ running over $d$ and $u$, while the bivectors of the colour group $SU(3)$ comprise the $6 + (3 - 1) = 8$ bivectors (42.26) and (42.27) with $i$ and $j$ running over $r$, $g$, and $b$. For some purposes it can be convenient to recast the 3 bivectors of $SU_L(2)$ in terms of three weak Pauli generators $i \tau_i$ defined by


in= (l-au) Na = s00% an) = aN Yatna) : (42.30a) 
img = —3(1 — xau) Yi Vi =- lVi Ma +a Vu) = a (Va A Ya — Vu NYa) » (42.30b) 
img = —3(1 — xau) Yd Ya =li Ya Vi Vua) = 3u NYa — Va Na) ; (42.30c) 


where $\kappa_{du} = \gamma_d \wedge \gamma_{\bar d} \wedge \gamma_u \wedge \gamma_{\bar u}$ is the weak chiral operator. The left-handed projection operator $\tfrac12 (1 - \kappa_{du})$ equals 1 acting on left-handed weak chiral states, and vanishes acting on right-handed weak chiral states. The weak Pauli matrix $\tau_3$ has eigenvalue equal to twice the isospin, $2 I_L = u - d$, equation (42.13b). The squares of the weak Pauli matrices are $\tau_1^2 = \tau_2^2 = \tau_3^2 = \tfrac12 (1 - \kappa_{du})$, which again is 1 acting on left-handed, 0 acting on right-handed states.

The 1 hypercharge bivector, the generator of $U_Y(1)$, is defined to be the bivector whose eigenvalue is $iY$ where $Y$ is the hypercharge, equation (42.13a),


VSS AN Ha eae =h E vAn-4 > mam) ; (42.31) 


i=d,u i=r,g,b i=d,u i=r,g,b 


After electroweak symmetry breaking, the gauge fields of the remaining unbroken gauge group Uem(1) x 
SU(3) are labelled by 1+8 = 9 bivectors. The 8 bivectors of the colour group SU(3) are the same as those in 
the SM. The 1 electromagnetic charge bivector, the generator of Uem(1), is defined to be the bivector whose 
eigenvalue is iQ where Q is the electric charge, equation (42.14), 


Q= iyan > at Ang =i( Frente i 5 mam) (42.32) 


i=r,g,b i=r,g,b 
Physicists commonly discuss symmetry groups and unification in terms of representations. It is helpful
to translate the present approach, which is based on the super geometric algebra, into the language of 
representations. The account in this section is compressed; several results are quoted without proof. See 
Slansky (1981) for a pedagogical review in the context of unification. 

The gauge groups of physics are continuous groups that act linearly on fields, preserving inner products. That means symmetry transformations are unitary, and generators of symmetry transformations are Hermitian or skew-Hermitian. The symmetry groups are then Lie groups, whose generators $S_A$ satisfy commutation relations of the form


$$ [S_A , S_B] = f_{ABC} S_C . \tag{42.33} $$


The complex coefficients $f_{ABC}$ are called the structure coefficients of the group.

The classification of all finitely generated Lie groups was completed by Cartan in 1894 (see Hawkins (2000) for a historical review), and made transparent by Dynkin in 1946 (Dynkin, 1962). There are four infinite sequences of finitely generated irreducible Lie groups, commonly denoted $A_n$, $B_n$, $C_n$, and $D_n$, related to the traditional special unitary (SU), spin (Spin), and symplectic (Sp) groups by


$$ A_n = SU(n+1) , \quad B_n = {\rm Spin}(2n+1) , \quad C_n = Sp(2n) , \quad D_n = {\rm Spin}(2n) . \tag{42.34} $$


In addition, there are 5 exceptional groups, denoted $G_2$, $F_4$, $E_6$, $E_7$, and $E_8$.
Let $S_A$ be the generators of a continuous group $G$, and let $v_i$, $i = 1, \ldots, d$ be a set of $d$ linearly independent vectors that transform linearly into each other under the action of the group,


$$ S_A : \ v_i \to (S_A)_{ij} v_j . \tag{42.35} $$


The $d$ vectors $v_i$ and the accompanying set of $d \times d$ matrices $(S_A)_{ij}$ define a $d$-dimensional representation of the group. The dimension of the representation is defined to be the dimension $d$ of the vector space,


$$ \dim({\rm rep}) = d . \tag{42.36} $$


A representation is said to be irreducible, or simple, if the vector space contains no proper non-trivial subset of vectors that transform exclusively among each other under the action of the group. Physicists often refer to a representation by its dimension. For example, the spinor grade $p$ representations of $SU(5)$, the columns of the Spin(10) chart (42.12), are $1$, $5$, $10$, and their conjugates $\overline{10}$, $\bar 5$, $\bar 1$.

The adjoint representation of a group is the special representation where the vectors upon which the group acts are the group generators themselves. The generators of a Lie group act on each other by commutation, and the adjoint representation is the set of matrices $(S_A)_{BC}$ satisfying


$$ [S_A , S_B] = (S_A)_{BC} S_C . \tag{42.37} $$


Evidently the matrices of the adjoint representation are equal to the structure coefficients, $(S_A)_{BC} = f_{ABC}$. The dimension of the adjoint representation of a Lie group equals the dimension of the group itself, the number of its generators,

$$ \dim({\rm adj}) = \dim(G) = \begin{cases}
 n(n+2) & SU(n+1) \\
 n(2n+1) & {\rm Spin}(2n+1) \\
 n(2n+1) & Sp(2n) \\
 n(2n-1) & {\rm Spin}(2n)
\end{cases} \tag{42.38} $$
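
The dimension formulae (42.38) tie together several counts quoted earlier in the chapter. The sketch below is illustrative; the helper names `dim_su` and `dim_spin` are hypothetical, and they use the standard counts $N^2 - 1$ generators for $SU(N)$ and $N(N-1)/2$ bivectors for ${\rm Spin}(N)$.

```python
def dim_su(n):
    """Number of generators of SU(n)."""
    return n * n - 1

def dim_spin(n):
    """Number of generators (bivectors) of Spin(n)."""
    return n * (n - 1) // 2

if __name__ == "__main__":
    print("Spin(10):", dim_spin(10))                        # 45 bivectors
    print("SU(5):", dim_su(5))                              # 24 bivectors
    print("SM:", 1 + dim_su(2) + dim_su(3))                 # 12 bivectors
    print("Spin(10) not in SU(5):", dim_spin(10) - dim_su(5))  # 20 + 1
```

The difference of 21 between Spin(10) and SU(5) matches the 20 bivectors (42.28) plus the single bivector (42.29).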


Given a group, it is always possible to choose a basis of group generators that is orthonormal in the sense that the trace of matrix products of generators $S_A$ and $S_B$ is proportional to the unit matrix $\delta_{AB}$, in any representation of the group (this assertion is not self-evidently true; but it is manifestly true in the examples based on ${\rm Spin}(N)$ and its subgroups considered below),


$$ {\rm Tr}(S_A S_B) = (S_A)_{ij} (S_B)_{ji} = S_2({\rm rep}) \, \delta_{AB} . \tag{42.39} $$


The constant of proportionality defines the Dynkin index $S_2({\rm rep})$, a real number whose value depends on the representation. The structure coefficients $f_{ABC}$ in an orthonormal basis are totally antisymmetric.

In an orthonormal basis, the antisymmetry of the structure coefficients implies that the sum $\sum_A S_A^2$ of matrix products of generators commutes with all generators, and is therefore proportional to the unit matrix,


$$ \sum_A S_A^2 = \sum_A (S_A)_{ij} (S_A)_{jk} = C_2({\rm rep}) \, \delta_{ik} . \tag{42.40} $$


The coefficient $C_2({\rm rep})$ is called the quadratic Casimir invariant. For ${\rm Spin}(N)$, the quadratic Casimir invariant is the total angular momentum squared of the representation. Equating the trace of equation (42.39) over generator indices $A$ with the trace of equation (42.40) over vector indices $i$ implies that the Dynkin index $S_2$ is related to the Casimir invariant $C_2$ by

$$ S_2({\rm rep}) = \frac{\dim({\rm rep})}{\dim(G)} \, C_2({\rm rep}) . \tag{42.41} $$
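
A small numerical example may help fix the definitions. The sketch below checks equations (42.39) to (42.41) for the 2-dimensional representation of $SU(2)$, taking as an assumption the Hermitian generators $S_A = \tfrac12 \sigma_A$ built from the Pauli matrices (a conventional choice made here for illustration, not a normalization taken from the text). The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Assumed normalization: S_A = sigma_A / 2.
GENERATORS = [[[x / 2 for x in row] for row in s] for s in PAULI]

def dynkin_index(gens):
    """Return S_2 after checking Tr(S_A S_B) = S_2 delta_AB, equation (42.39)."""
    s2 = trace(matmul(gens[0], gens[0]))
    for a, sa in enumerate(gens):
        for b, sb in enumerate(gens):
            expected = s2 if a == b else 0
            assert abs(trace(matmul(sa, sb)) - expected) < 1e-12
    return complex(s2).real

def casimir(gens):
    """Return C_2 after checking sum_A S_A^2 = C_2 times identity, equation (42.40)."""
    n = len(gens[0])
    total = [[0] * n for _ in range(n)]
    for s in gens:
        sq = matmul(s, s)
        total = [[total[i][j] + sq[i][j] for j in range(n)] for i in range(n)]
    c2 = total[0][0]
    for i in range(n):
        for j in range(n):
            assert abs(total[i][j] - (c2 if i == j else 0)) < 1e-12
    return complex(c2).real

if __name__ == "__main__":
    s2, c2 = dynkin_index(GENERATORS), casimir(GENERATORS)
    print(s2, c2, (2 / 3) * c2)  # dim(rep) = 2, dim(G) = 3
```

With this normalization the Casimir invariant is $\tfrac12 (\tfrac12 + 1) = \tfrac34$, the familiar total angular momentum squared for spin $\tfrac12$.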

The orthonormal generators of ${\rm Spin}(N)$ are its $N(N-1)/2$ bivectors $\tfrac12 \gamma_{[ab]} \equiv \tfrac12 \gamma_a \wedge \gamma_b$ with distinct indices $a$ and $b$ running over 1 to $N$. The normalization factor of $\tfrac12$ is introduced so that charges of eigenvectors upon which the generators act differ by integer increments, for example equation (42.49). The non-vanishing commutators of the orthonormal bivectors are


$$ [ \tfrac12 \gamma_{[ab]} , \tfrac12 \gamma_{[bc]} ] = \tfrac12 \gamma_{[ac]} . \tag{42.42} $$


The commutators (42.42) imply that the non-vanishing structure coefficients are


$$ f_{[ab][bc][ca]} = -1 \tag{42.43} $$


for any $a \neq b \neq c \neq a$. The structure coefficients $f_{[ab][bc][ca]}$ are totally antisymmetric in their indices $[ab]$, $[bc]$, and $[ca]$.
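
The commutation rule (42.42) can be confirmed in the smallest non-trivial case, Spin(3), by representing the three orthonormal basis vectors with the Pauli matrices, which square to one and mutually anticommute. This is a sketch under that assumed matrix representation; for distinct indices the wedge product $\gamma_a \wedge \gamma_b$ then coincides with the matrix product. The helper names are hypothetical.

```python
from itertools import permutations

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Assumed representation of orthonormal vectors gamma_1, gamma_2, gamma_3.
GAMMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def half_bivector(a, b):
    """(1/2) gamma_a ^ gamma_b for distinct a, b."""
    return [[0.5 * x for x in row] for row in matmul(GAMMA[a], GAMMA[b])]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]

def commutation_rule_holds():
    """Check [gamma_ab/2, gamma_bc/2] = gamma_ac/2 for all distinct a, b, c."""
    for a, b, c in permutations(range(3)):
        lhs = commutator(half_bivector(a, b), half_bivector(b, c))
        rhs = half_bivector(a, c)
        if any(abs(lhs[i][j] - rhs[i][j]) > 1e-12
               for i in range(2) for j in range(2)):
            return False
    return True

if __name__ == "__main__":
    print(commutation_rule_holds())
```

Since $\gamma_{[ac]} = -\gamma_{[ca]}$, the right hand side of (42.42) is $-\tfrac12 \gamma_{[ca]}$, which is the origin of the sign in (42.43).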

The bivector generators are operators that act on the vectors of a representation. To characterize and construct representations, it is advantageous to work with the chiral representation of the bivector generators, since these provide raising and lowering operators that connect the vectors of a representation. ${\rm Spin}(N)$ has $n = \lfloor N/2 \rfloor$ mutually commuting bivector generators, whose eigenvalues are $n$ conserved charges. In place of orthonormal indices $a = 1, \ldots, N$, the chiral bivectors of ${\rm Spin}(N)$ use chiral indices $i$ and $\bar\imath$ with $i = 1, \ldots, n$, plus a final index $N$ when $N$ is odd. In a chiral basis, the $N(N-1)/2$ bivectors collect into:

\begin{align}
 & \tfrac12 \gamma_{[i \bar\imath]} && n \text{ diagonal bivectors that measure charge } i , \tag{42.44a} \\
 & \tfrac12 \gamma_{[ij]} , \ \tfrac12 \gamma_{[i \bar\jmath]} , \ \tfrac12 \gamma_{[\bar\imath j]} , \ \tfrac12 \gamma_{[\bar\imath \bar\jmath]} && 2n(n-1) \text{ bivectors that raise and/or lower 2 charges } i \text{ and } j , \tag{42.44b} \\
 & \tfrac12 \gamma_{[iN]} , \ \tfrac12 \gamma_{[\bar\imath N]} && \text{if } N \text{ is odd, } 2n \text{ bivectors that raise or lower 1 charge } i . \tag{42.44c}
\end{align}

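The non-vanishing commutators of the chiral bivectors are

$$ [\tfrac12\gamma_{[i\bar\jmath]} , \tfrac12\gamma_{[j\bar k]}] = \tfrac12\gamma_{[i\bar k]} , \tag{42.45} $$

and the same with $i \leftrightarrow \bar\imath$ and/or $j \leftrightarrow \bar\jmath$ and/or $k \leftrightarrow \bar k$, and allowing $i = j$ or $j = k$ or $k = i$, but excluding
$i = j = k$. The commutators (42.45) imply that the non-vanishing chiral structure coefficients are

$$ f_{[i\bar\jmath][j\bar k][k\bar\imath]} = 1 \tag{42.46} $$

for any $i \neq j \neq k \neq i$. The chiral structure coefficients $f_{[i\bar\jmath][j\bar k][k\bar\imath]}$ are not totally antisymmetric in their
indices.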
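Suppose that $|m\rangle$ is an eigenvector with $i$-charge $m_i$,

$$ \tfrac12\gamma_{[i\bar\imath]} |m\rangle = m_i |m\rangle . \tag{42.47} $$

Suppose further that the operator $\tfrac12\gamma_{[ij]}$ acting on the eigenvector $|m\rangle$ yields another vector $|m'\rangle$,

$$ \tfrac12\gamma_{[ij]} |m\rangle = |m'\rangle . \tag{42.48} $$

Then the $i$-charge of $|m'\rangle$ is

$$ \tfrac12\gamma_{[i\bar\imath]} |m'\rangle = \tfrac12\gamma_{[i\bar\imath]} \, \tfrac12\gamma_{[ij]} |m\rangle = \left( \tfrac12\gamma_{[ij]} \, \tfrac12\gamma_{[i\bar\imath]} + [\tfrac12\gamma_{[i\bar\imath]} , \tfrac12\gamma_{[ij]}] \right) |m\rangle = (m_i + 1) |m'\rangle , \tag{42.49} $$

that is, the raising operator $\tfrac12\gamma_{[ij]}$ increases the $i$-charge of whatever it operates on by 1. The same raising
operator $\tfrac12\gamma_{[ij]}$ similarly increases the $j$-charge of whatever it operates on by 1. If $i$ is changed to $\bar\imath$, the
operator lowers $i$-charge by 1, and if $j$ is changed to $\bar\jmath$, the operator lowers $j$-charge by 1. The change of
charge by increments of 1 explains the factors of $\tfrac12$ in the choice of normalization of bivector generators. The
eigenvectors $|m\rangle$ and $|m'\rangle$ must be orthogonal because they have different charges. The normalization of
eigenvectors can be deduced from recurrence relations of the form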


0 = (m| [3 Yq» 2 Mal alm) = (mitia) — (maval amam) — (m| FY y|M) . (42.50) 


The vectors of a representation may be obtained by starting at one vector in the lattice of charges, and 
successively applying raising and lowering operators until all vectors of the representation are found. The 
Cartan-Weyl-Dynkin approach is to use a judiciously chosen minimal subset of n raising operators and n 
complementary lowering operators. The group SU(n) is the subgroup of Spin(2n) that preserves the sum of 
all charges, so only n − 1 raising operators are needed to fill out a representation of SU(n); equivalently n
raising operators are needed by SU(n+1). For the groups SU(n+1), Spin(2n+1), and Spin(2n), the
Cartan-Weyl-Dynkin basis of n raising operators is

$$ \tfrac12\gamma_{[1\bar2]} ,\ \tfrac12\gamma_{[2\bar3]} ,\ \ldots ,\ \tfrac12\gamma_{[n-1\,\bar n]} ,\ \left\{ \begin{array}{ll} \tfrac12\gamma_{[n\,\overline{n+1}]} & \mathrm{SU}(n+1) \\ \tfrac12\gamma_{[nN]} & \mathrm{Spin}(2n+1) \\ \tfrac12\gamma_{[n-1\,n]} & \mathrm{Spin}(2n) \end{array} \right. . \tag{42.51} $$

Figure 42.1 Cartan-Weyl-Dynkin raising operators for the groups SU(3), Spin(6), Spin(7), and Sp(6) with 3 conserved
charges. The 3 charges point along orthogonal axes labelled 1, 2, and 3. All four groups share raising operators along
the 12 and 23 directions, drawn in black. The dashed black line is added to bring out the fact that the black lines
form edges of an equilateral triangle. The groups are distinguished by their final raising operator. SU(3) conserves the
total of the 3 charges, so only the 2 black raising operators are needed. The red, blue, and purple+blue lines are final
raising operators for respectively Spin(6), Spin(7), and Sp(6).


The lowering operators are their complements; for example the complement of $\tfrac12\gamma_{[1\bar2]}$ is $\tfrac12\gamma_{[2\bar1]}$. The
Cartan-Weyl-Dynkin raising operators (42.51) can be regarded as vectors $\alpha_i$ which shift the charge of a vector
through an integrally-spaced lattice of charges,

$$ \alpha_1 , ..., \alpha_n = \{1,-1,...0...\} ,\ \{0,1,-1,...0...\} ,\ \ldots ,\ \left\{ \begin{array}{ll} \{...0...,1,-1\} & \mathrm{SU}(n+1) \\ \{...0...,1\} & \mathrm{Spin}(2n+1) \\ \{...0...,2\} & \mathrm{Sp}(2n) \\ \{...0...,1,1\} & \mathrm{Spin}(2n) \end{array} \right. , \tag{42.52} $$

where ...0... denotes a (possibly empty) sequence of zeroes. The corresponding lowering vectors are $-\alpha_i$.
The raising vectors for the case where there are 3 conserved charges are illustrated in Figure 42.1. Notice
that the vectors $\alpha_i$ for SU(n+1) are in an (n+1)-dimensional space of charges, whereas the vectors for the
other groups are in an n-dimensional space of charges. For SU(n+1), the sum of all n + 1 charges can be
taken without loss of generality to be zero, since the total charge separates into a commuting U(1) generator
characterized by a charge {1,1,...,1}. Spin(2) has 1 conserved charge, and zero raising operators; it is
isomorphic to U(1).

The representation is built up by applying vectors $\alpha_i$ successively. The charges $m_k$ of a vector in the charge
lattice may be expressed in terms of the net number $2\lambda_i/(\alpha_i \cdot \alpha_i)$ of applications of $\alpha_i$ needed to reach the
vector from the origin,

$$ m_k = \sum_i \frac{2\lambda_i}{\alpha_i \cdot \alpha_i} \, \alpha_{ik} . \tag{42.53} $$

Here the scalar product $\alpha_i \cdot \alpha_i$ is the Euclidean scalar product on the n-dimensional (or (n+1)-dimensional
for SU(n+1)) lattice of charges. The scaling factor $2/(\alpha_i \cdot \alpha_i)$, which equals 1 for all but at most one
of the vectors $\alpha_i$ in the set (42.52), is introduced to simplify the subsequent definition (42.54) of Dynkin
coordinates. For any representation, there is always a lowest vector with the smallest possible charge, and
a highest vector with the largest possible charge. The vector with the smallest charge is annihilated by all
lowering operators; the vector with the highest charge is annihilated by all raising operators. The vectors of
a representation can be obtained by starting with the highest (or lowest) vector and successively applying
lowering and raising operators in all possible ways until the lowest (or highest) vector is reached.

Suppose that $\lambda_i$ is the highest vector of a representation. Only some choices of highest vector $\lambda_i$ yield
viable representations. Dynkin's key trick is to introduce Dynkin coordinates $\lambda^i$ dual to the components $\lambda_i$
(implicit sum over paired indices, one up and one down),

$$ \lambda^i = g^{ij} \lambda_j , \quad \lambda_i = g_{ij} \lambda^j , \tag{42.54} $$

where the symmetric Dynkin metric $g_{ij}$ and its inverse $g^{ij}$ are defined by

$$ g^{ij} \equiv \frac{2}{\alpha_i \cdot \alpha_i} \, (\alpha_i \cdot \alpha_j) \, \frac{2}{\alpha_j \cdot \alpha_j} \ \ \text{(no sum over $i$ or $j$)} , \quad g_{ik} \, g^{kj} = \delta_i^j . \tag{42.55} $$


The key result of Dynkin theory is that the Dynkin coordinates $\lambda^i$ of every vector of a representation are
integers. Every sequence of non-negative integers $\lambda^i = \{\lambda^1, ..., \lambda^n\}$ defines a highest vector that gives rise
to a distinct representation, and every representation is characterized by such a sequence of non-negative
integers. A single step $\alpha_i$ changes the Dynkin coordinates $\lambda^j$ of a vector by

$$ (\Delta\lambda^j)_i = g^{ji} \, \frac{\alpha_i \cdot \alpha_i}{2} = \frac{2 \, \alpha_j \cdot \alpha_i}{\alpha_j \cdot \alpha_j} \quad \text{(no sum over $i$ or $j$)} , \tag{42.56} $$

which is called the Cartan-Weyl matrix. The Cartan-Weyl matrix has all integer entries, consistent with the
fact that the Dynkin coordinates $\lambda^i$ of any vector of a representation are always integers. The charge of a
vector with Dynkin coordinates $\lambda^i$ is, from equations (42.53) and (42.54),

$$ m_k = \sum_{ij} \lambda^i \, g_{ij} \, \frac{2 \alpha_{jk}}{\alpha_j \cdot \alpha_j} . \tag{42.57} $$

The quadratic Casimir invariant, equation (42.40), of a representation whose highest vector has Dynkin
coordinates $\lambda^i$ is

$$ C_2(\lambda) = \sum_{ij} \lambda^i \, g_{ij} \, (\lambda^j + 2^j) , \tag{42.58} $$


where $2^j$ denotes the vector {2,2,...,2}.
The dimension of a representation with highest vector $\lambda^i$ is given by Weyl's formula (42.63) below. The
formula depends not only on the Cartan-Weyl-Dynkin basis set of n raising operators (42.51), but on the
full set of raising operators. The raising and lowering operators of a group divide into two equal sets, raising
operators, and their complements, lowering operators. For Spin(N), the raising operators from the list (42.44)
are $\tfrac12\gamma_{[i\bar\jmath]}$ with $i < j$, and $\tfrac12\gamma_{[ij]}$ also with $i < j$ without loss of generality. For N odd, the raising operators
include $\tfrac12\gamma_{[iN]}$. The number $n_{\rm raise}$ of raising operators is

$$ n_{\rm raise} = \left\{ \begin{array}{ll} \tfrac12 n(n+1) & \mathrm{SU}(n+1) \\ n^2 & \mathrm{Spin}(2n+1) \\ n^2 & \mathrm{Sp}(2n) \\ n(n-1) & \mathrm{Spin}(2n) \end{array} \right. . \tag{42.59} $$

The raising operators are characterized by $n_{\rm raise}$ charge vectors $\alpha_i$ whose components in the Euclidean space
of charges are

$$ \alpha_i = \left\{ \begin{array}{lll} \{...0...,1,...0...,-1,...0...\} & \tfrac12 n(n+1) \text{ vectors} & \mathrm{SU}(n+1) , \\ \{...0...,1,...0...,-1,...0...\} & \tfrac12 n(n-1) \text{ vectors} & \mathrm{Spin}(2n+1),\ \mathrm{Sp}(2n),\ \mathrm{Spin}(2n) , \\ \{...0...,1,...0...,1,...0...\} & \tfrac12 n(n-1) \text{ vectors} & \mathrm{Spin}(2n+1),\ \mathrm{Sp}(2n),\ \mathrm{Spin}(2n) , \\ \{...0...,1,...0...\} & n \text{ vectors} & \mathrm{Spin}(2n+1) , \\ \{...0...,2,...0...\} & n \text{ vectors} & \mathrm{Sp}(2n) . \end{array} \right. \tag{42.60} $$


Express the charges $\alpha_{ij}$ of the raising operators $\alpha_i$ as linear combinations of the charges $\alpha_{kj}$ of the
Cartan-Weyl-Dynkin basis operators given by equation (42.52),

$$ \alpha_{ij} = \sum_{k=1}^{n} \frac{2 a_{ik}}{\alpha_k \cdot \alpha_k} \, \alpha_{kj} , \tag{42.61} $$

which defines the $n_{\rm raise} \times n$ (or $n_{\rm raise} \times (n+1)$ for SU(n+1)) matrix $a_{ik}$. The $n_{\rm raise}$ rows $a_i$ of the matrix $a_{ik}$
are


{iO 5 Lee Os} 3n(n+1) vectors SU(n+1), 
E Levey Oe} 5n(n—1) vectors 
{---0..., liye) sail) vectors Spin(2n+1) , 
ser ree Orem n vectors 
{oa Qees 5 Levey Oo} 5n(n—1) vectors 

BON. od Oi sny Losey 2 ves} 3n(n—1) vectors Sp(2n) , (42.62) 
To Oea 232+ n vectors 
PE EE Lives O zn(n—1) vectors 
{.-.0..., 1... ,2...,1,1} 5(n—2)(n—3) vectors . 
1008 pods. 505,1} n—1 vectors Pome) 
f ccQrae gloss Ly 1} n—2 vectors 




Table 42.2: Example representations 


Group Representation Spinor? Grade Dimension C2 So 
All {0...} 0 1 0 0 
p n+1—p)(n n— 
SU(n+1) f Oat 01:0...} V psn (“pt) petai? (21) 
p n+1l—p ele n 2 n+2— n n 
(Mag Wala loud Me De ple opp) Se a) 
P 7i 2 n n 
a ee ee aa x wentl (t?) n+1 2(5tt) er) 
Spin(2n+1) {...0..., 1} Vv — 2” ¢n(2n+1) oo 
p 
6 Mae 1, 00:0...} x pen-l (~P) p(2n+1—p) Ta E 
{...0...,2} x n (r) n(n+1) a) 
Spin(2n) {...0...,1,0} or {...0...,0,1} vV — gn-t ¢n(2n—1) oe 
p A 
Mig 1,00} x p<n-2 ea p(2n—p) aee) 
{...0...,1,1} x n—1 (22) (n—1)(n+1) IC) 
{...0...,2,0} or {...0...,0,2} x n 4 (2n) n? Ce) 


denote a non-empty sequence of 1's. Weyl's formula for the dimension of a representation with highest vector
$\lambda^i$ is

$$ \dim(\lambda) = \prod_{i=1}^{n_{\rm raise}} \frac{\sum_{j=1}^{n} a_{ij} \, (\lambda^j + 1^j)}{\sum_{j=1}^{n} a_{ij}} , \tag{42.63} $$

where $1^j$ denotes the vector {1,..., 1}.
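Weyl's formula can be tried out on small groups. The sketch below uses the equivalent inner-product form of the formula, a product over raising vectors $\alpha$ of $(m_{\lambda+1} \cdot \alpha)/(m_1 \cdot \alpha)$, where $m_\lambda$ is the charge vector (42.57) of Dynkin coordinates $\lambda^i$. It therefore needs only the charge vectors (42.52) and (42.60), not the matrix $a_{ik}$. The group labels `"SU"`, `"SpinOdd"`, `"Sp"`, `"SpinEven"` and all function names are hypothetical, and the Dynkin metric is taken to be the inverse of $4(\alpha_i \cdot \alpha_j)/[(\alpha_i \cdot \alpha_i)(\alpha_j \cdot \alpha_j)]$, which is an assumption about the normalization in (42.55).

```python
from fractions import Fraction
from itertools import combinations

def dot(u, v):
    """Euclidean scalar product, kept exact with Fractions."""
    return sum(Fraction(a) * b for a, b in zip(u, v))

def unit(dim, i, value=1):
    """Vector with `value` in slot i and zeroes elsewhere."""
    return [value if k == i else 0 for k in range(dim)]

def combine(u, v, sign):
    return [a + sign * b for a, b in zip(u, v)]

def inverse(mat):
    """Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(mat)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def basis_vectors(group, n):
    """The n basis raising vectors of (42.52); SpinEven needs n >= 2."""
    dim = n + 1 if group == "SU" else n
    basis = [combine(unit(dim, i), unit(dim, i + 1), -1) for i in range(dim - 1)]
    if group == "SpinOdd":
        basis.append(unit(dim, dim - 1))
    elif group == "Sp":
        basis.append(unit(dim, dim - 1, 2))
    elif group == "SpinEven":
        basis.append(combine(unit(dim, dim - 2), unit(dim, dim - 1), +1))
    return basis

def raising_vectors(group, n):
    """All n_raise raising vectors of (42.60)."""
    dim = n + 1 if group == "SU" else n
    pairs = list(combinations(range(dim), 2))
    vectors = [combine(unit(dim, i), unit(dim, j), -1) for i, j in pairs]
    if group != "SU":
        vectors += [combine(unit(dim, i), unit(dim, j), +1) for i, j in pairs]
    if group == "SpinOdd":
        vectors += [unit(dim, i) for i in range(dim)]
    if group == "Sp":
        vectors += [unit(dim, i, 2) for i in range(dim)]
    return vectors

def charges(group, n, lam):
    """Charge vector m_k of Dynkin coordinates lam, following (42.57)."""
    basis = basis_vectors(group, n)
    upper = [[4 * dot(a, b) / (dot(a, a) * dot(b, b)) for b in basis]
             for a in basis]
    g = inverse(upper)
    return [sum(lam[i] * g[i][j] * 2 * basis[j][k] / dot(basis[j], basis[j])
                for i in range(n) for j in range(n))
            for k in range(len(basis[0]))]

def weyl_dimension(group, n, lam):
    """Product over raising vectors of (lam + 1).alpha / 1.alpha."""
    shifted = charges(group, n, [x + 1 for x in lam])
    ones = charges(group, n, [1] * n)
    result = Fraction(1)
    for alpha in raising_vectors(group, n):
        result *= dot(shifted, alpha) / dot(ones, alpha)
    return int(result)

print(weyl_dimension("SU", 2, [1, 0]), weyl_dimension("SpinEven", 5, [0, 0, 0, 0, 1]))
```

The two printed cases are the defining representation of SU(3) and a spinor of Spin(10). Exact `Fraction` arithmetic is used so that the product comes out as an integer rather than a rounded float.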

Two powerful features of Cartan-Weyl-Dynkin theory are that (1) a Lie group can be visualized in terms of 
its diagram, Figure 42.1 for example, and (2) a Lie group is characterized by its Dynkin metric. Two groups 
are isomorphic if and only if they have the same Dynkin metric (after a possible permutation of Dynkin 
coordinates). An example is the isomorphism between SU(4) and Spin(6), Exercise 42.3. 

The Dynkin metric $g_{ij}$ of a Lie group that is a direct product of Lie groups with metrics $g_{ij}(1)$ and $g_{ij}(2)$
is the block diagonal metric

$$ g_{ij} = \begin{pmatrix} g_{ij}(1) & 0 \\ 0 & g_{ij}(2) \end{pmatrix} . \tag{42.64} $$

If the Dynkin metric of a Lie group is block diagonal, then the group is a direct product. An example is the
isomorphism between Spin(4) and SU(2) x SU(2), Exercise 42.2.
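The Dynkin metric and the Casimir invariant (42.58) are easy to evaluate for small groups. The following sketch builds the metric from the basis raising vectors of (42.52), again assuming that its inverse is $4(\alpha_i \cdot \alpha_j)/[(\alpha_i \cdot \alpha_i)(\alpha_j \cdot \alpha_j)]$; that normalization, and the names `dynkin_metric`, `casimir`, and `permuted`, are assumptions of this illustration rather than definitions quoted from the text.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean scalar product on the charge lattice, kept exact."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def inverse(mat):
    """Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(mat)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dynkin_metric(basis):
    """Lower-index metric g_ij, the matrix inverse of the assumed g^ij."""
    upper = [[4 * dot(a, b) / (dot(a, a) * dot(b, b)) for b in basis]
             for a in basis]
    return inverse(upper)

def casimir(basis, lam):
    """C_2 = sum_ij lam^i g_ij (lam^j + 2), as in (42.58)."""
    g = dynkin_metric(basis)
    n = len(lam)
    return sum(lam[i] * g[i][j] * (lam[j] + 2)
               for i in range(n) for j in range(n))

def permuted(g, perm):
    """Metric with rows and columns reordered by the permutation perm."""
    size = len(g)
    return [[g[perm[i]][perm[j]] for j in range(size)] for i in range(size)]

# Basis raising vectors read off from (42.52).
SU2 = [(1, -1)]
SPIN3 = [(1,)]
SP2 = [(2,)]
SPIN4 = [(1, -1), (1, 1)]
SU4 = [(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, -1)]
SPIN6 = [(1, -1, 0), (0, 1, -1), (0, 1, 1)]

print(dynkin_metric(SPIN4))                           # block diagonal
print([casimir(b, [1]) for b in (SU2, SPIN3, SP2)])   # spinor, lambda = 1
print(permuted(dynkin_metric(SU4), [1, 0, 2]) == dynkin_metric(SPIN6))
```

The Spin(4) metric comes out diagonal, the spinor Casimir invariants of SU(2), Spin(3), and Sp(2) come out in the ratio 2 : 1 : 4, and the SU(4) metric matches the Spin(6) metric after swapping the first two coordinates.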
Whether a Lie group is a direct product of groups can be determined by inspection from its
diagram. If the Cartan-Weyl-Dynkin raising operators of the group split into two sets that are orthogonal to
each other (which is the same as the condition that the Dynkin metric is block diagonal), then the group is a
direct product. For example, the diagram of Spin(4) consists of the red line and the one black line orthogonal
to it in Figure 42.1. Therefore Spin(4) is isomorphic to the product of two groups each of whose diagrams
consist of a single line, namely SU(2).

Table 42.2 lists all spinor and multivector representations of SU(N) and Spin(N), along with the Dynkin
coordinates $\lambda^i$ of the highest vector (which defines the representation), the dimension, quadratic Casimir
invariant $C_2$, and Dynkin index $S_2$ of the representation. Other representations, not listed, are representations
of irreducible components of tensor products of spinors and/or multivectors. Beware that the Casimir
invariant and Dynkin index are proportional to charge squared, and their numerical values depend on the units
of charge adopted, which may vary between authors. The Casimir invariant and Dynkin index in Table 42.2
are in charge units such that the separation of adjacent charges on the charge lattice is unity, Figure 42.1.

Spin(2n+1) has one spinor representation, while Spin(2n) has two, which are conjugates of each other.
Spin(N) has a representation for multivectors of grade p, and the same representation holds for their pseudo
partners, multivectors of grade N−p; except that for Spin(2n) and grade p = n there are two representations,
each containing half of the grade-n multivectors, one representation being the pseudoscalar times the other.
The adjoint representation is the bivector representation, grade p = 2.

SU(N) is the subgroup of Spin(2N) (or of Spin(2N+1)) that preserves the total charge (total spin weight, 
or number of up bits, in the language of the super geometric algebra, §38.2). The spinor representations of 
SU(N) are characterized by their spinor grade, the number of up bits of the spinor. For example, SU(5) has 
spinor representations of spinor grades 0 to 5, as listed in the Spin(10) chart (42.12), with dimensions 1, 
5, 10, 10, 5, 1. SU(N) has a multivector representation for each even grade 2p, consisting of the subset of 
Spin(2N) multivectors of grade 2p that have zero charge (zero spin weight). The adjoint representation is 
the bivector representation, grade 2p = 2. 

Not included in Table 42.2 is the simplest of all Lie groups, the group U(1) of dimension 1. Whereas 
generators of groups of dimension 2 or more are normalized naturally by setting the separation between 
adjacent charges on the charge lattice to 1 (Figure 42.1), the group U(1), having only 1 charge, has no such 
natural normalization. Yet the charges of the $U_Y(1)$ hypercharge and $U_{\rm em}(1)$ electromagnetic groups do
come in discrete increments, leading to the commonly adopted empirical normalizations of hypercharge and
electric charge listed in Table 42.1 of SM charges. A “natural” normalization of a U(1) charge may emerge
if it is embedded in a larger unifying group such as Spin(10). Regardless of the choice of units of charge, the 
Casimir invariant C2 and Dynkin index S2 of U(1) are dimensionful quantities equal to the square of the 
U(1) charge, equations (42.40) and (42.41). 


Exercise 42.1. Representations of Spin(3). The group Spin(3) of rotations in 3 spatial dimensions is the 
simplest irreducible Lie group with non-vanishing commutators. 

1. Use the Cartan-Weyl-Dynkin approach to find all representations of Spin(3). 

2. Given that Spin(3) is isomorphic to SU(2), is there any difference between their representations? 
Solution. 

1. The group Spin(3) has 3 generators, which in an orthonormal basis are $\tfrac12\gamma_{[ab]}$ with indices $a$ and $b$ drawn
from 1, 2, 3. The orthonormal generators are commonly denoted by angular momentum operators $L_a$
with $a = 1, 2, 3$ (or $a = x, y, z$),

$$ i L_1 = \tfrac12\gamma_{[23]} , \quad i L_2 = \tfrac12\gamma_{[31]} , \quad i L_3 = \tfrac12\gamma_{[12]} . \tag{42.65} $$

The commutators of the orthonormal generators $L_a$ are

$$ [L_a , L_b] = i \varepsilon_{abc} L_c , \tag{42.66} $$

with $\varepsilon_{abc}$ the totally antisymmetric symbol. In a chiral basis, the generators are the diagonal generator
$\tfrac12\gamma_{[1\bar1]}$ and the raising and lowering generators $\tfrac12\gamma_{[13]}$ and $\tfrac12\gamma_{[\bar13]}$. The chiral generators are

$$ L_3 = \tfrac12\gamma_{[1\bar1]} , \quad L_+ \equiv \tfrac{1}{\sqrt2} (L_1 + i L_2) = -\tfrac12\gamma_{[13]} , \quad L_- \equiv \tfrac{1}{\sqrt2} (L_1 - i L_2) = \tfrac12\gamma_{[\bar13]} . \tag{42.67} $$

The commutation rules of the chiral generators are

$$ [L_+ , L_-] = L_3 , \quad [L_3 , L_\pm] = \pm L_\pm , \tag{42.68} $$

in agreement with equations (35.137). Spin(3) has a single conserved charge, the eigenvalue $m$ of $L_3$,
the component of angular momentum about the 3-axis. A representation of Spin(3) is labelled by the
single Dynkin integer coordinate of its highest vector, $\lambda^1 = 2\ell$. The Casimir invariant (42.58) is

$$ C_2(2\ell) = \ell(\ell+1) , \tag{42.69} $$

the total angular momentum squared of the representation. The dimension of a representation of Dynkin
coordinate $2\ell$ is

$$ \dim(2\ell) = 2\ell + 1 . \tag{42.70} $$

The Dynkin index (42.39) of a representation is

$$ S_2(2\ell) = \tfrac13 \ell (\ell+1)(2\ell+1) . \tag{42.71} $$

The smallest non-trivial representation is the spinor representation $\ell = \tfrac12$, which has dimension $2\ell+1 = 2$.
The representation of the angular momentum operators in that case is $\tfrac12$ the Pauli matrices,

$$ (L_a)_{ij} = \tfrac12 (\sigma_a)_{ij} . \tag{42.72} $$


2. Spin(3) is isomorphic not only to SU(2) but also to Sp(2). The isomorphism is evident from the fact that
the diagrams for all three groups are the same, a line joining two points. So yes, their representations
are the same. However, it is necessary to worry about units. The Dynkin metric is a 1 x 1 matrix, but
with different normalizations,

$$ g_{ij} = (1) \times \left\{ \begin{array}{ll} \tfrac12 & \mathrm{SU}(2) \\ \tfrac14 & \mathrm{Spin}(3) \\ 1 & \mathrm{Sp}(2) \end{array} \right. . \tag{42.73} $$

The quadratic Casimir invariants (42.58) differ correspondingly,

$$ C_2(2\ell) = \ell(\ell+1) \times \left\{ \begin{array}{ll} 2 & \mathrm{SU}(2) \\ 1 & \mathrm{Spin}(3) \\ 4 & \mathrm{Sp}(2) \end{array} \right. . \tag{42.74} $$

The reason for the difference is that the Dynkin metric and the quadratic Casimir invariant are
proportional to the square of the separation between adjacent charges, which, as can be seen in Figure 42.1,
is 2 (black line) for SU(2), 1 (blue line) for Spin(3), and 4 (purple+blue line) for Sp(2). The separation
of charges is a matter of units. In rotations in 3 dimensions, there is a single conserved charge, the
angular momentum $L_3$ about the 3-axis. It is natural to adopt units such that a change by one unit in
the charge lattice (the vertical blue line in Figure 42.1) corresponds to one unit of angular momentum;
indeed this is so if angular momentum is measured in natural units $\hbar$. In weak interactions on the other
hand, there are two weak charges, d and u, and it is natural to choose the charge separation such that
d and u change by 1 in a weak interaction (the 45° black line in Figure 42.1). In those units, the correct
normalization of the Casimir invariant (42.74) for weak interactions is the SU(2) normalization.
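The spinor results of this solution can be verified directly with 2 x 2 matrices. The sketch below takes $L_a$ to be half the Pauli matrices, forms $L_\pm = (L_1 \pm i L_2)/\sqrt2$, and checks the commutators (42.68), the Casimir invariant $\ell(\ell+1) = \tfrac34$, and the Dynkin index $\tfrac13\ell(\ell+1)(2\ell+1) = \tfrac12$. It assumes that the Dynkin index is defined through the trace $\mathrm{Tr}(L_a L_b) = S_2 \, \delta_{ab}$, which is the form that the derivation of (42.41) suggests for equation (42.39); the helper names are hypothetical.

```python
import math

def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y, coeff=1):
    """Return x + coeff * y entry by entry."""
    return [[a + coeff * b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def scale(c, x):
    return [[c * v for v in row] for row in x]

def commutator(x, y):
    return add(matmul(x, y), matmul(y, x), -1)

def close(x, y, tol=1e-12):
    return all(abs(a - b) < tol for rx, ry in zip(x, y) for a, b in zip(rx, ry))

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

SIGMA = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}
L = {a: scale(0.5, s) for a, s in SIGMA.items()}      # L_a = sigma_a / 2
L_plus = scale(1 / math.sqrt(2), add(L[1], L[2], 1j))
L_minus = scale(1 / math.sqrt(2), add(L[1], L[2], -1j))

identity = [[1, 0], [0, 1]]
casimir = add(add(matmul(L[1], L[1]), matmul(L[2], L[2])), matmul(L[3], L[3]))

checks = {
    "[L+, L-] = L3": close(commutator(L_plus, L_minus), L[3]),
    "[L3, L+] = +L+": close(commutator(L[3], L_plus), L_plus),
    "[L3, L-] = -L-": close(commutator(L[3], L_minus), scale(-1, L_minus)),
    "C2 = 3/4": close(casimir, scale(0.75, identity)),
    "S2 = 1/2": all(abs(trace(matmul(L[a], L[b])) - (0.5 if a == b else 0)) < 1e-12
                    for a in L for b in L),
}
print(checks)
```

These are the Spin(3) normalizations of (42.73) and (42.74); the SU(2) and Sp(2) values follow by rescaling the unit of charge.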


Exercise 42.2. Prove that the group Spin(4) is isomorphic to SU(2) x SU(2). 
Solution. The Dynkin metric of SU(2) is the 1 x 1 matrix

$$ g_{ij} = \tfrac12 \, (1) . \tag{42.75} $$

The Dynkin metric of Spin(4) is the 2 x 2 matrix

$$ g_{ij} = \tfrac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \tag{42.76} $$


which is the block diagonal composition of two SU(2) Dynkin metrics. Therefore Spin(4) is isomorphic to 
SU(2) x SU(2). 


Exercise 42.3. Prove that the group SU(4) is isomorphic to Spin(6). 
Solution. The Dynkin metrics of SU(4) and Spin(6) are

$$ g_{ij} = \left\{ \begin{array}{ll} \dfrac14 \begin{pmatrix} 3 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 3 \end{pmatrix} & \mathrm{SU}(4) \\[3ex] \dfrac14 \begin{pmatrix} 4 & 2 & 2 \\ 2 & 3 & 1 \\ 2 & 1 & 3 \end{pmatrix} & \mathrm{Spin}(6) \end{array} \right. , \tag{42.77} $$

which are related by permuting the first two rows and columns, $1 \leftrightarrow 2$. Therefore SU(4) and Spin(6) are
isomorphic.


42.3 The nature of mass


What is mass? Mass remains one of the most mysterious ingredients of the Standard Model (Quigg, 2007). 
In the conventional picture, the chiral (right- or left-handed) fundamental fermions of the SM are taken to be 
natively massless, since chirality is a property only of massless spinors. A massive spinor is a superposition 
of two chiral spinors of opposite chirality, a linear combination of right- and left-handed chiral spinors. A 
massive spinor at rest is an equal superposition of right- and left-handed spinors. For example, in the chiral 
representation (39.12), an electron at rest is ey, = (er — ieL)/V2, while a positron (an antielectron) at rest 
is ey = (— ier + ez) /V2. 

The Spin(10) chart (42.12) of fermions shows that right- and left-handed versions of each species of fermion
(for example, $e_R$ and $e_L$) differ by the d-bit. The SM postulates that fermions flip their d-bit as a result of
interaction with the Higgs field, §42.4.10, giving the fermions their fundamental masses. Spinors that come 
in right- and left-handed versions are called Dirac spinors, and the mass that arises from flipping between 
the massless right- and left-handed components is called Dirac mass. A Dirac mass that results from flipping 
the d-bit is possible only after electroweak symmetry breaking, where d charge is not conserved. 

Table 42.3 shows the measured rest masses of the fundamental fermions, with leptons in the top two rows, 
quarks in the bottom two. The fundamental fermions come in 3 generations, electron, muon, and tauon (or 
1, 2, and 3), each generation repeating the same pattern of charges, Table 42.1, but with different masses. 
The masses follow no clear pattern, except that higher generations are more massive, and neutrino masses 
are substantially smaller than other fermion masses, as illustrated in Figure 42.2. Neutrino masses, and 
their assignment to generation, remain as yet uncertain; neutrino oscillations, §42.3.2, yield mass squared 
differences, and cosmological constraints yield only an upper limit $\sum m_\nu < 0.12$ eV on the sum of the three
neutrino masses, equation (10.110). 

Most of the mass of objects in the familiar world comes not from the masses of fundamental fermions, 
but from protons and neutrons, which are bound states of quarks. Protons and neutrons, along with other 
strongly interacting particles containing an odd number of quarks, are collectively called baryons. Baryons 
themselves combine into nuclei, and thence with electrons into atoms and molecules. A proton is a colourless 
combination uud of two up quarks and one down quark, while a neutron is a colourless combination udd of
one up quark and two down quarks. Colourless means that the combination is a symmetric superposition of 


Table 42.3: Masses of fundamental fermions (NIST, 2014; Tanabashi et al., 2018) 


Generation 1                              Generation 2                              Generation 3
e-neutrino   ν_e   ?                      μ-neutrino   ν_μ   0.01 eV?               τ-neutrino   ν_τ   0.05 eV?
electron     e     0.510998946(3) MeV     muon         μ     105.658375(3) MeV      tauon        τ     1.77682(16) GeV
up           u     2.2(5) MeV             charm        c     1,275(30) MeV          top          t     173.0(4) GeV
down         d     4.7(4) MeV             strange      s     95(6) MeV              bottom       b     4.18(4) GeV

Figure 42.2 Masses of fundamental fermions, Table 42.3. Neutrino masses, and their assignment to generation, remain
uncertain.


equal contributions of r, g, and b colours. Numerical calculation of quantum chromodynamics on a lattice 
(lattice QCD) reveals that protons and neutrons should be thought of not as three quarks somehow stuck 
together, but rather as a seething maelstrom of strongly interacting relativistic quarks and gluons bound 
together by the colour force (Yang et al., 2018). The rest masses of the three “valence” uud or udd quarks 
contribute only about 1% of the ~ 1 GeV mass of a proton or neutron. 
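The 1% figure follows from simple arithmetic on Table 42.3. In the sketch below the quark masses are the central values from the table, while the proton and neutron masses (938.272 MeV and 939.565 MeV) are standard values supplied here as an assumption, since the text quotes only "~ 1 GeV".

```python
# Valence-quark rest masses from Table 42.3, in MeV (central values).
QUARK_MASS = {"u": 2.2, "d": 4.7}

# Nucleon content and mass in MeV; the masses are not quoted in the text.
NUCLEON = {"proton": ("uud", 938.272), "neutron": ("udd", 939.565)}

def valence_fraction(name):
    """Fraction of the nucleon mass carried by valence-quark rest masses."""
    quarks, mass = NUCLEON[name]
    return sum(QUARK_MASS[q] for q in quarks) / mass

for name in NUCLEON:
    print(name, round(100 * valence_fraction(name), 2), "percent")
```

Both fractions come out close to one percent, with the neutron slightly higher because it contains two of the heavier down quarks.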


42.3.1 Neutrino mass and the see-saw mechanism 


Neutrinos cannot acquire their mass in the same way as the other fundamental fermions, since only
left-handed neutrinos (and right-handed antineutrinos) are observed. There is no experimental evidence for
a right-handed neutrino. Evidence from particle accelerator experiments indicates that there are only 3
neutrino types with masses less than half the mass of the Z neutral weak gauge boson, $\tfrac12 m_Z \approx 45$ GeV
(ALEPH Collaboration et al., 2006),

$$ N_\nu = 2.984 \pm 0.008 . \tag{42.78} $$


Evidence from the CMB indicates that there are only 3 neutrino types with masses less than about the 
electron mass (the observations set limits on the number of neutrino types post electron-positron annihilation) 
(Aghanim et al., 2018), 


$$ N_{\rm eff} = 3.0 \pm 0.5 . \tag{42.79} $$


Yet neutrinos are observed to have (small) masses. How can neutrinos have mass if they are purely chiral?
A leading idea is the see-saw mechanism proposed by Gell-Mann, Ramond, and Slansky (1979). They
argued that the right-handed neutrino, alone among all the fundamental fermions, could be a superposition
of itself $\nu_R$ and its charge conjugate $\bar\nu_L$. The mass acquired by flipping between a massless particle and
its charge conjugate is called a Majorana mass (Majorana, 1937). The right-handed neutrino can have a
Majorana mass because it has no SM charge (no hypercharge Y, no isospin I, and no colour, Table 42.1).
The right-handed neutrino $\nu_R$ is the all-bit-up spinor ↑↑↑↑↑ (and its charge conjugate $\bar\nu_L$ is the all-bit-down
spinor ↓↓↓↓↓), which has zero SM charge because the SM excludes the generator $\tfrac12 \sum_k \gamma_k \wedge \gamma_{\bar k}$ that would
give $\nu_R$ a charge, equation (42.27). The right-handed neutrino is the only fundamental fermion with zero
SM charge. The right-handed neutrino could escape observation provided that it has a sufficiently large
Majorana mass, greater than the electroweak scale ~ 1 TeV.

Although a right-handed neutrino has no SM charge, it does have lepton number L. A Majorana mass 
that flips between the right-handed neutrino and its charge conjugate, the left-handed antineutrino, violates 
conservation of lepton number. It also violates conservation of the difference B — L of baryon and lepton 
number. SM transformations conserve both baryon B and lepton L number, and Spin(10) conserves B — L, 
though not B and L individually, equation (42.17). Does Nature allow a lepton-non-conserving Majorana 
mass? That is a secret that at present only Nature knows. But if it does, then out-of-equilibrium decay 
of three generations of right-handed neutrino in the early Universe could lead to an excess of leptons over 
antileptons, a process called leptogenesis (Fukugita, 1986; Buchmüller, Peccei, and Yanagida, 2005; Davidson,
Nardi, and Nir, 2008; Blanchet and Di Bari, 2012; Fong, Nardi, and Riotto, 2012; Drewes, 2013; Cline, 2018).
Leptogenesis can subsequently promote baryogenesis at the electroweak phase transition. 

Gell-Mann, Ramond, and Slansky (1979) proposed that neutrinos, alone among the fundamental fermions,
acquire both kinds of masses, a Majorana mass $M$ that flips the right-handed neutrino and its charge
conjugate into each other, $\nu_R \leftrightarrow \bar\nu_L$, or equivalently $\nu_{R\uparrow} \leftrightarrow \bar\nu_{L\uparrow}$ and $\nu_{R\downarrow} \leftrightarrow \bar\nu_{L\downarrow}$, and a Dirac mass $m$ that
flips right- and left-handed neutrinos into each other, $\nu_R \leftrightarrow \nu_L$, or equivalently $\nu_{R\uparrow} \leftrightarrow \nu_{L\uparrow}$ and $\nu_{R\downarrow} \leftrightarrow \nu_{L\downarrow}$.
The result is that neutrino spinors are coupled to each other by a Hermitian mass matrix $\mathsf{M}$ that, in the
chiral representation (39.13), is, for spin up ↑ spinors,


0 -m 0 =M V+ 
ae E m o0 0 0 v 
vi Mv = i( Very Vip YVyt Vut ) o 0 0 : a ; (42.80) 
M 0 -m 0 Vor 


The same mass matrix $\mathsf{M}$ holds for spinors of the same chirality but spin down ↓ instead of spin up ↑.
The signs and normalization of equation (42.80) stem from the fact that the Dirac mass term is
$m \bar\nu \cdot \nu = -i m \nu^\dagger \gamma_0 \nu$, equation (39.99). The mass matrix $\mathsf{M}$ has 4 eigenvalues $\pm m_+$ and $\pm m_-$ with

$$ m_\pm = \pm \frac{M}{2} + \sqrt{ \left( \frac{M}{2} \right)^2 + m^2 } , \tag{42.81} $$

satisfying $m_+ m_- = m^2$, or equivalently

$$ \frac{m_-}{m} = \frac{m}{m_+} . \tag{42.82} $$


The condition (42.82) is called the see-saw condition. The mass eigenstates $\nu_\pm$ and their antiparticles $\bar\nu_\pm$ are
related to the chiral eigenstates by a unitary matrix,


M4 V4 1 ia a —i Vy+ 
m P 7 1 —ia 1 i a Vys (42 83) 
—m_ D E 2(1 + a?) —a i 1 ia Viv , i 
-m4 Dy —i —a —ia 1 Vor 
where

$$ a \equiv \frac{m_-}{m} = \frac{m}{m_+} . \tag{42.84} $$


If the Majorana mass $M$ is much larger than the Dirac mass $m$, then the large mass $m_+$ approximates the
Majorana mass and is much larger than the Dirac mass, $m_+ \approx M \gg m$, while the small mass $m_-$ is much less
than the Dirac mass, $m_- \approx m^2/M \ll m$. For example, if the muon neutrino has mass $m_- = m_{\nu_\mu} \approx 10^{-2}$ eV
and the Dirac mass of the muon neutrino approximates the mass of the muon, $m \approx m_\mu \approx 100$ MeV, then the
Majorana mass of the right-handed muon neutrino is $m_+ \approx 10^9$ GeV, well above the electroweak symmetry
breaking scale, and large enough to make the right-handed neutrino inaccessible to current experiment.
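The see-saw relations can be checked with a few lines of arithmetic. The sketch below evaluates (42.81), confirms that $\pm m_+$ and $\pm m_-$ are roots of the characteristic polynomial of a 4 x 4 Hermitian matrix with Dirac entries $m$ and Majorana entries $M$, and reproduces the order-of-magnitude example in the text. The ordering of states and the signs used in `mass_matrix` are an assumption of this illustration (the eigenvalues do not depend on them), and the function names are hypothetical.

```python
import math

def seesaw_masses(m, M):
    """Return (m_plus, m_minus) of (42.81). The light mass is computed from
    m_plus * m_minus = m^2 to avoid cancellation when M >> m."""
    m_plus = 0.5 * M + math.sqrt((0.5 * M) ** 2 + m ** 2)
    return m_plus, m * m / m_plus

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for j, entry in enumerate(a[0]):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def mass_matrix(m, M):
    """Hermitian matrix i*A, with A real and antisymmetric. Dirac mass m links
    states 1-2 and 3-4, Majorana mass M links states 1-4 (assumed layout)."""
    a = [[0, -m, 0, -M],
         [m, 0, 0, 0],
         [0, 0, 0, m],
         [M, 0, -m, 0]]
    return [[1j * x for x in row] for row in a]

def is_eigenvalue(matrix, mu, tol=1e-9):
    """True if det(matrix - mu * identity) vanishes."""
    shifted = [[x - (mu if i == j else 0) for j, x in enumerate(row)]
               for i, row in enumerate(matrix)]
    return abs(det(shifted)) < tol

# Small test case: m = 3, M = 8 (arbitrary units).
m_plus, m_minus = seesaw_masses(3.0, 8.0)
eigen_ok = all(is_eigenvalue(mass_matrix(3.0, 8.0), mu)
               for mu in (m_plus, -m_plus, m_minus, -m_minus))

# Text example in eV: Dirac mass ~ 1e8 eV, light neutrino mass ~ 1e-2 eV.
heavy_in_GeV = (1e8) ** 2 / 1e-2 / 1e9
print(m_plus, m_minus, eigen_ok, heavy_in_GeV)
```

The last number is the heavy mass $m_+ = m^2/m_-$ expressed in GeV, which is where the quoted scale of about $10^9$ GeV comes from.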

If the Majorana mass $M$ is zero, which is true for fundamental fermions other than the neutrino, then
the two masses $m_\pm$ degenerate to the same Dirac mass, $m_\pm = m$. The two degenerate mass eigenstates


correspond to spin up and down versions of the same spinor, and the negative mass eigenstates are their 
antiparticles; for example the electron ey+ and epy, and its antiparticle the positron ey; and ey. 


42.3.2 Neutrino oscillations 


A remarkable property of fundamental fermions is that weak eigenstates are misaligned with mass eigenstates. 
This is true for neutrinos and quarks, and it could well be true also for the charged leptons (electrons, muons, 
tauons). The weak eigenstates are often called flavours, to distinguish them from mass eigenstates. 

The misalignment of weak and mass eigenstates is evidenced most spectacularly by oscillations between
the three generations of neutrino (Xing, 2020). Weak eigenstates $\nu_w$, $w = e, \mu, \tau$, of neutrinos are linear
combinations of mass eigenstates $\nu_i$, $i = 1, 2, 3$,

$$ \nu_w = \sum_i U_{wi} \, \nu_i , \tag{42.85} $$

where $U_{wi}$ is a unitary matrix called the Pontecorvo-Maki-Nakagawa-Sakata (PMNS) matrix (Pontecorvo,
1958; Maki, Nakagawa, and Sakata, 1962; Gribov and Pontecorvo, 1969). When a neutrino is created, for
example by the decay of a pion $\pi^+ \to \bar\mu + \nu_\mu$, it is created as a result of a weak interaction in a definite
weak eigenstate, in this example a muon neutrino $\nu_\mu$. But that weak eigenstate is a superposition of 3 mass
eigenstates, which propagate with slightly different frequencies and wavevectors. When the neutrino is then
detected some distance from its creation, it has oscillated into a superposition of weak eigenstates, and may
be detected as a different weak eigenstate from the one in which it was created.

Neutrino oscillations result from interference between mass eigenstates. The condition for detectable
interference between a pair (or more) of propagating waves is that they differ slightly in frequency $\omega$ and/or
wavevector $\boldsymbol{k}$,

$$ \delta\omega \ll \omega , \quad |\delta\boldsymbol{k}| \ll |\boldsymbol{k}| , \tag{42.86} $$

or equivalently in energy $E$ and/or momentum $\boldsymbol{p}$,

$$ \delta E \ll E , \quad |\delta\boldsymbol{p}| \ll |\boldsymbol{p}| . \tag{42.87} $$

Neutrinos, whether created in the Sun or in a particle accelerator, are typically highly relativistic, and naturally
satisfy the conditions (42.87). By contrast, charged leptons are typically not highly relativistic, and moreover
are constantly interacting electromagnetically with other charged particles in the environment, decohering
them into one or other definite mass eigenstate.

As discussed by Kayser (1981), neutrino oscillations would be destroyed if the energy-momentum of the
neutrino were measured at source sufficiently accurately to determine its mass eigenstate, in much the same
way that the interference pattern from a two-slit experiment is destroyed if the wave/particle is located
with sufficient accuracy to determine through which slit it passed. For example, the mass of the neutrino
in pion decay $\pi^+ \to \bar\mu + \nu_\mu$ could be determined by measuring the energy-momenta of the pion and
antimuon sufficiently accurately. Kayser (1981) concludes that a necessary condition for neutrino oscillations is
a distribution of neutrino energy-momenta broad enough to admit multiple mass eigenstates. An estimate
of the minimum range in energy-momentum comes from assuming that the eigenstates have the same
energy, in which case the difference in their momenta is $\delta p = \delta\sqrt{E^2 - m^2} \approx -\delta(m^2)/2E$ in the relativistic
approximation $E \gg m$. The resulting minimum range of momentum is

$$ \frac{|\delta p|}{p} \gtrsim \frac{|\delta(m^2)|}{2 p^2} . \tag{42.88} $$
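The size of this momentum spread is tiny, which is why ordinary floating point cannot resolve it. The sketch below compares the exact momentum difference at fixed energy with the relativistic approximation $-\delta(m^2)/2E$, using Python's `decimal` module at 50 digits. The energy of 1 MeV and the mass-squared values are illustrative numbers chosen here, not values taken from the text.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50   # enough digits to resolve a part in 1e15

def momentum(energy, mass_squared):
    """On-shell momentum p = sqrt(E^2 - m^2), in the same units as E."""
    return (energy * energy - mass_squared).sqrt()

E = Decimal("1e6")          # 1 MeV expressed in eV (illustrative)
m1_sq = Decimal("0")        # illustrative mass squared values, in eV^2
m2_sq = Decimal("2.5e-3")

exact = momentum(E, m2_sq) - momentum(E, m1_sq)
approx = -(m2_sq - m1_sq) / (2 * E)
relative_error = abs((exact - approx) / approx)
fractional_spread = abs(exact) / momentum(E, m1_sq)
print(exact, approx, relative_error, fractional_spread)
```

With these inputs the required fractional momentum spread is of order $10^{-15}$, so any realistic source is broad enough to admit several mass eigenstates.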

A basic tenet of quantum field theory (qft) is that interactions occur at points of spacetime, and that
fields propagate as waves between those points. The interaction at points means that a neutrino can be
considered to be created at the origin at time zero, and then detected at position $\{t, \boldsymbol{x}\}$. These creation and
detection points are not necessarily known nor unique; qft demands integrating over whatever is not known or
specified. What is meant by interactions happening at spacetime points is that an entire neutrino, including
all its mass components, is created at a spacetime point, and then an entire neutrino, including all its mass
components, is detected at another spacetime point. In classical mechanics, a particle moving on a straight
line from the origin to $\{t, \boldsymbol{x}\}$ has energy-momentum vector proportional to the spacetime distance along
the line, $\{E, \boldsymbol{p}\} \propto \{t, \boldsymbol{x}\}$ (with constant of proportionality mass over proper time, $m/\tau$). In qft by contrast,
waves of all energy-momenta are permitted between interaction points. The classical energy-momentum is
merely the most probable of a range of possibilities.

A plane wave of a mass eigenstate $i$ with energy-momentum $\{E_i, \boldsymbol{p}_i\}$ that propagates over spacetime distance
$\{t, \boldsymbol{x}\}$ changes by a quantum mechanical phase factor $e^{i\phi_i}$ with phase $\phi_i = -E_i t + \boldsymbol{p}_i \cdot \boldsymbol{x}$. The phase difference
$\phi_{21} \equiv \phi_2 - \phi_1$ between two mass eigenstates 1 and 2 is


$$\phi_{21} = -E_{21} t + \boldsymbol{p}_{21} \cdot \boldsymbol{x} , \tag{42.89}$$


with $E_{21} \equiv E_2 - E_1$ and $\boldsymbol{p}_{21} \equiv \boldsymbol{p}_2 - \boldsymbol{p}_1$. Akhmedov and Smirnov (2009) give a careful exposition of the
evaluation of the phase difference $\phi_{21}$. The first point is that the neutrino travels many wavelengths from
creation to detection. For example, a 1 MeV relativistic neutrino has a wavelength of $hc/\mathrm{MeV} \approx 10^{-12}$ m,
which is tiny compared to the distances of kilometers and more over which neutrino oscillations are measured.
Consequently each neutrino mass eigenstate is well approximated as a plane wave with momentum $\boldsymbol{p}_i$ aligned
with the direction $\boldsymbol{x}$, that is, the transverse components of momentum can be neglected. Moreover, by the
time it has travelled many wavelengths, each mass eigenstate is to an excellent approximation on-shell,
meaning that the energy and momentum of a mass eigenstate of mass $m_i$ are related by $E_i = \sqrt{p_i^2 + m_i^2}$.
The second point is that each mass eigenstate $i$ should be described by a wavepacket with a small but
finite range of momentum $p_i$ about some central momentum $\bar{p}_i$. The group velocity of the wavepacket is
$v_i = \partial E_i / \partial p_i |_{p_i = \bar{p}_i}$. The third point is that the velocity $v = x/t$ between creation and detection equals
the mean group velocity $\bar{v} \equiv \tfrac{1}{2} (v_1 + v_2)$ with an uncertainty of order the difference $v_{21} \equiv v_2 - v_1$ of group
velocities, $v = \bar{v} + O(v_{21})$. Under these conditions, the phase difference between the mass eigenstates is


21 = — . (42.90) 


—m2 2 2 
ġa = A7 ae (3 + 2a) +0 (>) , (42.91) 
2p 2p p p 


where $a$, a number of order unity, measures the departure of the spacetime velocity from the group velocity
($v = \bar{v} + a v_{21}$). Equation (42.90) implies that the wavelength of a neutrino oscillation is


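$$\lambda = \frac{4\pi p}{m_{21}^2} = 2.48\,\mathrm{km} \left( \frac{p}{\mathrm{GeV}} \right) \left( \frac{\mathrm{eV}^2}{m_{21}^2} \right) . \tag{42.92}$$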
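The numerical coefficient of the oscillation wavelength (42.92) can be checked by restoring the factor of ħc. The following is a minimal sketch, assuming the wavelength is 4πp/m₂₁² in natural units with the momentum quoted in GeV and the squared-mass difference in eV²; the function names and the constant `HBAR_C_EV_M` are illustrative choices, not notation from the text. The same constant reproduces the wavelength hc/MeV of a 1 MeV neutrino quoted earlier.

```python
import math

# hbar * c in eV * metres (CODATA value, rounded).
HBAR_C_EV_M = 1.973269804e-7

def oscillation_wavelength_km(p_gev: float, dm2_ev2: float) -> float:
    """Oscillation wavelength 4 pi p / m21^2, converted from natural units to km."""
    p_ev = p_gev * 1.0e9
    # 4 pi p / m^2 has units of 1/eV; multiplying by hbar*c converts it to metres.
    return 4.0 * math.pi * p_ev / dm2_ev2 * HBAR_C_EV_M / 1.0e3

def quantum_wavelength_m(p_mev: float) -> float:
    """Wavelength h c / p of a relativistic particle of momentum p (in MeV)."""
    return 2.0 * math.pi * HBAR_C_EV_M / (p_mev * 1.0e6)

if __name__ == "__main__":
    print("oscillation wavelength (km) for p = 1 GeV, m21^2 = 1 eV^2:",
          oscillation_wavelength_km(1.0, 1.0))
    print("wavelength (m) of a 1 MeV neutrino:", quantum_wavelength_m(1.0))
```

The ratio of the two lengths, some fifteen orders of magnitude for these inputs, is what justifies treating each mass eigenstate as a plane wave over the baseline.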


To illustrate how the calculation of neutrino oscillations works out, consider the example of just two 
neutrino eigenstates. The unitary matrix (42.85) is then a $2 \times 2$ matrix. Three arbitrary phases can be
absorbed into a rephasing of the weak and mass eigenstates $\nu_w$ and $\nu_i$, which reduces the matrix without
loss of generality to


$$U_{wi} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} . \tag{42.93}$$


The quantum-mechanical amplitude for a neutrino created in weak eigenstate $\nu_v$ to be detected as the weak
eigenstate $\nu_w$ is


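$$\langle \nu_w | \nu_v \rangle = \sum_i \langle \nu_w | \nu_i \rangle \, e^{i\phi_i} \, \langle \nu_i | \nu_v \rangle = \sum_i U_{wi} \, e^{i\phi_i} \, U^\dagger_{iv} , \tag{42.94}$$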
where $\phi_i$ is the change of phase of mass eigenstate $i$ from creation to detection. The probability that the
initial weak state $\nu_v$ is detected as the other weak state $\nu_w$ is the square of the amplitude (42.94), which
simplifies to


$$P(\nu_v \to \nu_w) = |\langle \nu_w | \nu_v \rangle|^2 = \sin^2 2\theta \, \sin^2 \frac{\phi_{21}}{2} , \tag{42.95}$$
where $\phi_{21}$ is the phase difference of the mass eigenstates, equation (42.90). The probability of no change in
the weak state is 1 minus the probability (42.95) of a change,


$$P(\nu_v \to \nu_v) = P(\nu_w \to \nu_w) = 1 - P(\nu_v \to \nu_w) . \tag{42.96}$$
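The two-eigenstate calculation can be reproduced numerically. The sketch below assumes the real rotation matrix (42.93) and sums the amplitude over the two mass eigenstates as in (42.94); since the matrix is real, the placement of the complex conjugate is immaterial here. The function names and the sample values of the mixing angle and phase difference are illustrative, not taken from the text.

```python
import numpy as np

def mixing_matrix(theta: float) -> np.ndarray:
    """Two-state mixing matrix: rows label weak states, columns label mass states."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s], [-s, c]])

def amplitude(U: np.ndarray, phases: np.ndarray, v: int, w: int) -> complex:
    """Amplitude for weak state v to be detected as weak state w."""
    return np.sum(U[w, :] * np.exp(1j * phases) * np.conj(U[v, :]))

def probability(theta: float, phi21: float, v: int, w: int) -> float:
    """Squared modulus of the amplitude; only the phase difference phi21 matters."""
    U = mixing_matrix(theta)
    phases = np.array([0.0, phi21])
    return float(abs(amplitude(U, phases, v, w)) ** 2)

if __name__ == "__main__":
    theta, phi21 = 0.6, 1.3   # arbitrary sample values
    p_change = probability(theta, phi21, 0, 1)
    p_stay = probability(theta, phi21, 0, 0)
    closed_form = np.sin(2 * theta) ** 2 * np.sin(phi21 / 2) ** 2
    print("numerical :", p_change)
    print("sin^2(2 theta) sin^2(phi21 / 2):", closed_form)
    print("P(change) + P(no change):", p_change + p_stay)
```

The direct sum agrees with the closed form (42.95), and the two probabilities add to one as in (42.96).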


42.4 The Dirac and SM algebras are commuting subalgebras of the Spin(11,1) 
geometric algebra 


Grand unified theories such as SU(5) or Spin(10) unify three of the four known forces of nature. The 
fourth force is gravity, the gauge theory of the Poincaré group, consisting of spacetime rotations (Lorentz 
transformations) and spacetime translations. An essential feature of the Standard Model is that the SM and 
Poincaré groups are distinct: the two groups act on particles (fields) as a direct product of groups at each 
point of 4-dimensional spacetime. Yet the Spin(10) chart (42.12) of fundamental fermions looks like it knows 
at least about Lorentz transformations. Each species of fermion appears in the chart as four components (an 
electron for example appears as $e_R$, $e_L$, $\bar{e}_R$, and $\bar{e}_L$), that are ordinarily distinguished from each other by
their behaviour under Lorentz transformations. 

The intent of this section is to explore how Poincaré transformations might mesh with the Spin(10) GUT 
group, or equivalently how the Lie algebras of the two groups might combine. When the Poincaré group 
is extended to spinors, the resulting Lie algebra is the algebra of Dirac $\gamma$-matrices. Similarly, the algebra
associated with the SM contains more than just the bivector generators of the SM group. There are also
generators associated with the mysterious Higgs field, which the SM invokes to flip the d-bit of a fermion,
thereby flipping fermions of the same species between their right- and left-handed chiral components, for
example $e_R \leftrightarrow e_L$. Such a flip is necessarily generated by an odd multivector in the Spin(10) geometric
algebra. And of course the SM contains spinors, and a scalar product of spinors. If Spin(10) is the GUT 
group, then the associated relevant algebra is not merely the Lie algebra of the Spin(10) group, but the full 
super geometric algebra associated with Spin(10). 

The question then becomes, is the Dirac algebra a subalgebra of the Spin(10) geometric algebra, such that 
the generators of Poincaré and SM transformations commute as required by the SM? An immediate obstacle 
to embedding the Dirac algebra in the Spin(10) algebra is that the Dirac algebra contains a time dimension 
whereas the 10 dimensions of Spin(10) are spacelike. This obstacle may be overcome by adjoining a pair of 
extra dimensions, one of them timelike, to the 10 spacelike dimensions of Spin(10), §42.4.3, enlarging the 
group to the group Spin(11,1) of transformations in 11+1 spacetime dimensions. 

A well-known no-go theorem (Coleman and Mandula, 1967; Mandula, 2015) states that, subject to some 
plausible conditions, any symmetry group of the scattering matrix must be a direct product of the Poincaré 
group and an internal symmetry group. The Coleman-Mandula theorem does not apply here because what 
is being considered is a symmetry of the Lagrangian that is, somehow, broken, and therefore not necessarily 
manifest in scattering experiments. 

Percacci (1991) (see Nesti and Percacci (2008)) has previously proposed that the SM GUT group SO(10) 
and the Lorentz group SO(3, 1) are unified in SO(13, 1). 


42.4.1 Striking and puzzling features of Spin(10) spinors


The Spin(10) chart (42.12) of fundamental fermions exhibits some striking features. The most prominent 
striking feature is that the Spin(10) handedness coincides with the handedness, or chirality (R or L), of the 
spinor under Lorentz transformations. The Spin(10) handedness of a spinor is the sign of the spinor under 
the action of the Spin(10) chiral operator $\varkappa_{10}$, while chirality under Lorentz transformations is the sign of
the spinor under the action of the Dirac chirality operator traditionally denoted $\gamma_5$. Mathematically, the
coincidence (signified $\stackrel{?}{=}$) is


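$$I \equiv i\gamma_5 \equiv \gamma_0\gamma_1\gamma_2\gamma_3 \stackrel{?}{=} I_{10} \equiv i\varkappa_{10} \equiv \gamma_d^+\gamma_d^-\gamma_u^+\gamma_u^-\gamma_r^+\gamma_r^-\gamma_g^+\gamma_g^-\gamma_b^+\gamma_b^- . \tag{42.97}$$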


An essential property of the SM is that the Poincaré and SM groups commute, which implies that their Lie 
algebras are distinct (combine as a commuting product). If the GUT group is Spin(10), and if the full Dirac 
and Spin(10) geometric algebras are assumed to be distinct, then the Dirac and Spin(10) chiral operators
$\gamma_5$ and $\varkappa_{10}$ would be distinct elements of the product algebra. But the fact that the $\gamma_5$ and $\varkappa_{10}$ operators
yield the same result in all cases suggests the alternative hypothesis that $\gamma_5$ and $\varkappa_{10}$ are in fact identical,
and that the spacetime $\gamma_m$ ($m = 0, 1, 2, 3$) and SM $\gamma_i^\pm$ ($i = d, u, r, g, b$) vectors are related, not distinct.

A second provocative feature of the Spin(10) chart (42.12) is that SM transformations are arrayed vertically,
whereas the 4 components of fermions of the same species, such as electrons $e_R$ and $e_L$ and their positron
partners $\bar{e}_R$ and $\bar{e}_L$, are arrayed (mostly) horizontally. SM transformations are vertical because the columns
of the chart are SU(5) multiplets, and SU(5) contains the SM group $U_Y(1) \times SU_L(2) \times SU(3)$. In Dirac theory,
a Dirac spinor such as an electron has 4 complex components that are distinguished by their properties under
Lorentz transformations. The electron, for example, is a complex linear combination of 2 right-handed Weyl
spinors $e_{\Uparrow\uparrow}$ and $e_{\Downarrow\downarrow}$, and 2 left-handed Weyl spinors $e_{\Downarrow\uparrow}$ and $e_{\Uparrow\downarrow}$, that are distinguished by a boost bit
$\Uparrow$ or $\Downarrow$ and a spin bit $\uparrow$ or $\downarrow$. The boost and spin bits prescribe how the spinors transform under Lorentz
transformations. The juxtaposition of vertical SM and horizontal Lorentz transformations in the chart (42.12) 
again signals that somehow Spin(10) incorporates both. 

A third striking feature of the Spin(10) chart (42.12) is that flipping the d-bit preserves the identity of the 
spinor but flips its chirality; for example the electron is flipped $e_R \leftrightarrow e_L$.

The Spin(10) chart (42.12) also presents puzzles. In any supergeometric algebra with even dimensions, the 
spinor metric flips all bits. The number of bits is half the number of dimensions. In Dirac, the number of bits, 
2, is even, so the spinor metric preserves chirality. In Spin(10), the number of bits, 5, is odd, so the spinor 
metric flips chirality. As pointed out in equation (42.97), Spin(10) chirality happens to coincide with Dirac 
chirality. So it would seem that the Spin(10) spinor metric is inconsistent with the Dirac spinor metric. This 
puzzle is resolved in §42.4.3, which argues that, to accommodate a time dimension, two extra dimensions, 
and correspondingly one extra bit, must be adjoined to the algebra. 
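The parity argument behind this puzzle can be checked by brute force. The sketch below assumes only what the paragraph states: a spinor is labelled by a string of up/down bits, chirality is the product of the signs of the bits, and the spinor metric flips every bit. The function names are illustrative. Whichever sign is called right-handed, flipping all n bits multiplies the chirality by (−1)ⁿ.

```python
from itertools import product

def chirality(bits) -> int:
    """Product of the bit signs: +1 for each up bit, -1 for each down bit."""
    sign = 1
    for up in bits:
        sign *= 1 if up else -1
    return sign

def flip_all(bits):
    """Action attributed to the spinor metric: flip every bit."""
    return tuple(not up for up in bits)

def metric_flips_chirality(n_bits: int) -> bool:
    """True if flipping all bits reverses the chirality of every basis spinor."""
    return all(chirality(flip_all(b)) == -chirality(b)
               for b in product((False, True), repeat=n_bits))

if __name__ == "__main__":
    for n in (2, 5, 6):
        print(n, "bits: spinor metric flips chirality?", metric_flips_chirality(n))
```

With 2 bits (Dirac) chirality is preserved, with 5 bits (Spin(10)) it is flipped, and with 6 bits it is preserved again, which is why adjoining one extra bit removes the inconsistency.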


42.4.2 Dirac chart 


It is useful to start by writing out a chart of Dirac spinors in a form analogous to the Spin(10) chart (42.12). 
Dirac spinors live in Spin(3, 1), and they have two bits, a boost bit $\Uparrow$ (up) or $\Downarrow$ (down), and a spin bit $\uparrow$
(up) or $\downarrow$ (down). Electrons for example fill out the following Dirac chart, organized by the number of up
bits of the spinor:


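$$\Downarrow\downarrow :\ \bar{e}^*_R \qquad\qquad \begin{array}{l} \Downarrow\uparrow :\ e_L \\ \Uparrow\downarrow :\ \bar{e}^*_L \end{array} \qquad\qquad \Uparrow\uparrow :\ e_R \tag{42.98}$$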


A Dirac spinor is right- or left-handed as the number of up bits is even or odd. The expressions (39.35) 
and (39.36) for the Dirac spinor metric show that the spinor metric connects spinors of opposite charge, 
opposite spin, and opposite boost. The expressions (39.83) and (39.85) for the Dirac conjugation operator 
show that Dirac conjugation flips charge and spin, but not boost. Consistent with the action of the Dirac 
spinor metric and conjugation operator, the spinors in the chart (42.98) with spin up ($\uparrow$) are labelled electrons
$e$, while those with spin down ($\downarrow$) are labelled positrons $\bar{e}$.

The chart (42.98) raises an immediate issue of interpretation. The expressions (39.14) for the bivector 
generators of Lorentz boosts and spatial rotations show that Lorentz transformations rotate the components 
of like-chiral Dirac spinors into each other, for example $\Uparrow\uparrow \leftrightarrow \Downarrow\downarrow$. But the chart (42.98) would seem to show
that the two components of like-chiral spinors have opposite charge, for example $\Uparrow\uparrow$ is $e_R$ while $\Downarrow\downarrow$ is $\bar{e}_R$,
and therefore cannot be rotated into each other since that would violate conservation of charge. 

The resolution of this apparent contradiction is that chiral Dirac spinors are massless, and cannot unambiguously
be assigned a charge. Only massive Dirac spinors, which are linear combinations of opposite-chiral
spinors, carry definite charge. A massive electron $e_\uparrow$ and a massive positron $\bar{e}_\uparrow$ are linear combinations of
the same pair of opposite-chiral spinors, with opposing phases, from equation (39.23):


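$$e_\uparrow = \frac{e_{\Uparrow\uparrow} - i e_{\Downarrow\uparrow}}{\sqrt{2}} , \quad e_\downarrow = \frac{e_{\Downarrow\downarrow} - i e_{\Uparrow\downarrow}}{\sqrt{2}} , \tag{42.99a}$$

$$i \bar{e}_\uparrow = \frac{e_{\Uparrow\uparrow} + i e_{\Downarrow\uparrow}}{\sqrt{2}} , \quad i \bar{e}_\downarrow = \frac{e_{\Downarrow\downarrow} + i e_{\Uparrow\downarrow}}{\sqrt{2}} . \tag{42.99b}$$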


A massive electron ey requires all 4 chiral spinors for its description, and a massive positron ey requires 
the same set of 4 chiral spinors. Equations (42.99) show that what distinguishes electrons and positrons is 
that they are (modulo overall phases) complex conjugates of each other. Complex conjugation is a discrete 
operation that cannot be accomplished by any continuous Lorentz transformation. 

The essence of electromagnetism is that spinors of opposite charge transform with opposite phase under a
$U_{em}(1)$ electromagnetic gauge transformation (41.25). The charge of an electron or positron is unambiguous,
and the complex conjugate of an electron is unambiguously a positron, per equations (42.99). But a chiral
spinor such as $e_{\Uparrow\uparrow}$ can be obtained as either an electron or a positron in the limit of diverging boost and
vanishing mass, so $e_{\Uparrow\uparrow}$ could have either charge. By itself, a massless, chiral spinor does not contain enough
information to determine the sign of its charge. To disambiguate their charge, the chiral spinors in the Dirac
chart (42.98) are written without or with a conjugation * symbol: unconjugated spinors (no *) have the
charge of an electron, while conjugated spinors (with *) have the charge of a positron. So disambiguated,
the charges of the chiral spinors in the Dirac chart (42.98) can be read off from their bits: spinors with spin
bit up ($\uparrow$) are electrons, while spinors with spin bit down ($\downarrow$) are positrons. If the chart (42.98) is complex
conjugated, then all charges are flipped.

Charge is invariant under Lorentz transformations. Mathematically, the generator of a $U_{em}(1)$ electromagnetic
transformation should commute with generators of Lorentz transformations. But in the chart (42.98),
the apparent $U_{em}(1)$ electromagnetic generator $iQ$ coincides (up to a factor) with the generator $I\sigma_3$ of a
spatial rotation about the 3-axis. The charge operator $Q \propto I\sigma_3$ anticommutes rather than commutes with
the other two generators $I\sigma_1$ and $I\sigma_2$ of spatial rotations. This anticommutation is in fact correct, because
the spatial rotation generators $I\sigma_a$, $a = 1, 2$, flip spin $\uparrow \leftrightarrow \downarrow$, which flips the charge $Q$ assigned by the Dirac
chart (42.98), whereas spatial rotation leaves the actual charge unchanged. Thus the process $I\sigma_a Q$ of measuring
the charge $Q$ then rotating to the spin-flipped spinor should coincide with the process $-Q I\sigma_a$ of rotating
to the spin-flipped spinor then measuring minus the charge $Q$ (that is, the charge of the complex-conjugated
spinor) assigned by the Dirac chart (42.98). The correct commutation of $Q$ with spatial generators $I\sigma_a$ is


$$I\sigma_a Q = -Q I\sigma_a \quad (a = 1, 2) , \qquad I\sigma_3 Q = Q I\sigma_3 . \tag{42.100}$$
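The commutation rules (42.100) can be verified with explicit matrices. The sketch below assumes one particular 4 × 4 representation of the Dirac γ-matrices with signature −+++, built from Kronecker products of Pauli matrices; it is not necessarily the chiral representation of Chapter 39, and the normalization iQ = Iσ₃ is an assumption, the text fixing Q only up to a factor. The relations being tested are algebraic and do not depend on the representation.

```python
import numpy as np

# Pauli matrices and the 2x2 identity.
s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Assumed representation: gamma_0 squares to -1, gamma_1..3 square to +1,
# and all four matrices anticommute.
gamma0 = 1j * np.kron(sx, s0)
gamma = [np.kron(sy, s) for s in (sx, sy, sz)]

I = gamma0 @ gamma[0] @ gamma[1] @ gamma[2]   # Dirac pseudoscalar
sigma = [gamma0 @ g for g in gamma]           # boost generators sigma_a = gamma_0 gamma_a
I_sigma = [I @ s for s in sigma]              # spatial rotation generators I sigma_a
Q = -1j * I_sigma[2]                          # assumed normalization: iQ = I sigma_3

def commutes(a, b) -> bool:
    return np.allclose(a @ b, b @ a)

def anticommutes(a, b) -> bool:
    return np.allclose(a @ b, -(b @ a))

if __name__ == "__main__":
    print("I sigma_1 anticommutes with Q:", anticommutes(I_sigma[0], Q))
    print("I sigma_2 anticommutes with Q:", anticommutes(I_sigma[1], Q))
    print("I sigma_3 commutes with Q:    ", commutes(I_sigma[2], Q))
    print("gamma_0 anticommutes with I:  ", anticommutes(gamma0, I))
```

The last check is the property of the Dirac time axis that is used in §42.4.3 to rule out identifying the time dimension with the 10-dimensional pseudoscalar.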


Another approach to imposing commutation of electromagnetic and Lorentz generators is to unconjugate 
all the conjugated spinors in the Dirac chart (42.98). Equivalently, modify the electromagnetic generator so it 
measures the charge of the unconjugated version of each spinor. In the Dirac chart (42.98), the modification 
is accomplished by replacing $i$ in the electromagnetic generator $iQ$ by the generator $I\sigma_3$ of a rotation about
the 3-axis, 


$$iQ \to I\sigma_3 Q , \tag{42.101}$$


in effect turning the electromagnetic generator into the unit operator, which of course commutes with all 
Lorentz transformations. The modification trivializes the electromagnetic generator for the Dirac chart (42.98) 
but that is because there is only one charge and only one generator. In the Spin(10) case, there are several 
charges and many gauge generators, and the corresponding modification to the gauge generators, equa- 
tions (42.113), is not trivial. 


42.4.3 An eleventh, and twelfth, dimension, and a sixth bit 


An obvious hurdle to uniting the Dirac and SM algebras in a common Spin(10) GUT algebra is that the 
Dirac algebra has a time dimension, but Spin(10) does not. 

To fix the problem, consider adding a single extra dimension, a time dimension, to Spin(10). Super ge- 
ometric algebras live naturally in even dimensions. As discussed in parts 7 and 10 of Exercise 38.3, there 
are two approaches to adding an extra odd, here 11th, dimension to a super geometric algebra. The first is 
to project the 11-dimensional algebra into one lower dimension; the second is to embed the 11-dimensional 
algebra in one higher dimension. The first approach, projecting into one lower dimension, requires identifying 
the 11-dimensional chiral operator with unity, $\varkappa_{11} = 1$, in which case the 10-dimensional pseudoscalar $I_{10}$
behaves like a timelike 11th dimension. This option is excluded because the putative time dimension $\gamma_0 = I_{10}$
commutes with the pseudoscalar $I_{10}$, in contradiction to the Dirac algebra, where the Dirac time dimension
$\gamma_0$ anticommutes with the Dirac pseudoscalar $I = \gamma_0\gamma_1\gamma_2\gamma_3$.

The second approach to adjoining an extra, 11th, dimension, described in part 10 of Exercise 38.3, is to
add not one but two additional dimensions $\gamma_{11}$ and $\gamma_{12}$, and to treat the extra 12th dimension as a scalar. By
scalar is meant that for some reason, perhaps symmetry breaking, the 12th dimension does not participate 
in the rotational symmetries connecting the other 11 dimensions. 

Adding two extra dimensions works. But as the exposition unfolds, it will be seen that the 12th dimension 
participates fully in the algebra. The evidence points toward the 12th dimension being a genuine extra 
dimension, not merely a scalar. Notably, the electroweak Higgs field emerges naturally as a bivector generator 
involving the 12th dimension, equation (42.124). 

Adding two extra dimensions adjoins an additional, 6th, T-bit, or time bit, to the 5 durgb bits of a Spin(10)
spinor. Like the other 5 bits, the T-bit of a Spin(11, 1) spinor takes the values $\pm\tfrac{1}{2}$, equal to the spin weight
of the spinor under rotations in the $\gamma_{11} \wedge \gamma_{12}$ plane, part 2 of Exercise 38.3. In Dirac theory, spinors and
antispinors are complex conjugates of each other, and massive spinors at rest are eigenfunctions of the time
axis. These two conditions require interpreting the 12th dimension, not the 11th dimension, as providing the
time dimension $\gamma_0$. It is convenient to denote the 11th and 12th dimensions using the same notation as the
other SM vectors, equations (42.21),


$$\gamma_{11} \equiv \gamma_T^+ , \quad \gamma_0 = i\gamma_{12} \equiv i\gamma_T^- . \tag{42.102}$$


42.4.4 The spinor metric and the conjugation operator 


Any super geometric algebra contains two operators, the spinor metric $\varepsilon$, and the conjugation operator $C$,
that are invariant under rotations. A consistent translation between Dirac and Spin(11,1) representations
must agree on the behaviour of these two operators.

The Dirac spinor metric $\varepsilon$ and conjugation operator $C$ are respectively antisymmetric and symmetric.
Consistency requires that the Spin(11, 1) spinor metric and conjugation operator be similarly antisymmetric
and symmetric. Consultation of Tables 38.1 and 39.1 shows that in 11+1 dimensions only the standard choice
$\varepsilon$ of spinor metric and associated conjugation operator $C$ possess the desired antisymmetry and symmetry.
If one of the dimensions is a scalar, then in 10+1 dimensions both the standard $\varepsilon$ and alternative $\varepsilon_{\rm alt}$ choices
of spinor metric, and the associated conjugation operators $C$ and $C_{\rm alt}$, possess the desired antisymmetry and
symmetry; the tilde'd spinor metrics and conjugation operators have the wrong symmetry, and are excluded.

The choice that works in both 10+1 and 11+1 dimensions, and that permits seamless translation between 
the Dirac and Spin(11,1) algebras is, as in the standard (3+1)-dimensional Dirac algebra, the standard 
spinor metric 


ates Pa Via ta Pa YE. (42.103) 

Below it will be found that the representation of the spatial rotation generator $J\sigma_2$, equation (42.111),
coincides with the representation of the spinor metric (42.103), which is similar to the coincidence (39.36)
between $I\sigma_2$ and the spinor metric $\varepsilon$ in the chiral representation of the Dirac algebra.

Given the Spin(11, 1) spinor metric (42.103), and with the time axis $\gamma_0 = i\gamma_T^-$, the Spin(11, 1) conjugation
operator is

C = —ieyo = EYF = VUE GWEN - (42.104) 
Again, the choice (42.104) works in both 10+1 and 11+1 spacetime dimensions. Whereas the Spin(11, 1) 
spinor metric (42.103) flips all bits, the conjugation operator (42.104) flips all bits except T, that is, it flips 
durgb. This is the same as the Spin(10) conjugation operator, which flips the five durgb bits. 


42.4.5 Translation from Spin(11,1) to Dirac representation, Part 1 


The Spin(10) chart (42.12) can now be promoted to Spin(11, 1), and translated into the Dirac representation. 

The conventional interpretation of the Spin(10) chart (42.12) is that each spinor is a Weyl spinor with 
2 complex components (4 components altogether). For example, the right-handed electron $e_R$ is the Weyl
spinor with complex components $e_{\Uparrow\uparrow}$ and $e_{\Downarrow\downarrow}$. The conventional interpretation is tantamount to assuming
that the Dirac and Spin(10) algebras are distinct. The present approach explores instead the alternative 
hypothesis that the Dirac and Spin(10) algebras are related non-trivially. 

After electroweak symmetry breaking, flipping the d-bit flips spinors between right- and left-handed Dirac 
chiralities of the same species, for example $e_R \leftrightarrow e_L$. Massive spinors are linear combinations of the two
chiralities. Since massive spinors have definite spin, either $\uparrow$ or $\downarrow$, flipping the d-bit must flip the Dirac boost
bit while preserving the spin bit, for example, $e_{\Uparrow\uparrow} \leftrightarrow e_{\Downarrow\uparrow}$.

In the Dirac representation, conjugation flips spin while preserving the boost bit, equations (39.98). 

These conditions, that flipping the d-bit flips boost $\Uparrow \leftrightarrow \Downarrow$ while conjugation flips spin $\uparrow \leftrightarrow \downarrow$, suffice to
determine the translation between Dirac and Spin(11,1) spinors of the same species (electrons, for example),
but they do not fix the translation across different species. The translation across different species is
determined by the condition that Lorentz transformations commute with SM transformations. In the Dirac
representation, after electroweak symmetry breaking, a boost by rapidity $\theta$ in the $\Uparrow$-$\Downarrow$ boost plane boosts a
spinor by a real number $e^{\pm\theta/2}$, while a spatial rotation by angle $\theta$ in the $\uparrow$-$\downarrow$ spin plane rotates a spinor by
a phase $e^{\pm i\theta/2}$. In the Spin(10) geometric algebra there are two mutually commuting generators that transform
all spinors by a boost or phase and also commute with all SM transformations, namely the electroweak
pseudoscalar $I_{du}$ and the colour pseudoscalar $I_{rgb}$ defined by


Iau = Y a Yu = -adu = Ya NY3 NYu NNa » (42.105a) 
Ingo = Ve Ve VEV VN = —iArgo = —iYr N Yr N Yg NYa N Yo NTa - (42.105b) 
The electroweak pseudoscalar $I_{du}$ changes sign when an odd number of du bits are flipped, while the colour
pseudoscalar $I_{rgb}$ changes sign when an odd number of rgb bits are flipped. The pseudoscalars $I_{du}$ and $I_{rgb}$
can therefore be interpreted as generating respectively a Lorentz boost and a spatial rotation. The product
of the commuting boost and rotation operators $I_{du}$ and $I_{rgb}$ is the Spin(10) pseudoscalar $I_{10}$,


$$I_{10} = I_{du} I_{rgb} = i\varkappa_{10} . \tag{42.106}$$


In Dirac theory, the equivalent product of commuting boost and rotation operators $\gamma_0\gamma_3$ and $\gamma_1\gamma_2$ is the
Dirac pseudoscalar $I = \gamma_0\gamma_1\gamma_2\gamma_3$. So it would seem that the identification of $I_{du}$ and $I_{rgb}$ as boost and
rotation operators recovers the striking coincidence (42.97) between Dirac and Spin(10) pseudoscalars, an
encouraging result.

However, there is a hitch to identifying $I_{du}$ as generating a boost, which is that the time axis $\gamma_0 = i\gamma_T^-$
commutes with $I_{10}$, which is incompatible with the Dirac algebra, where the time axis $\gamma_0$ anticommutes with
the pseudoscalar $I = \gamma_0\gamma_1\gamma_2\gamma_3$. The solution is to multiply the boost operator $I_{du}$ by $i\gamma_T^-\gamma_T^+$, so that the
boost operator becomes $-iI_{duT} \equiv -iI_{du}\gamma_T^+\gamma_T^-$,


— ilaur = iY] Va Va Va Ve Ve = Haut = —Ya NYa NYu NYa NIT AG - (42.107) 


The factor $\gamma_T^-\gamma_T^+$ cannot be adjoined to the rotation operator $I_{rgb}$ because the resulting algebra turns out
not to have the correct commutation rules. Appending the factor $i\gamma_T^-\gamma_T^+$ to the boost operator $I_{du}$ has the
consequence that spinors of opposite T-bit then have opposite boost, which allows spinors before electroweak
symmetry breaking to be linear combinations of T-up and T-down spinors and therefore be massive, §42.4.14,
similarly to the way that after electroweak symmetry breaking massive spinors are linear combinations of
d-up and d-down spinors with opposite boost.
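The algebraic statements of this subsection can be checked with explicit matrices. The sketch below assumes a Jordan-Wigner style representation in which each of the six bits d, u, T, r, g, b is a two-state factor, with the matrices `gp[k]` and `gm[k]` standing in for the vectors γ_k^+ and γ_k^−; the ordering of the factors, the helper names, and the representation itself are illustrative choices and need not coincide with the chiral representation used in the text. The time axis is taken to be i times the T-minus vector, as assumed above. The sketch checks the relations between the pseudoscalars and the chiral operators, and the hitch that the time axis commutes with the 10-dimensional pseudoscalar but anticommutes once the T pair of vectors is appended to the electroweak pseudoscalar.

```python
from functools import reduce
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
ONE = np.eye(2, dtype=complex)

BITS = "duTrgb"  # ordering of the tensor factors: an arbitrary, assumed choice

def on_bits(ops):
    """Tensor product over all six bits; identity on bits not listed in `ops`."""
    return reduce(np.kron, [ops.get(b, ONE) for b in BITS])

def vector(bit, kind):
    """Jordan-Wigner vector: Z on every earlier bit, `kind` (X or Y) on `bit`."""
    ops = {b: Z for b in BITS[:BITS.index(bit)]}
    ops[bit] = kind
    return on_bits(ops)

gp = {b: vector(b, X) for b in BITS}   # stand-ins for gamma_k^+
gm = {b: vector(b, Y) for b in BITS}   # stand-ins for gamma_k^-

def pseudoscalar(bits):
    """Ordered product of gamma_k^+ gamma_k^- over the listed bits."""
    return reduce(np.matmul, [gp[b] @ gm[b] for b in bits])

def kappa(bits):
    """Chiral operator of the listed bits: Z on each of them."""
    return on_bits({b: Z for b in bits})

def commutes(a, b) -> bool:
    return np.allclose(a @ b, b @ a)

def anticommutes(a, b) -> bool:
    return np.allclose(a @ b, -(b @ a))

I_du, I_rgb = pseudoscalar("du"), pseudoscalar("rgb")
I_10 = I_du @ I_rgb
gamma0 = 1j * gm["T"]                 # assumed time axis
I_duT = I_du @ gp["T"] @ gm["T"]      # electroweak pseudoscalar with the T pair appended

if __name__ == "__main__":
    print("I_du = -kappa_du:            ", np.allclose(I_du, -kappa("du")))
    print("I_rgb = -i kappa_rgb:        ", np.allclose(I_rgb, -1j * kappa("rgb")))
    print("I_10 = i kappa_10:           ", np.allclose(I_10, 1j * kappa("durgb")))
    print("gamma_0 commutes with I_10:  ", commutes(gamma0, I_10))
    print("gamma_0 anticommutes w/ I_duT:", anticommutes(gamma0, I_duT))
```

The commutation results do not depend on the order in which the two T vectors are multiplied, so the check is insensitive to that sign convention.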

The resulting pseudoscalar is not the 10-dimensional pseudoscalar $I_{10}$, but rather the 12-dimensional
pseudoscalar $J \equiv -iI_{12}$,


J = ~ilo = —ilaut Irgo = iY} Ya Va Va VET Vr Yr Vg Vo Vo Vo 
= 1412 = iya \ Va Yu \ Ya NYT AYPA Ye AYAY NYG N YON YG - (42.108) 


It is $J$, not $I_{10}$, that should be identified with the Dirac pseudoscalar $I$. The pseudoscalar $J$ squares to $-1$,
like the Spin(10) and Dirac pseudoscalars $I_{10}$ and $I$. The 12-dimensional chiral operator $\varkappa_{12}$ analogous to
the Dirac chiral operator $\gamma_5 = -iI$ is


$$\varkappa_{12} \equiv -iJ = -I_{12} . \tag{42.109}$$


Notice that the boost and rotation generators $I_{duT}$ and $I_{rgb}$ commute with the $U_Y(1) \times SU_L(2) \times SU(3)$
transformations of the SM, but not with SU(5) transformations. As long as spacetime is 4-dimensional and
$I_{duT}$ and $I_{rgb}$ generate Lorentz transformations that commute with internal transformations, SU(5) cannot
be an internal symmetry.

In the Spin(11,1) chart (42.110) below, in addition to being labelled by its Dirac boost ($\Uparrow$ or $\Downarrow$) and
spin ($\uparrow$ or $\downarrow$), each spinor is labelled by its weak (du) chirality $r$ or $l$, per the weak chart (42.9). The
reason for appending the weak label $r$ or $l$ is that spinors that are of the same species after electroweak
symmetry breaking split into two separate species before electroweak symmetry breaking. For example,
electrons split into distinct right- and left-handed weak electron species $e_r$ and $e_l$ that respectively do not
and do experience the weak force. Weak right-handed $r$ spinors have zero left-handed isospin $I_L$ (d- and u-bits
aligned, equation (42.13b)), and therefore do not experience the $SU_L(2)$ weak force, while weak left-handed
$l$ spinors have non-zero left-handed isospin $I_L$ (d- and u-bits anti-aligned), and do experience the weak force.
Weak chirality $r$ or $l$ is to be distinguished from Dirac chirality $R$ or $L$, which in the present construction
coincides with Spin(11, 1) chirality, equation (42.109).

The Spin(10) chart (42.12) thus translates into the following Spin(11,1) chart, expressed in a form compatible
with the Dirac representation of spinors:


0 1 2 3 4 5 
* * C* C* 
Vrv} d Viv. ., Urvy ., WU Yt rVt 
=: : K C: z dé: 3," urgb : dur gb 
VU, viy UUL Uy VIVI Vrut 
* * 
elu. Erv} erv €lut 
u: du : rgb: i drgb: 
ev ety, | 9 eru ae ayy (42.110) 
Cc is C * C * 
ci ay de: UT uč: W due: Vt 
rUt IVT IV rU} 
uy, u£ 
uc: WT duc: zV? 
Uivt Urut 


The Spin(11, 1) chart (42.110) contains two spinors for each entry, the upper for T-bit up, the lower for T-bit
down; the pair differ only in their boost bit $\Uparrow$ or $\Downarrow$. The Dirac boost bit is $\Uparrow$ or $\Downarrow$ as $\varkappa_{duT}$ is positive or
negative, that is, as the number of duT up-bits is odd or even. The Dirac spin bit is $\uparrow$ or $\downarrow$ as $\varkappa_{rgb}$ is positive
or negative, that is, as the number of rgb up-bits is odd or even. The weak bit is $r$ or $l$ as $\varkappa_{du}$ is positive or
negative, that is, as the number of du up-bits is even or odd. For spinors with T-bit up, weak chirality $r$ or $l$
coincides with Dirac chirality $R$ or $L$, while for spinors with T-bit down, weak chirality is opposite to Dirac
chirality. Spinors labelled with the complex conjugation sign * are those identified as charge conjugates in
the original Spin(10) chart (42.12), the same convention as in the Dirac chart (42.98). Complex-conjugated
spinors coincide with the spinors with spin bit down $\downarrow$, that is, with $\varkappa_{rgb}$ negative.
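The labelling rules just stated are simple parity rules on the six bits, and can be written out as a short routine. The sketch below assumes that each chiral sign is the product of the signs of the relevant bits (+1 for up, −1 for down); the dictionary layout and function names are illustrative, not notation from the text. It confirms the stated equivalence between the sign of each chiral operator and the parity of the number of up-bits, and that flipping the T-bit changes the boost label only.

```python
from itertools import product

BITS = "duTrgb"

def kappa(bits, names) -> int:
    """Chiral sign of the named bits: product of +1 (up) and -1 (down)."""
    sign = 1
    for name in names:
        sign *= 1 if bits[name] else -1
    return sign

def n_up(bits, names) -> int:
    """Number of up-bits among the named bits."""
    return sum(bits[name] for name in names)

def labels(bits):
    """Boost, spin, and weak labels of a six-bit spinor, per the stated rules."""
    boost = "up" if kappa(bits, "duT") > 0 else "down"
    spin = "up" if kappa(bits, "rgb") > 0 else "down"
    weak = "r" if kappa(bits, "du") > 0 else "l"
    return boost, spin, weak

def all_spinors():
    """All 64 assignments of up (True) or down (False) to the six bits."""
    for values in product((False, True), repeat=len(BITS)):
        yield dict(zip(BITS, values))

if __name__ == "__main__":
    example = dict(zip(BITS, (True, False, True, True, True, False)))
    print("bits:", example)
    print("(boost, spin, weak):", labels(example))
```

Because the boost label depends on the T-bit while the spin and weak labels do not, the two spinors of each chart entry necessarily share spin and weak labels and differ in boost.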


42.4.6 Translation from Spin(11,1) to Dirac representation, Part 2 


In the Dirac representation, spinors of the same species and Dirac chirality but opposite boost and spin rotate 
spatially into each other; for example, right-handed electrons rotate spatially into each other, $e_{\Uparrow\uparrow} \leftrightarrow e_{\Downarrow\downarrow}$. In
the Dirac-Spin(11, 1) representation (42.110), a suitable choice of a generator that transforms spinors into 
spinors of the same species but opposite boost and spin is 


Jos = WIEN ee 4 (42.111) 


where $J$ is the pseudoscalar (42.108). Equation (42.111) can be regarded as defining $\sigma_2$; below, equation (42.114e),
$\sigma_2$ will be identified as a generator of a Lorentz boost. This spatial generator $J\sigma_2$ anticommutes
with the spatial generator $I_{rgb}$ of §42.4.5, consistent with the expected anticommutation of generators
of spatial rotations. The expression (42.111) for $J\sigma_2$ coincides with that for the Spin(11, 1) spinor metric $\varepsilon$,
equation (42.103), but the two are not the same because $J\sigma_2$ transforms as a multivector whereas the spinor
metric $\varepsilon$ transforms as a spinor tensor. The coincidence of the expressions for $J\sigma_2$ and $\varepsilon$ is similar to the
coincidence (39.36) between $I\sigma_2$ and the spinor metric $\varepsilon$ in the chiral representation of the Dirac algebra.

Lorentz generators must commute with SM generators, to ensure that SM charges are unchanged by 
Lorentz transformations. However, although the spatial rotation generator $J\sigma_2$, equation (42.111), does
commute with all real (in the chiral representation) bivectors (42.26a) of the SM group, it anticommutes 
with all imaginary bivectors (42.26b) and (42.27) of the SM group. 

The problem is the same as that encountered with the Dirac chart (42.98), which is that the sign of the
charge of a massless, chiral spinor is ambiguous; only a massive spinor, that is, a linear combination of spinors
of opposite chirality, has an unambiguous charge. Like the Dirac chart (42.98), the Spin(11, 1) chart (42.110)
assigns charges in accordance with the Spin(10) generators (42.23) and (42.24), thereby assigning Spin(10)-bit-flipped
spinors opposite charges. Complex conjugation flips charge. Therefore $J\sigma_2$ does in fact have
the correct commutation rules with SM generators. If $S$ is any of the SM bivector generators, the correct
commutation rule with $J\sigma_2$ is


$$J\sigma_2 \, S = S^* J\sigma_2 . \tag{42.112}$$


Physically, the left hand side of equation (42.112) signifies the operation, apply the operator S then rotate 
to the bit-flipped spinor, while the right hand side signifies, rotate to the bit-flipped spinor then apply the 
complex conjugate of the operator S prescribed by the SM generator. 

An alternative way to check that $J\sigma_2$ has the correct commutation rules with SM generators, remarked in
the last paragraph of §42.4.2, is to modify the SM generators so that they measure the unconjugated charge
in the Spin(11,1) chart (42.110). The conjugated spinors in the chart (42.110), those labelled with the *
conjugation symbol, are those with negative colour chirality $\varkappa_{rgb}$, as is evident from the colour chart (42.10).
Therefore SM generators can be modified to measure the unconjugated charge by multiplying imaginary SM
bivectors by $\varkappa_{rgb}$, which effectively replaces $i$ by $-I_{rgb} = i\varkappa_{rgb}$ in the SM bivectors (42.26) and (42.27),


FINAIS WAV) > Brg VAY — WAY) » (42.113a) 
hiqi Nti > Z Irgo HAN - (42.113b) 


The colour chiral operator $\varkappa_{rgb}$ has the properties that it commutes with all SM bivectors, and with the
boost $I_{duT}$ and spatial rotation $I_{rgb}$ generators, but anticommutes with $J\sigma_2$. Since $\varkappa_{rgb}$ commutes with
all SM bivectors, the modification (42.113) of imaginary SM bivectors leaves the commutation rules of
the SM algebra unchanged. The Lorentz generators $I_{duT}$, $I_{rgb}$, and $J\sigma_2$ commute with all the modified SM
generators, as required.


42.4.7 The Dirac algebra as a subalgebra of the Spin(11,1) geometric algebra 


The previous section 42.4.6 argued that, if a translation between Spin(11, 1) and Dirac representations exists, 
then it must take the form (42.110). The Dirac algebra incorporates a full suite of Poincaré transformations. 
Is the Dirac-Spin(11, 1) representation (42.110) consistent with the full suite, in the sense that all Poincaré 
generators commute with all SM generators? This section shows that the answer is yes. 

The generators $I\sigma_2$ and $I_{rgb}$, equations (42.111) and (42.105b), and their product constitute a set of 3 
anticommuting generators of spatial rotations that commute with all SM generators. The pseudoscalar $J$ is 
given by equation (42.108). The full set of 6 Lorentz generators, consisting of 3 spatial generators $I\sigma_a$ and 
3 boost generators $\sigma_a$, is 


Jor = VAVAE Y Va V 3 (42.114a 
Joz = ete NN >» (42.114b 
Jos = Irgo = Yr Yr Yd Yg Yo Vo > (42.114c 
o1 = iya Va Ir V VV >» (42.114d 
O2 = Ya Vu VE Ve Vg h > (42.114e 
03 = —ilaur = iYi Va Va Va VEIT - (42.114f 


The 6 Lorentz generators all have grade 6. They are not bivectors, but they nevertheless generate Lorentz 
transformations. The 8 basis elements of the complete Lie algebra of Lorentz transformations comprise the 6 
Lorentz generators (42.114) along with the unit element and the pseudoscalar $J$ given by equation (42.108). 
The commutation rules of the elements of the Lie algebra are those of the Lorentz algebra. With the 
modification (42.113) to SM generators, all the Lorentz generators commute with all SM generators. 

Given a time vector $\gamma_0$ and a set of generators $\sigma_a$ of Lorentz boosts, spatial vectors $\gamma_a$ can be deduced by 
Lorentz transforming $\gamma_0$ appropriately. Since the boost generators satisfy $\sigma_a = \gamma_0\gamma_a$ and $\gamma_0^2 = -1$, spatial vectors satisfy 
$\gamma_a = -\gamma_0\sigma_a$. With the time axis $\gamma_0 = i\gamma_T^-$ and the expressions (42.114) for $\sigma_a$, the full set of 4 spacetime 
vectors $\gamma_m$ is 


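\begin{align}
\gamma_0 &= i\gamma_T^- , \tag{42.115a} \\
\gamma_1 &= \gamma_d^- \gamma_u^- \gamma_r^+ \gamma_g^+ \gamma_b^+ , \tag{42.115b} \\
\gamma_2 &= \gamma_d^- \gamma_u^- \gamma_r^- \gamma_g^- \gamma_b^- , \tag{42.115c} \\
\gamma_3 &= \gamma_d^+ \gamma_d^- \gamma_u^+ \gamma_u^- \gamma_T^+ . \tag{42.115d}
\end{align}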


The vectors (42.115) all have grade 1 mod 4. The multiplication rules for the vectors $\gamma_m$ given by 
equations (42.115) agree with the usual multiplication rules for Dirac $\gamma$-matrices: the vectors $\gamma_m$ anticommute, 
and their scalar products form the Minkowski metric. All the spacetime vectors $\gamma_m$ commute with all SM 
generators modified per (42.113). The Dirac pseudoscalar $I$ coincides with the Spin(11,1) pseudoscalar $J$ 
defined by equation (42.108), 

\[ I \equiv \gamma_0\gamma_1\gamma_2\gamma_3 = J . \tag{42.116} \]

Equivalently, the Dirac chiral operator $\gamma_5 \equiv -iI$ coincides with the Spin(11,1) chiral operator $\varkappa_{12} \equiv -iJ$. 
Thus the Dirac and SM algebras are subalgebras of the Spin(11,1) geometric algebra, such that all Dirac 
generators commute with all SM generators modified per (42.113). 
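
The multiplication rules just stated can be checked mechanically. The following Python sketch is illustrative only. It assumes the explicit forms $\gamma_0 = i\gamma_T^-$, $\gamma_1 = \gamma_d^-\gamma_u^-\gamma_r^+\gamma_g^+\gamma_b^+$, $\gamma_2 = \gamma_d^-\gamma_u^-\gamma_r^-\gamma_g^-\gamma_b^-$, $\gamma_3 = \gamma_d^+\gamma_d^-\gamma_u^+\gamma_u^-\gamma_T^+$ for equations (42.115), treats the 12 vectors $\gamma_k^\pm$ as orthonormal with unit square, and uses hypothetical helper names (`mul`, `blade`) and an arbitrary ordering of the basis vectors. Under those assumptions it tests whether the four vectors anticommute with Minkowski scalar products, whether $\gamma_1\gamma_2 = I_{rgb}$ and $\gamma_0\gamma_3 = -iI_{duT}$, and whether the Dirac pseudoscalar has grade 12.

```python
# Orthonormal basis vectors of the 12-dimensional algebra, each squaring to +1.
# The ordering is an arbitrary choice made for this sketch.
NAMES = ["d+", "d-", "u+", "u-", "T+", "T-", "r+", "r-", "g+", "g-", "b+", "b-"]
BIT = {name: 1 << k for k, name in enumerate(NAMES)}


def mul(a, b):
    """Geometric product of two scaled basis blades (coefficient, bitmask)."""
    (ca, ma), (cb, mb) = a, b
    swaps, m = 0, ma >> 1
    while m:  # count the transpositions needed to sort the vectors
        swaps += bin(m & mb).count("1")
        m >>= 1
    return (ca * cb * (-1) ** swaps, ma ^ mb)


def blade(coef, *names):
    """Ordered product coef * gamma_{names[0]} * gamma_{names[1]} * ..."""
    out = (coef, 0)
    for name in names:
        out = mul(out, (1, BIT[name]))
    return out


# Assumed explicit forms of the spacetime vectors, equations (42.115).
gamma = [
    blade(1j, "T-"),
    blade(1, "d-", "u-", "r+", "g+", "b+"),
    blade(1, "d-", "u-", "r-", "g-", "b-"),
    blade(1, "d+", "d-", "u+", "u-", "T+"),
]
ETA = [-1, 1, 1, 1]  # Minkowski metric, signature -+++


def dirac_rule_holds(m, n):
    """Check gamma_m gamma_n + gamma_n gamma_m = 2 eta_mn."""
    ab, ba = mul(gamma[m], gamma[n]), mul(gamma[n], gamma[m])
    if m == n:
        return ab == (ETA[m], 0)
    return ab[1] == ba[1] and ab[0] + ba[0] == 0


dirac_ok = all(dirac_rule_holds(m, n) for m in range(4) for n in range(4))

I_rgb = blade(1, "r+", "r-", "g+", "g-", "b+", "b-")
I_duT = blade(1, "d+", "d-", "u+", "u-", "T+", "T-")
I_sigma3 = mul(gamma[1], gamma[2])   # spatial rotation generator I sigma_3
sigma3 = mul(gamma[0], gamma[3])     # boost generator sigma_3
I_dirac = mul(mul(gamma[0], gamma[1]), mul(gamma[2], gamma[3]))

print("Dirac rules hold:", dirac_ok)
print("I sigma_3 equals I_rgb:", I_sigma3 == I_rgb)
print("sigma_3 equals -i I_duT:", sigma3 == (-1j, I_duT[1]))
print("pseudoscalar has grade", bin(I_dirac[1]).count("1"))
```

Representing each multivector as a single scaled blade suffices here because every object involved is a product of orthonormal vectors; a general multivector would need a sum over blades.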
The time dimension (42.115a) is just a simple vector in the Spin(11,1) algebra, but the 3 spatial 
dimensions (42.115b)-(42.115d) are all 5-dimensional. The spatial dimensions share a common 2-dimensional factor 
$\gamma_d^-\gamma_u^-$. Aside from that common factor, each of the 3 spatial dimensions is itself 3-dimensional: 
$\gamma_r^+\gamma_g^+\gamma_b^+$, $\gamma_r^-\gamma_g^-\gamma_b^-$, and $\gamma_d^+\gamma_u^+\gamma_T^+$. 

42.4.8 Invariance of the spinor Lagrangian 


The spacetime and SM algebra just derived must satisfy a further consistency condition. The spinor 
Lagrangian involves a scalar product of spinors with their conjugates, and it must be checked that this scalar 
product is invariant under spacetime and SM transformations. 

The list (39.157) gives the grades of orthonormal multivectors that generate transformations that leave 
invariant the scalar product of spinors and conjugate spinors. Qualifying generators are real linear combinations 
of orthonormal multivectors of grades (1 or 2) mod 4, and imaginary linear combinations of orthonormal 
multivectors of grades (0 or 3) mod 4. All the spacetime and SM generators in the present construction 
satisfy this criterion. The spacetime vectors $\gamma_m$ given by equations (42.115) are real linear combinations of 
orthonormal multivectors of grade 1 mod 4 (the time vector $\gamma_0 = i\gamma_T^-$ counts as an orthonormal vector). 
The Lorentz generators (42.114) are real linear combinations of orthonormal multivectors of grade 2 mod 4 
(all factors of $i$ are accompanied by a factor of $\gamma_T^-$). Recall from the discussion in §42.4.6 that, to ensure 
the correct designation of SM charge, and simultaneously to ensure commutation of SM generators with 
spacetime generators, it was necessary to multiply those of the SM bivector generators that were imaginary 
in the chiral representation by the colour chiral operator $\varkappa_{rgb}$, modification (42.113). Before modification, 
all SM bivectors are real in an orthonormal basis. The unmodified SM bivectors remain real in an 
orthonormal basis, and have grade 2. The modified SM bivectors are multiplied by $\varkappa_{rgb}$, which has grade 
6 and is imaginary with respect to an orthonormal basis, equation (42.105b), so the modified SM bivectors 
are imaginary in an orthonormal basis, and have grade 4 or 8, which is 0 mod 4. The proposed algebra of 
spacetime and SM generators passes the consistency test. 
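
The bookkeeping in the preceding paragraph can be tabulated in a few lines. The Python sketch below is only an illustration: the function name and the dictionary labels are hypothetical, and the grades and reality properties are simply those quoted in the text (grade 1 for the time vector, grade 5 for the spatial vectors, grade 6 for the Lorentz generators, grade 2 for unmodified SM bivectors, grades 4 and 8 for modified SM bivectors).

```python
def preserves_spinor_scalar_product(grade, real_coefficient):
    """Criterion quoted from list (39.157).

    Real combinations qualify for grades 1 or 2 mod 4,
    imaginary combinations qualify for grades 0 or 3 mod 4.
    """
    return (grade % 4 in (1, 2)) == real_coefficient


# (grade, real in an orthonormal basis?) as stated in the text.
generators = {
    "time vector": (1, True),
    "spatial vectors": (5, True),
    "Lorentz generators": (6, True),
    "unmodified SM bivectors": (2, True),
    "modified SM bivectors, grade 4": (4, False),
    "modified SM bivectors, grade 8": (8, False),
}

results = {
    name: preserves_spinor_scalar_product(grade, is_real)
    for name, (grade, is_real) in generators.items()
}
all_pass = all(results.values())

for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'FAILS'}")
```

The same function shows why the modification matters: an imaginary multiple of a grade 2 multivector would fail the criterion.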

It is worth remarking that the conditions (39.157) on the grades of multivector generators, combined 
with the commutation rules of the Dirac and SM algebras, impose that spacetime vectors $\gamma_m$ must be odd 
multivectors, while Lorentz and SM generators must be even multivectors. The algebra indeed satisfies these 
conditions. 


42.4.9 Uniqueness 


How unique are the identifications (42.115) between the spacetime vectors $\gamma_m$ and the Spin(11,1) 
multivectors on the right hand side? 

Consider multiplying each vector $\gamma_m$ by some Spin(11,1) multivector $X_m$. Any such multivector $X_m$ must 
preserve all SM charges, which means that $X_m$ must commute with all SM generators modified per (42.113). 
Moreover, since spacetime vectors $\gamma_m$ must be odd, §42.4.8, $X_m$ must be even. This limits each $X_m$ to 
$\varkappa_{du}$, $\varkappa_{rgb}$, $I\sigma_2$, $\gamma_T^+\gamma_T^-$, or some product thereof. The modified vectors $X_m\gamma_m$ must preserve the standard 
Dirac commutation relations between them. Define $(-)_{mn}$ to be the sign of the commutation of $X_m$ with 
$\gamma_n$, that is $X_m\gamma_n = (-)_{mn}\gamma_n X_m$, and let $(X)_{mn}$ be the sign of the commutation of $X_m$ with $X_n$, that 
is $X_m X_n = (X)_{mn} X_n X_m$. Preservation of the commutation rules between pairs $\gamma_m$ and $\gamma_n$ of spacetime 
vectors requires 

\[ (-)_{mm} X_m^2 = 1 , \quad m = n . \tag{42.117a} \]


The condition (42.117a) can always be satisfied by adjusting the phase of $X_m$, so it imposes no constraint. 
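
For orientation, the following sketch works out what the sign definitions above imply for the products of modified vectors. It is a derivation under the stated definitions, assuming that each $X_m$ either commutes or anticommutes with each $\gamma_n$ and with each $X_n$ (so that all the signs are $\pm 1$); the off-diagonal condition it arrives at is an inference from those definitions, not a quotation of the text.

```latex
\begin{align}
(X_m\gamma_m)^2 &= (-)_{mm}\, X_m^2\, \gamma_m^2 , \\
(X_m\gamma_m)(X_n\gamma_n) &= (-)_{nm}\, X_m X_n\, \gamma_m\gamma_n , \\
(X_n\gamma_n)(X_m\gamma_m) &= (-)_{mn}\, X_n X_m\, \gamma_n\gamma_m
  = -\,(-)_{mn}\,(X)_{mn}\, X_m X_n\, \gamma_m\gamma_n \qquad (m \neq n) .
\end{align}
% The square is unchanged if (-)_{mm} X_m^2 = 1, which is condition (42.117a).
% The modified vectors anticommute for m \neq n if the last two lines sum to zero:
\begin{equation}
(-)_{mn}\,(-)_{nm}\,(X)_{mn} = 1 \qquad (m \neq n) .
\end{equation}
```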

The most stringent condition on the algebra is that the Dirac pseudoscalar $I \equiv \gamma_0\gamma_1\gamma_2\gamma_3$ should 
coincide with either the Spin(10) pseudoscalar $I_{10}$, equation (42.97), or with the Spin(11,1) pseudoscalar $J$, 
equation (42.108), 

\[ I = I_{10} \ \text{or}\ J . \tag{42.118} \]

The condition (42.120) emerges from the observational fact that the Dirac pseudoscalar coincides with the 
Spin(10) pseudoscalar $I_{10}$, equation (42.97). 

Modifications that merely swap $\gamma_0 \leftrightarrow \gamma_3$ (multiply $\gamma_0$ and $\gamma_3$ by $\varkappa_{duT}$) or $\gamma_1 \leftrightarrow \gamma_2$ (multiply $\gamma_1$ and $\gamma_2$ 
by $\varkappa_{rgb}$), or that accomplish any of the following relabellings of Spin(11,1) multivectors, may be discarded as leaving 
the algebra essentially unchanged: 


Wee, Wwe wk: WU OREN - (42.119) 


Motivated by the arguments in §42.4.5, impose the conditions that the Dirac pseudoscalar $I \equiv \gamma_0\gamma_1\gamma_2\gamma_3$ 
coincides with the Spin(11,1) pseudoscalar $J$, equation (42.108), and that the boost generator $\sigma_3 \equiv \gamma_0\gamma_3$ 
coincides with either $I_{du}$ or $-iI_{duT}$, equations (42.105a) or (42.107), 

\[ I = J , \quad \sigma_3 = I_{du} \ \text{or}\ -iI_{duT} . \tag{42.120} \]

It turns out that there are no solutions with $\sigma_3 = I_{du}$, so $\sigma_3 = -iI_{duT}$ is required. An exhaustive 
computer search of possibilities shows that, if relabellings (42.119) are set aside, and if the conditions (42.120) 
are imposed, then besides the choice (42.115) there is just one other choice, obtained by multiplying the 
expressions on the right hand sides of equations (42.115) for $\gamma_0$ and $\gamma_3$ by the colour pseudoscalar $I_{rgb}$, 
equation (42.105b), and for yı and y2 by the T-chiral operator xr = iYEYT- All that can be said about 
this second choice is that it is less elegant than the first choice (42.115). Except that second choice misses 
YF AYÈ bivectors.. 


42.4.10 Electroweak Higgs field 


The $U_Y(1) \times SU_L(2)$ theory of electroweak interactions in the SM is called the Weinberg-Salam theory (Salam 
and Ward, 1959; Weinberg, 1967), for which Glashow, Salam, and Weinberg shared the 1979 Nobel prize. 
The mechanism by which the electroweak symmetry is broken to the electromagnetic symmetry $U_{em}(1)$ was 
proposed by Weinberg (1967), who invoked the so-called Higgs mechanism (Englert and Brout, 1964; Higgs, 
1964; Guralnik, Hagen, and Kibble, 1964). The exposition of the electroweak Higgs mechanism that follows 
leans on Peskin and Schroeder (1995, Ch. 20). 

The Higgs mechanism posits a mysterious Higgs field that accomplishes four things: it breaks a gauge 
symmetry; it gives masses to fundamental fermions; it gives masses to some gauge bosons; and it generates 
a massive spin 0 particle. The Higgs field achieves these outcomes through the peculiar property that it has 
a finite value in the Minkowski vacuum. This contrasts with fermionic and gauge fields, which vanish in the 
empty vacuum of Minkowski space. The Higgs field must be a Lorentz scalar to allow it to have a non-zero 
expectation value in the vacuum. If instead the Higgs field were for example a Lorentz spinor or vector, then 
its presence would define a preferred direction and rest frame, contradicting the observed Lorentz symmetry 
of the laws of physics. The Higgs field could potentially be a composite particle (though that is not argued 
here), but that composite particle must still have spin 0. 

The electroweak Higgs field breaks the $d$-symmetry of the SM. It does so by carrying a finite $d$-charge, and 
zero other SM charges $urgb$. The Higgs field gives masses to fundamental fermions by flipping their $d$-bit 
between up and down. And the Higgs field gives masses to 3 of the 4 weak gauge bosons of $U_Y(1) \times SU_L(2)$, 
the so-called charged $W^\pm$ and neutral $Z$ weak gauge bosons. The 4th gauge boson, the photon $\gamma$, remains 
massless. The electroweak Higgs field gives masses to gauge bosons by virtue of being part of a multiplet 
of 4 Higgs scalar fields that transform under $U_Y(1) \times SU_L(2)$. The 1+3 = 4 gauge bosons of the unbroken 
electroweak symmetry $U_Y(1) \times SU_L(2)$ are natively massless. Each massless gauge boson has just 2 degrees 
of freedom, its spin in the directions transverse to its direction of motion. To become massive, a gauge boson 
must gain a 3rd degree of freedom, corresponding to a longitudinal spin along the direction of motion. When 
the Higgs field acquires a vacuum expectation value along a special direction, the 3 degrees of freedom of 
the Higgs field orthogonal to the special direction morph into longitudinal degrees of freedom of the gauge 
bosons, giving 3 gauge bosons their mass. The remaining 1 degree of freedom of the Higgs field becomes a 
massive particle, the Higgs scalar boson. A particle with properties consistent with being the Higgs boson, 
with a mass of 125 GeV, was discovered in 2012 by the CMS and ATLAS collaborations at the Large Hadron 
Collider (Chatrchyan et al., 2012; Aad et al., 2012). 

What makes the Weinberg theory of electroweak symmetry breaking especially compelling is that it 
predicts a relation between the ratio $g_Y/g_w$ of hypercharge and weak coupling constants, and the ratio $m_Z/m_W$ 
of the masses of $Z$ and $W$ gauge bosons, a relation that is experimentally well satisfied. The relation is 

\[ \frac{g_Y}{g_w} = \tan\theta_w , \quad \frac{m_W}{m_Z} = \cos\theta_w , \tag{42.121} \]

where $\theta_w$ is the weak mixing angle, or Weinberg angle. The NIST 2018 CODATA recommended value of the 
weak mixing angle is (NIST, 2018) 

\[ \sin^2\theta_w = 0.2229 \pm 0.0003 . \tag{42.122} \]
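
As a numerical illustration of relation (42.121), the short Python sketch below converts the quoted value of $\sin^2\theta_w$ into the predicted ratios $m_W/m_Z$ and $g_Y/g_w$. The two boson masses in the script are approximate measured values supplied here as assumptions, not numbers taken from the text, and the variable names are arbitrary. If the quoted mixing angle is itself defined through the mass ratio, the agreement is partly by construction.

```python
import math

SIN2_THETA_W = 0.2229      # value quoted in equation (42.122)
SIN2_UNCERTAINTY = 0.0003  # quoted uncertainty

cos_theta_w = math.sqrt(1.0 - SIN2_THETA_W)                    # predicted m_W / m_Z
tan_theta_w = math.sqrt(SIN2_THETA_W / (1.0 - SIN2_THETA_W))   # predicted g_Y / g_w

# Assumed approximate measured masses in GeV (not from the text).
M_W, M_Z = 80.38, 91.19
sin2_from_masses = 1.0 - (M_W / M_Z) ** 2

print(f"predicted m_W/m_Z = cos(theta_w) = {cos_theta_w:.4f}")
print(f"predicted g_Y/g_w = tan(theta_w) = {tan_theta_w:.4f}")
print(f"assumed   m_W/m_Z               = {M_W / M_Z:.4f}")
print(f"sin^2(theta_w) implied by the assumed masses = {sin2_from_masses:.4f}")
```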


In the present context, the Higgs field must be identified with a multivector that flips the $d$-bit. To 
preserve Poincaré symmetry, the Higgs field must commute with all the spacetime vectors $\gamma_m$ given by 
equations (42.115). An exhaustive search over multivectors concludes that the largest subgroup of Spin(11,1) 
that commutes with the Poincaré group is the group 


Spin(5) x Spin(6) . (42.123) 


Here the generators of Spin(5) are the 10 bivectors drawn from the 5 vectors consisting of the 4 electroweak 
vectors $\gamma_i^\pm$, $i = d, u$, along with the 1 vector $\gamma_T^+$. The generators of Spin(6) are the 15 bivectors drawn from 
the 6 colour vectors $\gamma_i^\pm$ with $i = r, g, b$. The subalgebra of the Spin(11,1) geometric algebra that commutes 
with the Poincaré algebra is the algebra generated by Spin(5) × Spin(6) bivectors and their products (all of 
which are even multivectors in the Spin(11,1) geometric algebra). 

The 4 bivector generators $\gamma_i^\pm\gamma_T^+$ with $i = d, u$ call attention to themselves because they transform spinors 
by one unit of SM charge $d$ or $u$, whereas the remaining 6 + 15 = 21 bivector generators, which generate 
the Pati-Salam group (42.8), transform spinors by an even number of SM charges. The Weinberg theory 
requires the electroweak Higgs field to be part of a multiplet of 4 fields that transform into each other under 
$U_Y(1) \times SU_L(2)$. Indeed the 4 bivector generators $\gamma_i^\pm\gamma_T^+$ with $i = d, u$ provide precisely such a set of fields. 
Define therefore the 4-component Higgs field $\boldsymbol{H}$ by 


\[ \boldsymbol{H} \equiv H^a \gamma_a \gamma_T^+ , \quad a = d^+, d^-, u^+, u^- . \tag{42.124} \]


Electroweak symmetry breaking occurs when the Higgs field acquires a vacuum expectation value 
proportional to $\gamma_d^-\gamma_T^+$, 

\[ \langle\boldsymbol{H}\rangle = \langle H\rangle \, \gamma_d^-\gamma_T^+ . \tag{42.125} \]


When combined with the time axis —iyo = yp in a fermion mass term Y- My = -ivtyMy, the vacuum 
Higgs field (42.125) yields a Dirac mass term proportional to 


VIE >» (42.126) 


consistent with the Dirac mass terms in equation (42.185). The Higgs field (42.125) is proportional to $\gamma_d^-\gamma_T^+$ 
not $\gamma_d^+\gamma_T^+$ because $\gamma_d^-$ preserves the spinor identity, whereas $\gamma_d^+$ flips between spinor and antispinor². 

In the standard approach to spontaneous symmetry breaking in Spin(10) (Croon et al., 2019), the Higgs 
field must be part of a Spin(10) multiplet in order that its Lagrangian be invariant under Spin(10). The 
standard approach is premised on the assumption that the Poincaré and Spin(10) algebras commute, which 
is not true in the present construction; rather, the Poincaré and SM algebras here are commuting subalgebras 
of the Spin(11,1) geometric algebra. In the standard approach the Higgs field must be an odd Spin(10) 
multivector (because it flips only 1 bit, the $d$-bit), so must be a vector or pseudovector (dimension 10), trivector or 
pseudotrivector (dimension 120), or pentavector (2 possibilities, a pentavector or a pseudopentavector, each 
of dimension 126). In the present construction, a multiplet of fields with properties matching the electroweak 
Higgs fields is present without having to be introduced ad hoc. 

For a spinor field $\psi$, the gauge-covariant derivative with respect to $U_Y(1) \times SU_L(2)$ transformations is 

\[ D_m\psi = \left( \partial_m + g_Y \boldsymbol{B}_m + g_w \boldsymbol{W}_m \right) \psi , \tag{42.128} \]


2 For example, an electron e and positron Z at rest are linear combinations e = (eg — ieg)/W2 and é = (ez + iea)/(V2i) of 
d-down and d-up spinors eg and eg. The bivector y} acting on the electron leaves the electron unchanged, while yt flips 


the electron to its positron partner (note that yq and yg multiply by /2 while raising and lowering the d-bit of their 
argument, equations (38.111)): 


_eg-—taq  Ya-YVa eq~ tea  eg— tea 
= = F 42.127a 
Ya V2 V2i V2 J2 ( ) 


4eg—Ma _ Yat Yg eg— ta _ egttea 


u- a Z A Vii 


(42.127b) 
where $\boldsymbol{B}_m$ and $\boldsymbol{W}_m$ are the $U_Y(1)$ and $SU_L(2)$ gauge fields 

\[ \boldsymbol{B}_m = i B_m Y , \quad \boldsymbol{W}_m = i W_m^i \tau_i , \tag{42.129} \]

and $g_Y$ and $g_w$ are dimensionless coupling strengths for those fields. Here $iY$, equation (42.31), is the generator 
of the hypercharge symmetry $U_Y(1)$, while the weak Pauli matrices $i\tau_i$, equations (42.30), are generators of 
$SU_L(2)$. The weak Pauli matrix $\tau_3$ acting on a spinor has eigenvalue equal to twice the isospin, $2I_L = u - d$, 
equation (42.13b). The electromagnetic charge generator $iQ$, equation (42.32), is related to the hypercharge 
and weak generators $iY$ and $i\tau_3$ by, equation (42.14), 

\[ Q = \tfrac12 ( Y + \tau_3 ) . \tag{42.130} \]

The sum $W_m^i\tau_i$ in the gauge field $\boldsymbol{W}_m$, equation (42.129), can be expressed with respect to either an 
orthonormal or a chiral basis, 

\[ W_m^i\tau_i = W_m^1\tau_1 + W_m^2\tau_2 + W_m^3\tau_3 = W_m^+\tau_+ + W_m^-\tau_- + W_m^3\tau_3 , \quad W_m^\pm \equiv \frac{W_m^1 \mp i W_m^2}{\sqrt{2}} , \tag{42.131} \]


where the chiral Pauli operators $\tau_\pm$ are 


m UEI _ WwAYVd 7 Rin _ YN a 
t y Vo v2 Vo 


(42.132) 

The operator $\tau_+$ increases $u$-charge by 1 and decreases $d$-charge by 1, and therefore carries $+1$ unit of each 
of electric charge $Q$ and isospin $I_L$. Conversely, $\tau_-$ decreases $u$-charge by 1 and increases $d$-charge by 1, and 
therefore carries $-1$ unit of each of electric charge $Q$ and isospin $I_L$. The operators $Y$ and $\tau_3$ leave $d$- and 
$u$-charge unchanged, so carry zero electric charge $Q$ and isospin $I_L$. 

Introduce the weak mixing, or Weinberg, angle $\theta_w$ defined by 

\[ \sin\theta_w \equiv \frac{g_Y}{g} , \quad \cos\theta_w \equiv \frac{g_w}{g} , \quad g \equiv \sqrt{g_Y^2 + g_w^2} . \tag{42.133} \]

Define the electromagnetic and weak fields $A_m$ and $Z_m$ to be the orthogonal linear combinations of $B_m$ and 
$W_m^3$, 

\[ \begin{pmatrix} A_m \\ Z_m \end{pmatrix} \equiv \begin{pmatrix} \cos\theta_w & \sin\theta_w \\ -\sin\theta_w & \cos\theta_w \end{pmatrix} \begin{pmatrix} B_m \\ W_m^3 \end{pmatrix} . \tag{42.134} \]

In terms of the electromagnetic and weak fields $A_m$ and $Z_m$, the electroweak gauge connection is 

\[ g_Y \boldsymbol{B}_m + g_w \boldsymbol{W}_m = i \left( 2 e A_m Q + 2 g Z_m ( I_L - \sin^2\theta_w \, Q ) + g_w ( W_m^+\tau_+ + W_m^-\tau_- ) \right) , \tag{42.135} \]

where the electromagnetic coupling $e$ is 

\[ e \equiv \frac{g_Y g_w}{g} = g_Y\cos\theta_w = g_w\sin\theta_w = g\cos\theta_w\sin\theta_w . \tag{42.136} \]

The particular orthogonal combination (42.134) is chosen because the electric charge operator $Q$ commutes 
with the vacuum Higgs field (42.125), with the consequence that the vacuum Higgs field generates a mass 
for the electroweak field $Z_m$, but leaves the electromagnetic field $A_m$ massless. 
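
The change of basis from $(B_m, W_m^3)$ to $(A_m, Z_m)$ can be verified numerically. The Python sketch below is an illustration under stated assumptions: it reads the neutral part of the connection as $g_Y B_m Y + g_w W_m^3 \tau_3$ (with the overall factor $i$ dropped), takes $Q = \tfrac12(Y + \tau_3)$ and $I_L = \tfrac12\tau_3$, and replaces the operators $Y$ and $\tau_3$ by their eigenvalues on some spinor state. The function name and the sample numbers are hypothetical.

```python
import math


def neutral_connection(g_y, g_w, B, W3, Y, tau3):
    """Neutral electroweak connection in the (B, W3) and (A, Z) bases.

    Returns (original, rotated, e); the overall factor i is omitted and
    Y, tau3 stand for eigenvalues of the corresponding operators.
    """
    g = math.hypot(g_y, g_w)                # g = sqrt(g_Y^2 + g_w^2)
    sin_t, cos_t = g_y / g, g_w / g         # weak mixing angle
    e = g * sin_t * cos_t                   # electromagnetic coupling
    A = cos_t * B + sin_t * W3              # electromagnetic field
    Z = -sin_t * B + cos_t * W3             # neutral weak field
    Q = 0.5 * (Y + tau3)                    # electric charge
    I_L = 0.5 * tau3                        # weak isospin
    original = g_y * B * Y + g_w * W3 * tau3
    rotated = 2 * e * A * Q + 2 * g * Z * (I_L - sin_t**2 * Q)
    return original, rotated, e


# Arbitrary sample couplings, field values, and charges.
original, rotated, e = neutral_connection(
    g_y=0.36, g_w=0.65, B=0.3, W3=-1.1, Y=-1.0, tau3=1.0
)
print(f"original basis: {original:.12f}")
print(f"rotated basis:  {rotated:.12f}")
print(f"e = {e:.6f}")
```

Because the photon term is proportional to $Q$, a state with $Q = 0$ does not couple to $A_m$ in this expression, which is the property the choice of rotation is designed to deliver.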

Figure 42.3 Mexican hat quartic potential V of a Higgs field of magnitude H. 


The gauge-covariant derivative of the 4-component Higgs field $\boldsymbol{H}$ with respect to $U_Y(1) \times SU_L(2)$ 
transformations is 

\[ D_m\boldsymbol{H} = \partial_m\boldsymbol{H} + g_Y [\boldsymbol{B}_m, \boldsymbol{H}] + g_w [\boldsymbol{W}_m, \boldsymbol{H}] . \tag{42.137} \]

In the covariant derivative (42.128) of a spinor $\psi$, the fields $\boldsymbol{B}_m$ and $\boldsymbol{W}_m$ act directly on the 
spinor. In the covariant derivative (42.137) of the Higgs field $\boldsymbol{H}$, by contrast, the fields act as a commutator, because 
whereas a spinor transforms as $\psi \to R\psi$ under a rotor $R$, a multivector such as the Higgs field transforms 
as $\boldsymbol{H} \to R\boldsymbol{H}\overline{R}$. 

The Lagrangian $L_H$ of the 4-component Higgs field $\boldsymbol{H}$ is 

\[ L_H = -\tfrac12 (D^m\boldsymbol{H}) \cdot (D_m\overline{\boldsymbol{H}}) - V(\boldsymbol{H}\cdot\overline{\boldsymbol{H}}) , \tag{42.138} \]

where $D_m\boldsymbol{H}$ is the gauge-covariant derivative (42.137) of the Higgs field, and $V(\boldsymbol{H}\cdot\overline{\boldsymbol{H}})$ is a potential energy, 
a function of the scalar product $H^2 \equiv \boldsymbol{H}\cdot\overline{\boldsymbol{H}}$ of the Higgs field $\boldsymbol{H}$ and its reverse $\overline{\boldsymbol{H}}$. The potential energy $V$ 
is postulated to have a minimum at a non-zero value of $H^2$, which serves to make it energetically favourable 
for the Higgs field to acquire a non-zero expectation value. A commonly adopted potential, with the virtue 
of yielding a renormalizable quantum field theory, is a “Mexican hat” quartic, illustrated in Figure 42.3, 

\[ V(H^2) = \rho_H - \tfrac14 m_H^2 H^2 + \tfrac18 \lambda H^4 . \tag{42.139} \]

The Higgs field $H$ has units of mass, and the potential $V$ has units of mass$^4$, or energy density. The 
constant $V(0) = \rho_H$ looks like a vacuum density that could play the role of a cosmological constant before 
electroweak symmetry breaking, when $H = 0$. The quantity $m_H$ proves to be the mass of the Higgs 
boson, equation (42.153). The minimum of the potential $V$ defines the vacuum expectation value $\langle H\rangle$ of the 
magnitude of the Higgs field, 

\[ \langle H\rangle = \sqrt{\frac{m_H^2}{\lambda}} . \tag{42.140} \]
The covariant derivative of the expectation value (42.125) of the Higgs field is 

\[ D_m\langle\boldsymbol{H}\rangle = \langle H\rangle \left( g_Y B_m [iY, \gamma_d^-\gamma_T^+] + g_w W_m^i [i\tau_i, \gamma_d^-\gamma_T^+] \right) . \tag{42.141} \]


The relevant commutators of the generators $iY$ of $U_Y(1)$, equation (42.31), and $i\tau_i$ of $SU_L(2)$, 
equations (42.30), with the electroweak Higgs field $\gamma_d^-\gamma_T^+$ are 


EY y= N= N» (42.142a) 
lin Itil =- NINES viet, (42.142b) 
lit, Va Vr] = -V1 Va Var = VaT, (42.142c) 
lits, Ya VF] = -V Ya Va Ve = -VV - (42.142d) 
With the commutators (42.142), the covariant derivative (42.141) becomes 
Dm (H) = (H)((9y Bm — WEINE VE + Io Why re + Wve) 
= (H)(-92 mb VF + Io (Wikre + Wintert)) - (42.143) 


The covariant derivative (42.143) squared, which enters the Higgs Lagrangian (42.138), is (abbreviating 
$Z^m Z_m = (Z_m)^2$ and so forth) 

\[ (D^m\langle\boldsymbol{H}\rangle) \cdot (D_m\langle\overline{\boldsymbol{H}}\rangle) = \langle H\rangle^2 \left( g^2 (Z_m)^2 + g_w^2 \left( (W_m^+)^2 + (W_m^-)^2 \right) \right) . \tag{42.144} \]

An originally massless field acquires mass when its Lagrangian is modified so that the d’Alembertian in the 
equation of motion is modified to $\Box \to \Box - m^2$. In the case of a gauge field such as $Z_m$, the modification of 
the Lagrangian that gives $Z_m$ a mass $m_Z$ is 

\[ \Delta L = -\tfrac12 m_Z^2 (Z_m)^2 . \tag{42.145} \]

The contribution (42.144) to the Lagrangian has the form of mass squared terms for the $Z_m$ and $W_m^\pm$ 
electroweak fields. The Higgs field thus generates masses $m_Z$ and $m_W$ for the $Z_m$ and $W_m^\pm$ fields, 

\[ m_Z = g\langle H\rangle , \quad m_W = g_w\langle H\rangle . \tag{42.146} \]

The masses (42.146) along with the definition (42.133) of the weak mixing angle $\theta_w$ imply the relations (42.121). 
The electromagnetic field $A_m$ remains massless. In accordance with the remarks after equations (42.132), 
the electromagnetic field $A_m$ and weak field $Z_m$ both carry zero electric charge and isospin, while the weak 
fields $W_m^\pm$ carry respectively $\pm 1$ unit of each of electric charge $Q$ and isospin $I_L$. 

Having acquired a non-zero expectation value, the reconfigured 4-component Higgs field generates a single 
massive spin 0 particle, the Higgs boson. As emphasized above, central to the behaviour of the Higgs field 
is that its components rotate into each other under $U_Y(1) \times SU_L(2)$ transformations, equations (42.142). 
Therefore the Higgs field $\boldsymbol{H}$ can be written as a product of a unitary rotation $U$ in $U_Y(1) \times SU_L(2)$ and a 
Higgs field $\boldsymbol{H}_0$ pointed in a certain direction, which can be taken to be the broken direction (42.125), 

\[ \boldsymbol{H} = U\boldsymbol{H}_0 , \quad \boldsymbol{H}_0 \equiv H \, \gamma_d^-\gamma_T^+ . \tag{42.147} \]

By definition, the gauge-covariant derivative of the Higgs field transforms under the unitary rotation $U$ as 

\[ D_m\boldsymbol{H} = U D_m\boldsymbol{H}_0 . \tag{42.148} \]

The Higgs Lagrangian (42.138) is, by construction, invariant under gauge transformations, so in terms of the 
unrotated Higgs field $\boldsymbol{H}_0$ it is 

\[ L_H = -\tfrac12 (D^m\boldsymbol{H}_0) \cdot (D_m\overline{\boldsymbol{H}}_0) - V(\boldsymbol{H}_0\cdot\overline{\boldsymbol{H}}_0) . \tag{42.149} \]

Define the perturbation $h$ of the magnitude $H$ of the Higgs field by 

\[ h \equiv H - \langle H\rangle . \tag{42.150} \]

In terms of $h$, the potential $V(H^2)$, equation (42.139), in the Higgs Lagrangian (42.149) is (note that 
$\lambda = m_H^2/\langle H\rangle^2$ from equation (42.140)) 

\[ V = \tfrac12 m_H^2 h^2 \left( 1 + \frac{h}{2\langle H\rangle} \right)^2 . \tag{42.151} \]

The potential vanishes at $h = 0$ provided that the constant term in equation (42.139) is $\rho_H = \tfrac18 m_H^2 \langle H\rangle^2$. 
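
The rewriting of the potential in terms of $h$ is easy to check numerically. The Python sketch below is illustrative and rests on assumptions stated here: it reads the quartic potential (42.139) as $V = \rho_H - \tfrac14 m_H^2 H^2 + \tfrac18 \lambda H^4$, takes $\langle H\rangle = \sqrt{m_H^2/\lambda}$ and $\rho_H = \tfrac18 m_H^2\langle H\rangle^2$, and uses arbitrary values of $m_H$ and $\lambda$ in arbitrary units. The function names are hypothetical.

```python
import math


def higgs_potential(H, m_H, lam, rho_H):
    """Quartic potential, read here as rho_H - m_H^2 H^2 / 4 + lam H^4 / 8."""
    return rho_H - 0.25 * m_H**2 * H**2 + 0.125 * lam * H**4


def potential_in_h(h, m_H, vev):
    """The same potential written in the perturbation h = H - <H>."""
    return 0.5 * m_H**2 * h**2 * (1.0 + h / (2.0 * vev)) ** 2


m_H, lam = 1.0, 0.5                   # illustrative values, arbitrary units
vev = math.sqrt(m_H**2 / lam)         # vacuum expectation value <H>
rho_H = 0.125 * m_H**2 * vev**2       # constant that makes V vanish at h = 0

samples = [-0.7, -0.1, 0.0, 0.3, 1.5]
max_mismatch = max(
    abs(higgs_potential(vev + h, m_H, lam, rho_H) - potential_in_h(h, m_H, vev))
    for h in samples
)

# The vacuum value should be a local minimum of the potential.
eps = 1e-3
v_min = higgs_potential(vev, m_H, lam, rho_H)
is_minimum = all(
    higgs_potential(vev + d, m_H, lam, rho_H) > v_min for d in (-eps, eps)
)

print(f"largest mismatch between the two forms: {max_mismatch:.2e}")
print(f"V at the vacuum value: {v_min:.2e}, local minimum: {is_minimum}")
```

Near $h = 0$ the potential reduces to $\tfrac12 m_H^2 h^2$, which is the origin of the identification of $m_H$ as the Higgs boson mass.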


The Higgs Lagrangian (42.149) in terms of the perturbation $h$ is 

\[ L_H = -\tfrac12 \left( (\partial^m h)(\partial_m h) + \left( m_Z^2 (Z_m)^2 + m_W^2 \left( (W_m^+)^2 + (W_m^-)^2 \right) \right) \left( 1 + \frac{h}{\langle H\rangle} \right)^2 + m_H^2 h^2 \left( 1 + \frac{h}{2\langle H\rangle} \right)^2 \right) . \tag{42.152} \]

If the $Z_m$ and $W_m^\pm$ mass terms are set aside, then to lowest order in $h$ the Lagrangian (42.152) looks like the 
Lagrangian of a free scalar field of mass $m_H$, 

\[ L_H = -\tfrac12 \left( (\partial^m h)(\partial_m h) + m_H^2 h^2 \right) . \tag{42.153} \]

The interpretation is that $h$ describes a spin 0 field of mass $m_H$, the Higgs boson. Other terms proportional 
to powers of $h$ in the Higgs Lagrangian (42.152) describe interactions between the Higgs boson $h$ and the 
weak gauge fields, and self interactions of the Higgs boson. 


42.4.11 Vector versus scalar: gauge versus Higgs fields 


The previous section 42.4.10 argued that the electroweak Higgs bivectors (42.124) are among the generators 
of the group Spin(5) × Spin(6) that contains the SM group $U_Y(1) \times SU_L(2) \times SU(3)$ and is the largest 
subgroup of Spin(11,1) that commutes with the Poincaré group generated by the multivectors (42.115). Yet 
the gauge fields of the SM are Lorentz vectors, while the electroweak Higgs fields are Lorentz scalars. Does 
that make any sense? 

On the one hand, if spinors satisfy a local gauge symmetry, then the associated gauge field arises as 
a connection in a gauge-covariant derivative, and must be a Lorentz vector. On the other hand, an 
electroweak Higgs field that acquires a non-zero vacuum expectation value must be a scalar, since otherwise it 
would impose a preferred spatial direction and rest frame, breaking Lorentz symmetry in contradiction to 
observation. 

The issue is salient here because the next section 42.4.12 explores how the Spin(5) × Spin(6) symmetry 
broke down to the observed SM symmetry. The electroweak Higgs multiplet is a scalar after electroweak 
symmetry breaking, but could it have been a vector before symmetry breaking? If the Spin(5) x Spin(6) 
symmetry was restored, then the Higgs bivectors, being among the generators of that symmetry, must have 
been vectors. Or did the Higgs bivectors remain scalars, in which case symmetry restoration would stop short 
at the Pati-Salam group Spin(4) x Spin(6)? 

Can a vector field somehow transition into a scalar field? In the conventional picture where the GUT and 
Lorentz groups are distinct, the spin of a field is a conserved charge associated with Lorentz symmetry, and it 
is natural to assume that spin remains an immutable property of a field through GUT symmetry breaking. But 
in the present construction, at least some aspects of symmetry breaking are associated with reconfiguration of 
spacetime rather than with Higgs fields. For example, SU(5) is broken because its generators fail to commute 
with the Lorentz generators $-iI_{duT}$ and $I_{rgb}$. It is worth remarking that in string theory the dimensionality 
of objects is not immutable. 


The next section 42.4.12 will argue that Spin(5) x Spin(6) is broken by a Higgs field that is the generator 
of a U(1) subgroup of Spin(5) x Spin(6). Similarly to the electroweak Higgs fields, the U(1) bivector calls 
attention to itself because it happens to be a generator of the Spin(5) x Spin(6) group that commutes with the 
Poincaré group, and it happens to have precisely the properties needed to break Spin(5) x Spin(6) down to 
the SM group. The posited U(1) Higgs field has the additional merits that: (1) it removes the baryon-lepton 
symmetry group $U_{B-L}(1)$, notably absent from the SM group, from being a symmetry of the SM; and (2) 
it generates fermionic mass terms distinct from those generated by the electroweak Higgs fields, mass terms 
that are needed to fill out the fermionic mass matrix in the presence of the T-bit in a manner consistent 
with observation, §42.4.14. 


The U(1) Higgs field that breaks Spin(5) x Spin(6) symmetry remains present today, and like the elec- 
troweak Higgs field must be a scalar to preserve the observed Lorentz symmetry of spacetime. If the 
Spin(5) x Spin(6) symmetry was restored before being broken, then the U(1) Higgs field, being one of 
the generators of Spin(5) x Spin(6), must have been a vector. Conversely, if the U(1) Higgs field remained a 
scalar before Spin(5) x Spin(6) symmetry breaking, then the restored group cannot have included the U(1) 
factor. But one of the consequences of the Cartan-Weyl-Dynkin theory discussed in §42.2 is that finitely 
generated Lie groups are direct products of irreducible groups (Maschke’s theorem), so the restored group 
must in fact have commuted with the U(1) Higgs factor. But the subgroup of Spin(5) x Spin(6) that com- 
mutes with the U(1) factor is precisely the SM group (this is the property that picks out the U(1) Higgs 
field in the first place). Thus the U(1) Higgs field cannot break Spin(5) x Spin(6) symmetry if it was a 
scalar before it broke the symmetry. Could it be that there was both a scalar and a vector U(1) field before 
Spin(5) x Spin(6) symmetry breaking, the former to break the symmetry, the latter to be a generator of the 
unbroken symmetry? This option is excluded because the U(1) scalar field would commute with the U(1) 
vector field, so the U(1) Higgs scalar would leave the U(1) gauge symmetry unbroken, contradicting the SM. 


The conclusion is that the U(1) Higgs field that breaks Spin(5) x Spin(6) symmetry must transition from 
vector to scalar at Spin(5) x Spin(6) symmetry breaking. It is natural then to suppose that the electroweak 
Higgs field likewise transitioned from vector to scalar at electroweak symmetry breaking. 




42.4.12 The Higgs field that breaks Spin(5) x Spin(6) symmetry 


As remarked in §42.4.10, the largest subgroup of Spin(11,1) that commutes with the Poincaré group is the 
group (42.123), Spin(5) × Spin(6). How does the group Spin(5) × Spin(6) break down to the observed SM 
group $U_Y(1) \times SU_L(2) \times SU(3)$? 

With two exceptions, every generator of Spin(5) × Spin(6) that preserves the number of $durgb$ up bits is a 
generator of the SM, and every generator of Spin(5) × Spin(6) that does not preserve the number of up bits 
is not a generator of the SM. The exceptions are the generator $iR$ of an overall phase transformation $U_R(1)$ 
of the $du$ subgroup Spin(4) of Spin(5), and the generator $iS$ of an overall phase transformation $U_S(1)$ of the 
$rgb$ group Spin(6), 
RSI YO aay SEY uana, S= YM aay =i Y anda. (42.154) 


i=d,u i=d,u i=r,g,b i=r,g,b 
The bivector $R$ equals the third Pauli generator of the right-handed weak group $SU_R(2)$. $R$-charge and 
$S$-charge are related to Spin(10) $durgb$ charges by 


R=d+u, S = 2(rio + gio + bio) = —(B-L). (42.155) 


The expression for S in terms of the baryon-lepton difference B — L is from equation (42.17). In terms of 
the bivectors R and S, hypercharge Y is, equation (42.31), 


Y=R-S. (42.156) 


It is natural to hypothesize that some linear combination $\boldsymbol{E}$ (in honour of Englert and Brout (1964), who 
proposed the Higgs mechanism marginally before Higgs (1964)) of the bivectors $R$ and $S$ is a Higgs field, 
that is, $\boldsymbol{E}$ acquires a non-zero expectation value $\langle\boldsymbol{E}\rangle$ in the Minkowski vacuum. As long as the coefficients 
of both $R$ and $S$ in $\langle\boldsymbol{E}\rangle$ are non-zero, then, excepting $R$ and $S$ themselves, the vacuum Higgs field $\langle\boldsymbol{E}\rangle$ 
fails to commute with non-SM generators of Spin(5) × Spin(6), thereby giving mass to the associated gauge fields, 
while commuting with all SM generators. Only a single combination of $R$ and $S$ can be a Higgs field; if both 
$R$ and $S$ were Higgs fields separately, then the hypercharge symmetry $U_Y(1)$ would be broken, contradicting 
the SM. If $\boldsymbol{E}$ is a Higgs field that acquires a non-zero vacuum expectation value $\langle\boldsymbol{E}\rangle$, then it: spontaneously breaks 
Spin(5) × Spin(6) to the SM group $U_Y(1) \times SU_L(2) \times SU(3)$; removes the $U_S(1)$ symmetry from being a 
symmetry of the SM; gives masses to gauge fields that are in Spin(5) × Spin(6) but not in the SM; provides 
another way, besides Dirac mass, to give masses to fermions, as discussed in §42.4.14 below; and generates 
a massive spin 0 Higgs boson. 

The Spin(5) × Spin(6) gauge connection is 

\[ g_w \boldsymbol{W}_m + g_c \boldsymbol{C}_m , \tag{42.157} \]

where $\boldsymbol{W}_m$ are gauge fields of Spin(5), and $\boldsymbol{C}_m$ are gauge fields of Spin(6), and $g_w$ and $g_c$ are 
dimensionless coupling parameters of the Spin(5) and Spin(6) groups. The groups $U_R(1)$ and $U_S(1)$ are subgroups 
respectively of Spin(5) and Spin(6). The part of the Spin(5) × Spin(6) connection (42.157) associated with 
the $U_R(1) \times U_S(1)$ symmetry is 

\[ i \left( g_w W_m^R R + g_c C_m^S S \right) , \tag{42.158} \]
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where W,® and C$, are the corresponding connection coefficients. The hypothesis is that some combination 
Em of the fields WẸ and CS, acquires a non-zero vacuum expectation value, while leaving the hypercharge 
symmetry Uy(1) unbroken. To achieve this goal, define rotated gauge fields Bm and Em by 


Bm \ _ / cos@5— —sin 656 WE 
( Em ) = ( sin O56 cos O56 ce : (42150) 
where the Spin(5) x Spin(6) mixing angle 656 is defined by 
sin O56 = a , COsdsg = r , JE=vVg9 +92. (42.160) 


In terms of the rotated fields Bm and Em, the Ug(1) x Us(1) connection (42.158) is 


i (Ju WRR + geC3 S) = i (gy BmY + gEm(sin?056 R + cos?056.S)) = i (gy BmY + gEm(S + sin?ðs6Y)) , 
(42.161) 
where the hypercharge coupling parameter gy is 


gy = Jw COS O56 = ge Sin O56 = g cos O56 Sin Ose . (42.162) 
The term proportional to $B_m Y$ in the connection (42.161) has the correct form for the $U_Y(1)$ hypercharge
connection. The hypercharge, weak, and colour coupling parameters are predicted to be related by

$$ \frac{g \, g_Y}{g_W \, g_C} = 1 . \tag{42.163} $$

In renormalization theory the coupling parameters vary with the energy at which they are probed. The
condition (42.163) is interpreted in the next section 42.4.13 as determining the energy scale of Spin(5) ×
Spin(6) symmetry breaking, which proves to be ~ $10^{12}$ GeV.
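
As a numerical illustration of equations (42.159)-(42.163), the following minimal sketch computes the mixing angle, the combined coupling, and the hypercharge coupling from a pair of couplings, and checks the decomposition (42.161) coefficient by coefficient. The coupling values and field amplitudes are hypothetical, chosen only for the check, and the relation $Y = R - S$ used in the comments is read off from the last equality of (42.161).

```python
import math

def mixing(g_W, g_C):
    """Mixing angle and couplings from equations (42.160) and (42.162)."""
    g = math.hypot(g_W, g_C)            # g = sqrt(g_W^2 + g_C^2)
    theta = math.atan2(g_W, g_C)        # sin(theta) = g_W/g, cos(theta) = g_C/g
    g_Y = g * math.sin(theta) * math.cos(theta)
    return theta, g, g_Y

def rotate(theta, W_R, C_S):
    """Rotated fields (B, E) of equation (42.159)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * W_R - s * C_S, s * W_R + c * C_S

g_W, g_C = 0.55, 0.60                   # hypothetical coupling values
theta, g, g_Y = mixing(g_W, g_C)
W_R, C_S = 1.3, -0.7                    # hypothetical field amplitudes
B, E = rotate(theta, W_R, C_S)

# Coefficients of R and S on both sides of (42.161), using Y = R - S.
lhs_R, lhs_S = g_W * W_R, g_C * C_S
rhs_R = g_Y * B + g * E * math.sin(theta) ** 2
rhs_S = -g_Y * B + g * E * math.cos(theta) ** 2

ratio = g * g_Y / (g_W * g_C)           # condition (42.163)
print(ratio, lhs_R - rhs_R, lhs_S - rhs_S)
```

At a fixed energy the ratio in (42.163) is an identity of the definitions; the physical content of the condition appears only when the three measured standard-model couplings are run to a common scale, as in the next section.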

The term proportional to $E_m$ in the connection (42.161) must be interpreted as the Higgs field. To make
this work, it is necessary to assume that $E_m$ ceases to be a Lorentz vector field, and instead becomes a
Lorentz scalar field $E$. This is essential because any field that acquires a non-zero vacuum expectation value
must be a Lorentz scalar, to avoid destroying Lorentz symmetry. As discussed in §42.4.11, the transition
from vector to scalar is a logical necessity. Define therefore the Higgs field $\boldsymbol{E}$ to be, per the term proportional
to $E_m$ in the connection (42.161),

$$ \boldsymbol{E} \equiv i E \left( \sin^2\!\theta_{56} \, R + \cos^2\!\theta_{56} \, S \right) . \tag{42.164} $$


The magnitude $E$ acquires a non-zero expectation value $\langle E \rangle$ in the Minkowski vacuum. The Spin(5) ×
Spin(6) fields $W_m$ and $C_m$ with non-vanishing commutators with the $E$ Higgs field (42.164) are the 12 fields
comprising: first, the 4 electroweak Higgs fields (42.124); second, the 2 right-handed weak fields given by
equations (42.28) with weak indices $i, j = d, u$; and third, the 6 leptoquark fields given by equations (42.28)
with $i, j$ drawn from pairs of colour indices $r, g, b$. The four electroweak Higgs fields carry one unit of $d$ or $u$
charge, and zero colour charge $r, g, b$. They transform right-handed leptons and quarks into their left-handed
partners, such as $e_R \leftrightarrow e_L$, and their antiparticle versions. The two right-handed weak fields carry two units
of $d, u$ charge and zero colour charge $r, g, b$. They transform right-handed leptons and quarks into their
right-handed weak partners, $e_R \leftrightarrow \nu_R$ or $d_R \leftrightarrow u_R$, and their antiparticle versions. Leptoquarks carry zero $d$ or $u$
charge, and two units of $r, g, b$ charge. They are called leptoquarks because they transform between leptons
and quarks, $d \leftrightarrow e$ and $u \leftrightarrow \nu$ in both right- and left-handed versions, and in both particle and antiparticle
versions. The non-vanishing commutators of the Spin(5) × Spin(6) fields with the component fields $R$ and $S$
of the Higgs field $E$ are


ifort, R] = Ey YE i=d,u, (42.165a 
ENG — WG) El =a ENN ij=du, (42.165b 
Sie + 77), Rla= ay top ay ij =du , (42.165¢ 
bay RGIS tye) iin rgd, (42.165d 
AGARA ae %) =F +H) ijin rgb. (42.165¢e 


The top line (42.165a) gives the commutators for the 4 electroweak Higgs fields (42.124); the second and third
lines (42.165b) and (42.165c) give the commutators for the 2 right-handed weak fields; and the fourth and fifth
lines (42.165d) and (42.165e) give the commutators for the 6 leptoquark fields. In each case the commutator of
the field yields another field of the same species. The scalar product of each commutator with its reverse
equals 1 for the top line (42.165a), 2 for the second and third lines (42.165b) and (42.165c), and $\tfrac{8}{9}$ for the
bottom two lines (42.165d) and (42.165e).

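The covariant derivative of the expectation value $\langle E \rangle$ of the Higgs field (42.164) is

$$ D_m \langle E \rangle = i \langle E \rangle \left( g_W \sin^2\!\theta_{56} \, [W_m, R] + g_C \cos^2\!\theta_{56} \, [C_m, S] \right) . \tag{42.166} $$

From the commutators (42.165), the square of the covariant derivative (42.166) is

$$ ( D^m \langle E \rangle ) \cdot ( D_m \langle E \rangle ) = g^2 \langle E \rangle^2 \left( \sin^6\!\theta_{56} \left( (H^i_m)^2 + 2 (W^{ij}_m)^2 \right) + \tfrac{8}{9} \cos^6\!\theta_{56} \, (C^{ij}_m)^2 \right) , \tag{42.167} $$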


where the fields on the right hand side are the subset of the weak and colour fields $W_m$ and $C_m$ that fail to
commute with the Higgs field $E$. The fields $H^i_m$ are the 4 electroweak Higgs fields, with index $i$ running over
$d, \bar{d}, u, \bar{u}$; the Higgs fields are taken here to be vectors in accordance with the arguments in §42.4.11. The $W^{ij}_m$
are the 2 right-handed weak fields, with index $ij$ running over $du, \bar{d}\bar{u}$. The $C^{ij}_m$ are the 6 leptoquark fields,
with index $ij$ running over $rg, gb, br, \bar{r}\bar{g}, \bar{g}\bar{b}, \bar{b}\bar{r}$. The fields carry SM charges in accordance with their $durgb$
indices. Equation (42.167) shows that the Higgs field $E$ generates masses for the non-commuting fields,

$$ m_H = g \sin^3\!\theta_{56} \langle E \rangle , \quad m_W = \sqrt{2} \, g \sin^3\!\theta_{56} \langle E \rangle , \quad m_C = \tfrac{2\sqrt{2}}{3} \, g \cos^3\!\theta_{56} \langle E \rangle . \tag{42.168} $$


The mass $m_H$ is the mass of each of the 4 electroweak Higgs fields after Spin(5) × Spin(6) symmetry breaking
but before electroweak symmetry breaking. This mass $m_H$ is different from the mass of the electroweak
Higgs boson after electroweak symmetry breaking, because the masses are generated in different ways. The
mass $m_W$ is the mass of each of the 2 right-handed weak gauge bosons after Spin(5) × Spin(6) symmetry
breaking; the mass is different from the mass of the 2 left-handed charged weak gauge bosons. The mass
$m_C$ is the mass of each of the 6 leptoquark gauge bosons after Spin(5) × Spin(6) symmetry breaking. A
prediction of the model is that the masses $m_H$, $m_W$, and $m_C$ are related by

$$ \sqrt{2} \, m_H = m_W = \tfrac{3}{2} \, m_C \tan^3\!\theta_{56} . \tag{42.169} $$
After Spin(5) × Spin(6) symmetry breaking but before electroweak symmetry breaking, the symmetry
group is the usual SM group $U_Y(1) \times SU_L(2) \times SU(3)$. The Spin(5) × Spin(6) gauge connection reduces to
the SM connection

$$ i g_Y B_m Y + g_W W_m + g_C C_m , \tag{42.170} $$

in which the fields that remain are the subset of Spin(5) × Spin(6) fields $W_m$ and $C_m$ that (aside from $E$
itself) commute with the Higgs field $E$, and therefore remain massless and unbroken. The SM fields are the
1 hypercharge field $i B_m Y$, the 3 left-handed weak fields $W_m$, and the 8 colour (gluon) fields $C_m$.


42.4.13 Running of coupling parameters 


According to renormalization theory, to leading (one-loop) order, the coupling parameter $g$ associated with
a gauge group $G$ varies with the log of the cutoff energy $\mu$ as (e.g. Peskin (1997, eq. (39)), or Schienbein
et al. (2019, eq. (5.15)))

$$ \frac{d g^{-2}}{d \ln \mu} = \frac{1}{16 \pi^2} \left( \tfrac{11}{3} S_2(G, \mathrm{adj}) - \tfrac{2}{3} n_f S_2(G, \mathrm{spinor}) - \tfrac{1}{3} n_s S_2(G, \mathrm{scalar}) \right) , \tag{42.171} $$


where $S_2(G, \mathrm{rep})$ is the Dynkin index of the representation of the group, equation (42.39), and $n_f$ and $n_s$ are
respectively the number of fermion and scalar multiplets that couple to the gauge group $G$. The normalization
$1/(16\pi^2)$ in equation (42.171) is a factor $\tfrac{1}{2}$ that of Peskin (1997) or Schienbein et al. (2019); the Dynkin
indices $S_2$ here are correspondingly a factor of 2 times those of Peskin or Schienbein et al. The difference
in normalization is a choice of units, as discussed for example around equation (42.74). The normalization
adopted here and by Slansky (1981) corresponds to unit separation of charges on the charge lattice (blue
line in Figure 42.1), whereas the normalization adopted by Peskin and Schienbein et al. corresponds to unit
separation along a diagonal direction (black line in Figure 42.1). Technically, the Dynkin index $S_2$ is additive
over distinct multiplets, so the fermion and scalar numbers $n_f$ and $n_s$ in equation (42.171) could be omitted;
the numbers are included as a reminder to sum the Dynkin index over the multiplets that the group acts
on. A particle and its antiparticle count as belonging to the same multiplet.

The Lagrangian involves a product of a coupling parameter g and the associated charge operator; for 
example, the hypercharge Lagrangian involves the product gyY of the coupling parameter gy and the 
associated charge operator Y. Invariance of the Lagrangian requires the product to be invariant under 
rescaling charge, so the coupling parameter must scale inversely with charge. The units of equation (42.171) 
are thus consistent, as they must be: both sides have units of charge squared. 

According to Table 42.2, the Dynkin index of a multiplet in the adjoint or spinor representations relevant 
here is 


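$$ S_2(\mathrm{SU}(N), \mathrm{adj}) = 2N , \quad S_2(\mathrm{SU}(N), \mathrm{spinor}) = 1 , \tag{42.172a} $$
$$ S_2(\mathrm{Spin}(N), \mathrm{adj}) = 2(N-2) , \quad S_2(\mathrm{Spin}(N), \mathrm{spinor}) = 2^{[(N-1)/2]-2} . \tag{42.172b} $$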


The case of $U_Y(1)$, whose Dynkin index depends on a correct choice of units of the hypercharge $Y$, is
addressed in the next paragraph. The adjoint representation is the bivector representation, the multivector
representation of grade $p = 2$. Spin($N$) has a unique spin representation, but SU($N$) has spinor
representations of spinor grades $p \le N/2$. The spinor representation of SU($N$) given by equation (42.172a) is that for
spinor grade $p = 1$, which is the only non-trivial spinor grade for the groups SU(2) and SU(3) considered
here.

The Dynkin index $S_2$, equation (42.39), of a multiplet equals the trace of the square of an orthonormal
generator, suitably normalized. The hypercharge group $U_Y(1)$ has a single generator, the hypercharge $Y$.
The Dynkin index of $U_Y(1)$ that enters equation (42.171) equals the hypercharge squared summed over
the particles of the SM, suitably normalized. The sum of squared hypercharges $Y^2$ of the 16 fermions in
Table 42.1 is $\tfrac{40}{3}$. By comparison, the sum over the third weak Pauli matrix squared $\tau_3^2 = (2 I_L)^2$ of the same
16 fermions is 8. The $U_Y(1)$ Dynkin index $S_2(U_Y(1), \mathrm{spinor})$ is therefore an average of $\tfrac{40}{3} / 8 = \tfrac{5}{3}$ times that
of $SU_L(2)$, per fermion. This is the factor $\tfrac{5}{3}$ that enters equation (42.173a). Not coincidentally, the number
$\tfrac{5}{3}$ equals the ratio of the length squared of the hypercharge generator $Y$ to that of the weak generator $\tau_3$
on the lattice of charges, Figure 42.1. It is common practice to scale the hypercharge squared $Y^2$ by $\tfrac{3}{5}$, and
accordingly to scale the inverse coupling strength $g_Y^{-2}$ by $\tfrac{3}{5}$, on the grounds that when unification occurs,
the orthonormal generators should be normalized so their squares all have the same trace. This adjustment
is not needed and not made here.
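
The factor $\tfrac{5}{3}$ can be checked by direct summation over one generation of 16 fermions. The sketch below is illustrative only: Table 42.1 is not reproduced here, so the hypercharges are assumed to follow the common convention $Q = I_3 + Y/2$, with $\tau_3 = \pm 1$ for left-handed doublet members and $0$ for right-handed singlets.

```python
from fractions import Fraction as F

# (name, hypercharge Y, tau_3, number of colour copies) for one generation.
# Assumed convention: Q = I_3 + Y/2; tau_3 = 2 I_3 vanishes for right-handed states.
fermions = [
    ("nu_L", F(-1), 1, 1), ("e_L", F(-1), -1, 1),
    ("nu_R", F(0), 0, 1),  ("e_R", F(-2), 0, 1),
    ("u_L", F(1, 3), 1, 3), ("d_L", F(1, 3), -1, 3),
    ("u_R", F(4, 3), 0, 3), ("d_R", F(-2, 3), 0, 3),
]

n_states = sum(n for _, _, _, n in fermions)
sum_Y2 = sum(n * Y ** 2 for _, Y, _, n in fermions)       # sum of Y^2
sum_tau3_sq = sum(n * t ** 2 for _, _, t, n in fermions)  # sum of tau_3^2
ratio = sum_Y2 / sum_tau3_sq

print(n_states, sum_Y2, sum_tau3_sq, ratio)
```

Exact rational arithmetic is used so that the ratio comes out as a fraction rather than a rounded decimal.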

In the SM, the left-handed weak group $SU_L(2)$ acts on 4 fermion multiplets, namely the $(\nu_L, e_L)$ left-handed
lepton multiplet, and the three $(u^c_L, d^c_L)$ left-handed quark multiplets of colours $c = r, g, b$. The colour group
SU(3) acts on 4 fermion multiplets, namely the left- and right-handed up and down quark multiplets $u_L$, $u_R$,
$d_L$, and $d_R$. Each fermion multiplet comes in 3 generations, so the number of fermions in equation (42.171)
is $n_f = 4 \times 3 = 12$ for each of $SU_L(2)$ and SU(3).

The weak group Spin(5), which unifies left- and right-handed fermions, acts on enlarged fermion multiplets
that include both left- and right-handed weak chiral components, the lepton multiplet $(\nu_L, e_L, \nu_R, e_R)$, and
the three quark multiplets $(u^c_L, d^c_L, u^c_R, d^c_R)$ of colours $c = r, g, b$. The colour group Spin(6) acts on enlarged
fermion multiplets that contain leptons as well as quarks, $(\nu_L, u_L)$, $(\nu_R, u_R)$, $(e_L, d_L)$, and $(e_R, d_R)$. In both
Spin(5) and Spin(6), the number of fermion multiplets is still $n_f = 4 \times 3 = 12$.

However, T-bit doubling doubles the number $n_f$ of fermion multiplets, as is evident in the chart (42.110),
and as will be discussed further in §42.4.14. Each spinor comes in two species, with weak chirality respectively
aligned and anti-aligned with its Dirac chirality, equations (42.174). To be consistent with observation, after
electroweak symmetry breaking only spinors whose weak chirality aligns (approximately) with their Dirac
chirality remain observably light, while spinors whose weak chirality anti-aligns with their Dirac chirality
become unobservably massive, and do not contribute to the running of coupling parameters. Between
electroweak and grand symmetry breaking, the number of fermion multiplets could vary from $n_f = 4 \times 3 = 12$
(only light fermions contribute) to $n_f = 2 \times 4 \times 3 = 24$ (all fermions, both light and massive, contribute).
The number $n_f$ of fermion multiplets could increase incrementally between electroweak and grand symmetry
breaking, as the energy scale $\mu$ rises above the mass of each heavy fermion multiplet. The observed masses
of fermions after electroweak symmetry breaking, Table 42.3, vary from tiny compared to the electroweak
scale (neutrinos) to comparable to the electroweak scale (the top quark, at 173 GeV). The masses of heavy
fermions could be similarly irregular.

In the SM, there are various hypotheses for the “Higgs sector” prior to electroweak symmetry breaking,
the common denominator being that there must be 4 real (or 2 complex) scalar fields that carry hypercharge
and weak charge, and transform appropriately under $U_Y(1) \times SU_L(2)$. In the “minimal” model, the Higgs
fields form a complex massless field that transforms as a spinor doublet under $SU_L(2)$, and as a scalar under
Lorentz transformations. In this model, the number of scalars is $n_s = 1$ for $U_Y(1)$ and $SU_L(2)$, and $n_s = 0$
for SU(3).

Figure 42.4 (Left) Symmetry breaking of Spin(5) × Spin(6) to the standard model should occur where $g g_Y / (g_W g_C) = 1$.
The running of coupling parameters with energy $\mu$, equation (42.171), depends on the number $n_f$ of fermion multiplets,
which in the present construction depends on the unknown masses of the massive fermions predicted to accompany
the known light fermions, but should be between $n_f = 12$ (only light fermions have energies below $\mu$) and $n_f = 24$ (both
light and heavy fermions have energies below $\mu$). The graph shows the running for both limiting cases $n_f = 12$ and
$n_f = 24$, and (thicker line) an illustrative case where $n_f$ increases from 12 to 24 as the energy scale $\mu$ increases from
electroweak to grand symmetry breaking. The energy scale of Spin(5) × Spin(6) symmetry breaking is predicted to be
$\mu \sim 10^{12}$ GeV, with a factor of ~ 3 uncertainty from the uncertainty in $n_f$. (Right) Running of the standard-model
coupling parameters $g_Y$, $g_W$, and $g_C$ with renormalization energy scale $\mu$, equation (42.171), for the illustrative case
where $n_f$ increases from 12 to 24 between electroweak and grand symmetry breaking (thick line in the left graph). The
transition from the standard-model group $U_Y(1) \times SU_L(2) \times SU(3)$ to Spin(5) × Spin(6) occurs at $\mu \approx 6 \times 10^{11}$ GeV.
Regardless of $n_f$, grand unification, in the sense that the weak and colour couplings $g_W$ and $g_C$ are equal, occurs at
$\mu \sim 3 \times 10^{17}$ GeV.

In the present construction, as described in §42.4.11, the electroweak Higgs fields transition from being
massive gauge fields before electroweak symmetry breaking to being massive scalar fields after electroweak
symmetry breaking. There are never any light scalar fields, so the number of scalars in equation (42.171) is
always $n_s = 0$.


In summary, the factor in parentheses on the right hand side of equation (42.171) for the running of
coupling parameters is, for each of the groups relevant here,

\begin{align}
U_Y(1): &\quad -\tfrac{2}{3} \times \tfrac{5}{3} \times n_f , \tag{42.173a} \\
SU_L(2): &\quad \tfrac{11}{3} \times 4 - \tfrac{2}{3} \times 1 \times n_f , \tag{42.173b} \\
SU(3): &\quad \tfrac{11}{3} \times 6 - \tfrac{2}{3} \times 1 \times n_f , \tag{42.173c} \\
\mathrm{Spin}(5): &\quad \tfrac{11}{3} \times 6 - \tfrac{2}{3} \times 1 \times n_f , \tag{42.173d} \\
\mathrm{Spin}(6): &\quad \tfrac{11}{3} \times 8 - \tfrac{2}{3} \times 1 \times n_f . \tag{42.173e}
\end{align}

Strictly, the running of the hypercharge coupling $g_Y$ should take into account the actual hypercharges of
the fermions whose masses fall below the running energy $\mu$, but the expression (42.173a) is adequate for the
present purpose.
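
The factors (42.173) follow mechanically from the Dynkin indices (42.172) inserted into equation (42.171). The following sketch tabulates them in exact arithmetic; the function names are hypothetical, the coefficients $\tfrac{11}{3}$, $\tfrac{2}{3}$, $\tfrac{1}{3}$ are the usual one-loop values in the normalization described above, and the $U_Y(1)$ entry uses the per-fermion factor $\tfrac{5}{3}$ discussed earlier.

```python
from fractions import Fraction as F

def s2_adjoint(family, N):
    """Dynkin index of the adjoint representation, equations (42.172)."""
    if family == "SU":
        return F(2 * N)
    if family == "Spin":
        return F(2 * (N - 2))
    return F(0)                       # abelian U(1): no self-coupling

def s2_spinor(family, N):
    """Dynkin index of the spinor representation, equations (42.172)."""
    if family == "SU":
        return F(1)
    if family == "Spin":
        return F(2) ** ((N - 1) // 2 - 2)
    return F(5, 3)                    # U_Y(1): hypercharge factor per fermion

def running_factor(family, N, n_f, n_s=0, s2_scalar=F(0)):
    """Bracket on the right hand side of equation (42.171)."""
    return (F(11, 3) * s2_adjoint(family, N)
            - F(2, 3) * n_f * s2_spinor(family, N)
            - F(1, 3) * n_s * s2_scalar)

groups = {"U_Y(1)": ("U", 1), "SU_L(2)": ("SU", 2), "SU(3)": ("SU", 3),
          "Spin(5)": ("Spin", 5), "Spin(6)": ("Spin", 6)}
table = {name: running_factor(fam, N, n_f=12) for name, (fam, N) in groups.items()}
for name, value in table.items():
    print(name, value)
```

Changing `n_f` shifts the four non-abelian factors by the same amount, which is the property used below to explain why the unification energy is insensitive to $n_f$.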

The right panel of Figure 42.4 shows the running of the hypercharge, weak, and colour coupling parameters
$g_Y$, $g_W$, and $g_C$ as a function of the renormalization cutoff energy $\mu$, for an illustrative model in which
the number of fermion multiplets $n_f$ increases from 12 at electroweak symmetry breaking to 24 at grand
symmetry breaking. More precisely, in the model shown, the number of fermion multiplets increases in
equally spaced increments of $\log \mu$ from $n_f = 12$ at electroweak symmetry breaking ($\mu = 91$ GeV, the
Z-boson mass), to $n_f = 18$ at Spin(5) × Spin(6) symmetry breaking ($\mu = 6 \times 10^{11}$ GeV), and then by further
equally spaced increments of $\log \mu$ to $n_f = 24$ at grand symmetry breaking ($\mu = 3 \times 10^{17}$ GeV). The left
panel of Figure 42.4 shows the combination $g g_Y / (g_W g_C)$, equation (42.163), which is predicted to be 1 at
Spin(5) × Spin(6) symmetry breaking, for the above-mentioned model, as well as for limiting models where the
number of fermion multiplets is constant between electroweak and grand symmetry breaking, with limiting
values $n_f = 12$ and 24. The condition (42.163) for Spin(5) × Spin(6) symmetry breaking occurs at an energy
$\mu \sim 10^{12}$ GeV to within a factor of 3. More precisely, for the three models shown, condition (42.163) holds
at respectively $4 \times 10^{11}$ GeV ($n_f = 12$), $6 \times 10^{11}$ GeV ($n_f$ increasing from 12 to 24), and $3 \times 10^{12}$ GeV
($n_f = 24$). The ratios $m_C / m_W$ of masses of leptoquark to right-handed weak gauge bosons predicted by
equation (42.169) are respectively 1.09, 1.07, and 0.96.

Grand unification occurs where the weak and colour couplings coincide, $g_W = g_C$, which happens at
an energy of $\mu \approx 3 \times 10^{17}$ GeV irrespective of how the number $n_f$ of fermion multiplets varies between
electroweak and grand symmetry breaking. The reason for the insensitivity to $n_f$ is that, according to the
expressions (42.173), the running of couplings has the same dependence on $n_f$ for all four groups $SU_L(2)$,
SU(3), Spin(5), and Spin(6).
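
The energy scales quoted above can be estimated with a few lines of arithmetic, since at one loop each $g^{-2}$ is linear in $\ln \mu$ when $n_f$ is held constant. The sketch below is a rough illustration rather than a reproduction of Figure 42.4: the input values at the Z mass (the inverse fine-structure constant, the weak mixing angle, and the strong coupling) are approximate numbers assumed here, not taken from the text. The breaking condition $g g_Y = g_W g_C$ with $g^2 = g_W^2 + g_C^2$ is equivalent to $g_Y^{-2} = g_W^{-2} + g_C^{-2}$.

```python
import math

M_Z = 91.19            # GeV, Z-boson mass (assumed input)
ALPHA_EM_INV = 127.95  # assumed inverse fine-structure constant at M_Z
SIN2_THETA_W = 0.2312  # assumed weak mixing angle at M_Z
ALPHA_S = 0.118        # assumed strong coupling at M_Z

def inverse_g2_at_mz():
    """g^-2 = alpha^-1 / (4 pi) for the hypercharge, weak, and colour couplings."""
    inv_alpha = {"Y": ALPHA_EM_INV * (1.0 - SIN2_THETA_W),
                 "W": ALPHA_EM_INV * SIN2_THETA_W,
                 "C": 1.0 / ALPHA_S}
    return {k: v / (4.0 * math.pi) for k, v in inv_alpha.items()}

def slopes(n_f):
    """d g^-2 / d ln(mu) from equations (42.171) and (42.173a-c), with n_s = 0."""
    pref = 1.0 / (16.0 * math.pi ** 2)
    return {"Y": pref * (-(2 / 3) * (5 / 3) * n_f),
            "W": pref * ((11 / 3) * 4 - (2 / 3) * n_f),
            "C": pref * ((11 / 3) * 6 - (2 / 3) * n_f)}

def breaking_scale(n_f):
    """Energy where g_Y^-2 = g_W^-2 + g_C^-2, i.e. condition (42.163)."""
    a, s = inverse_g2_at_mz(), slopes(n_f)
    t = (a["Y"] - a["W"] - a["C"]) / (s["W"] + s["C"] - s["Y"])
    return M_Z * math.exp(t)

def unification_scale(n_f):
    """Energy where g_W = g_C. The difference of slopes is the same for
    Spin(5), Spin(6) as for SU_L(2), SU(3), so the SM slopes suffice here."""
    a, s = inverse_g2_at_mz(), slopes(n_f)
    t = (a["W"] - a["C"]) / (s["C"] - s["W"])
    return M_Z * math.exp(t)

for n_f in (12, 24):
    print(n_f, f"{breaking_scale(n_f):.2e}", f"{unification_scale(n_f):.2e}")
```

With these assumed inputs the breaking scale lands within the factor-of-3 band around $10^{12}$ GeV, and the unification scale is independent of `n_f` because the $n_f$ terms cancel in the difference of slopes.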


42.4.14 Fermion masses 


The T-bit emerged in §42.4.3 as a byproduct of enlarging Spin(10) to Spin(11, 1) in order to accommodate
a time dimension. Adding the T-bit doubles the number of spinors from the $2^5 = 32$ spinors of Spin(10)
to $2^6 = 64$, each spinor coming in T-bit-up and T-bit-down varieties. Thus the electron, for example, fills
out 8 entries in the Spin(11, 1) chart (42.110), in place of the 4 entries in the Spin(10) chart (42.12). The 8
electron components in the Spin(11, 1) chart (42.110) group into four 2-component electrons of various weak
and Dirac chiralities,


err = {errt, erry} = {I rgb, du}, em = {ems, erry} = {dT rgb, u} , (42.174a) 
ert = {ertt, erty} ={rgb,duT}, em = {emt, einy} = {drgb, uT} . (42.174b) 


The first component of each 2-component electron in equations (42.174) has spin up ($\uparrow$), the second spin
down ($\downarrow$). Electrons $e_l$ with left-handed weak chirality have non-zero left-handed isospin $I_L$ ($d$- and $u$-bits
anti-aligned, equation (42.13b)) and experience the $SU_L(2)$ force, while electrons $e_r$ with right-handed weak
chirality have zero left-handed isospin $I_L$ ($d$- and $u$-bits aligned), and do not experience the weak force.
Weak chirality $r$ or $l$ is to be distinguished from Dirac chirality $R$ or $L$, which in the present construction
coincides with Spin(11, 1) chirality $J$, equation (42.116). Right- and left-handed Dirac or Spin(11, 1) chirality
correspond to boost and spin respectively aligned and anti-aligned, $R = \{\Uparrow\uparrow, \Downarrow\downarrow\}$ and $L = \{\Downarrow\uparrow, \Uparrow\downarrow\}$.

The Higgs fields discussed in sections 42.4.10 and 42.4.12 provide mass terms that link the 4 same- 
spin components of a fermion species. The electroweak Higgs field (42.125) generates Dirac mass terms 
mp and mq proportional to (H)yp x yg Y$ Yr that flip the d-bit. The Ug(1) Higgs field (42.164) that 
breaks Spin(5) x Spin(6) symmetry provides T-mass terms mr and m, proportional to (Ejyp that flip 
the T-bit. Besides the electroweak and Spin(5) x Spin(6) Higgs fields, there is the possibility of a scalar 
field generated by the unit multivector, which is allowed because it commutes with everything. The unit 
multivector generates a mass term proportional to yy, which like the Ug(1) Higgs field flips the T-bit. There 
are three regimes of energy, the three mass terms turning on successively as the energy decreases. Between 
grand and Spin(5) x Spin(6) symmetry breaking only the mass term generated by the unit multivector 
contributes; between Spin(5) x Spin(6) and electroweak symmetry breaking the unit-multivector and Ug(1) 
mass terms contribute; and after electroweak symmetry breaking all three mass terms contribute, the unit- 
multivector, Ug(1), and electroweak mass terms. 

The mass terms connecting the electron (for example) components in the chart (42.110) are 


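$$
\begin{array}{ccc}
e_{rR} & \stackrel{m_d}{\longleftrightarrow} & e_{lL} \\
{\scriptstyle m_T} \updownarrow & & \updownarrow {\scriptstyle m_t} \\
e_{rL} & \stackrel{m_D}{\longleftrightarrow} & e_{lR}
\end{array}
\tag{42.175}
$$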


The upper and lower rows of the diagram (42.175) are for T-bit respectively up and down, while the left and 
right columns are for d-bit respectively down and up. 

Before electroweak symmetry breaking, when $d$ charge is conserved, the Dirac mass terms vanish, and only
mass terms that flip the T-bit are non-zero. The 8 components (42.174) of the electron split into two massive
4-component species, a weakly interacting electron $e_l$ and a non-weakly interacting para-electron $e_r$, each of
which is an equal linear combination of components of opposite Dirac chirality but like weak chirality,

$$ e_r = \frac{e_{rR} - i e_{rL}}{\sqrt{2}} , \quad e_l = \frac{e_{lR} - i e_{lL}}{\sqrt{2}} , \tag{42.176a} $$
$$ \bar{e}_r = \frac{-i e_{rR} + e_{rL}}{\sqrt{2}} , \quad \bar{e}_l = \frac{-i e_{lR} + e_{lL}}{\sqrt{2}} . \tag{42.176b} $$

The Dirac chiral components $R$ and $L$ of each species are coupled by the mass terms $m_T$ and $m_t$ that flip the
T-bit, in accordance with the diagram (42.175). The anti-electron eigenstates $\bar{e}_r$ and $\bar{e}_l$ are, modulo a phase,
complex conjugates of the electron eigenstates $e_r$ and $e_l$. Having distinct weak interactions, the masses of
the non-weak and weak electrons $e_r$ and $e_l$ could be different.

After electroweak symmetry breaking, $d$ charge is not conserved, and the 8 electron components (42.174)
are coupled not only by mass terms that flip the T-bit, but also by Dirac mass terms that flip the $d$-bit. The
8-component electron has 4 mass eigenstates, which can be labelled as 2 electron eigenstates $e_\pm$ with rest
masses $m_\pm$, and 2 anti-electron (positron) eigenstates $\bar{e}_\pm$ with rest masses $-m_\pm$. To be consistent with the
fact that only one species of electron is observed, the observed electron must be identified with the lighter
mass eigenstate $e_-$ with mass $m_-$, while to elude observation the heavier mass eigenstate $e_+$ must have mass
$m_+$ greater than, possibly much greater than, the electroweak scale. The heavier mass $m_+$ must be much
larger than the lighter mass $m_-$,

$$ m_+ \gg m_- . \tag{42.177} $$

The heavier mass electron eigenstate $e_+$ cannot be a second generation of electron (a muon or tauon), since
as will be seen momentarily, equation (42.178b), the heavier electron has Dirac chirality opposite to its weak
chirality, whereas a second generation of electron would, like the electron itself, have Dirac chirality equal to
its weak chirality.

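Experiment establishes that the weak chirality of observed electrons (and other fundamental fermions)
coincides with their Dirac chirality; that is, only left-handed ($L$) electrons (and right-handed ($R$) positrons)
experience the weak force. The electron $e_-$ and its positron partner $\bar{e}_-$, and the heavy electron states $e_+$
and $\bar{e}_+$ orthogonal to them, must be

$$ e_- \approx \frac{-e_{rR} - i e_{lL}}{\sqrt{2}} , \quad \bar{e}_- \approx \frac{i e_{rR} + e_{lL}}{\sqrt{2}} , \tag{42.178a} $$
$$ e_+ \approx \frac{e_{lR} - i e_{rL}}{\sqrt{2}} , \quad \bar{e}_+ \approx \frac{-i e_{lR} + e_{rL}}{\sqrt{2}} . \tag{42.178b} $$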


The relations (42.178) are written as approximations, not equalities, to allow the possibility that there could 
be some small departure from exact equality of weak and Dirac chirality of light electrons. The heavy electron 
eigenstates (42.178b) have the same SM charges as the light electron eigenstates (42.178a), but the heavy 
electrons have Dirac chirality opposite to their weak chirality. 

To find the most general mass eigenstate of the 8-component electron (42.174), start with the fact that
each mass eigenstate in its rest frame must be an equal linear combination of massless right- and left-handed
Dirac chiral components $e_{\pm R}$ and $e_{\pm L}$,

$$ e_\pm = \frac{e_{\pm R} - i e_{\pm L}}{\sqrt{2}} , \quad \bar{e}_\pm = \frac{-i e_{\pm R} + e_{\pm L}}{\sqrt{2}} . \tag{42.179} $$

The anti-electron eigenstates $\bar{e}_\pm$ are, modulo a phase, complex conjugates of the electron eigenstates $e_\pm$.


The most general Dirac chiral eigenstate $e_{mX}$, with the mass index $m$ running over $+$ and $-$, and the chiral
index $X$ running over right- and left-handed Dirac chiralities $R$ and $L$, is obtained by rotating weak and
Dirac chiral eigenstates $e_{wX}$, with the weak index $w$ running over weak chiralities $r$ and $l$, by an element $R_X$
of SU(2), a Pauli rotor, equation (13.120),

$$ e_{mX} = R_X e_{wX} . \tag{42.180} $$

When the rotor $R_X$ is resolved into rotations by 3 Euler angles, the initial and final Euler rotations about the
3-axis can be absorbed into a rephasing of the components $e_{mX}$ and $e_{wX}$ of the mass and weak eigenstates,
reducing equation (42.180) to


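$$ \begin{pmatrix} e_{+R} \\ e_{-R} \end{pmatrix} = \begin{pmatrix} \sin\theta_R & \cos\theta_R \\ -\cos\theta_R & \sin\theta_R \end{pmatrix} \begin{pmatrix} e_{rR} \\ e_{lR} \end{pmatrix} , \quad \begin{pmatrix} e_{+L} \\ e_{-L} \end{pmatrix} = \begin{pmatrix} \cos\theta_L & \sin\theta_L \\ -\sin\theta_L & \cos\theta_L \end{pmatrix} \begin{pmatrix} e_{rL} \\ e_{lL} \end{pmatrix} . \tag{42.181} $$

The relations (42.176) and approximations (42.178) indicate that

$$ \theta_R = \tfrac{\pi}{2} \ \text{and} \ \theta_L = 0 \quad \text{before electroweak symmetry breaking} , \tag{42.182a} $$
$$ \theta_R \to 0 \ \text{and} \ \theta_L \to 0 \quad \text{after electroweak symmetry breaking} . \tag{42.182b} $$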


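In full, the mass eigenstates $e_\pm$ and $\bar{e}_\pm$ are related to the weak chiral eigenstates $e_{wX}$ by, from combining
equations (42.179) and (42.181),

$$
\begin{array}{r} m_+ \\ m_- \\ -m_- \\ -m_+ \end{array}
\quad
\begin{pmatrix} e_+ \\ e_- \\ \bar{e}_- \\ \bar{e}_+ \end{pmatrix}
= \frac{1}{\sqrt{2}}
\begin{pmatrix}
\sin\theta_R & \cos\theta_R & -i\cos\theta_L & -i\sin\theta_L \\
-\cos\theta_R & \sin\theta_R & i\sin\theta_L & -i\cos\theta_L \\
i\cos\theta_R & -i\sin\theta_R & -\sin\theta_L & \cos\theta_L \\
-i\sin\theta_R & -i\cos\theta_R & \cos\theta_L & \sin\theta_L
\end{pmatrix}
\begin{pmatrix} e_{rR} \\ e_{lR} \\ e_{rL} \\ e_{lL} \end{pmatrix} .
\tag{42.183}
$$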
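
A quick consistency check on the transformation (42.183) is that the 4 × 4 matrix is unitary for any mixing angles, so the four mass eigenstates are orthonormal combinations of the four weak chiral components. The sketch below verifies this numerically; the matrix entries are those of (42.183) with rows ordered $e_+, e_-, \bar{e}_-, \bar{e}_+$ and columns $e_{rR}, e_{lR}, e_{rL}, e_{lL}$, and the helper names are hypothetical.

```python
import math

def mixing_matrix(theta_R, theta_L):
    """Matrix of equation (42.183), including the factor 1/sqrt(2)."""
    sR, cR = math.sin(theta_R), math.cos(theta_R)
    sL, cL = math.sin(theta_L), math.cos(theta_L)
    rows = [
        [sR, cR, -1j * cL, -1j * sL],        # e_+
        [-cR, sR, 1j * sL, -1j * cL],        # e_-
        [1j * cR, -1j * sR, -sL, cL],        # anti-e_-
        [-1j * sR, -1j * cR, cL, sL],        # anti-e_+
    ]
    norm = 1.0 / math.sqrt(2.0)
    return [[complex(x) * norm for x in row] for row in rows]

def is_unitary(U, tol=1e-12):
    """True if U times its Hermitian conjugate is the identity."""
    n = len(U)
    for a in range(n):
        for b in range(n):
            dot = sum(U[a][k] * U[b][k].conjugate() for k in range(n))
            target = 1.0 if a == b else 0.0
            if abs(dot - target) > tol:
                return False
    return True

U_before = mixing_matrix(math.pi / 2, 0.0)   # angles of (42.182a)
U_after = mixing_matrix(0.0, 0.0)            # limiting angles of (42.182b)
print(is_unitary(mixing_matrix(0.3, 1.1)), is_unitary(U_before), is_unitary(U_after))
```

At $\theta_R = \pi/2$, $\theta_L = 0$ the first row reduces to $(e_{rR} - i e_{rL})/\sqrt{2}$, a state of pure weak chirality $r$, while at $\theta_R = \theta_L = 0$ the second row reduces to $(-e_{rR} - i e_{lL})/\sqrt{2}$, a state whose weak and Dirac chiralities coincide.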


The electron mass matrix $M$ is by definition diagonal with respect to the mass eigenstates $e_\pm$, $\bar{e}_\pm$. With
respect to the Dirac chiral eigenstates $e_{\pm X}$ the Hermitian mass matrix $M$ is given by, from equation (42.179),


0 0 =m, 0 €4R 

= . . 0 0 0 —m_ eR 

e.: Me = —ielyyMe =i ( elk en ely el, ) me D 0 0 ene, (42.184) 
0 m- 0 0 eL 


Each of the 4 components $e_{mX}$ ($m = +, -$ and $X = R, L$) in equation (42.184) is itself a 2-component object
with spin up ($\uparrow$) and down ($\downarrow$), equations (42.174). The mass matrix $M$ is the same for both spins. With
respect to the weak eigenstates $e_{wX}$ the mass matrix $M$ is given by


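$$
\bar{e} \cdot M e = i \begin{pmatrix} e^\dagger_{rR} & e^\dagger_{lR} & e^\dagger_{rL} & e^\dagger_{lL} \end{pmatrix}
\begin{pmatrix}
0 & 0 & -m_T & -m_d \\
0 & 0 & -m_D & -m_t \\
m_T & m_D & 0 & 0 \\
m_d & m_t & 0 & 0
\end{pmatrix}
\begin{pmatrix} e_{rR} \\ e_{lR} \\ e_{rL} \\ e_{lL} \end{pmatrix} ,
\tag{42.185}
$$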


where, from equations (42.181), 


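$$
\begin{pmatrix} m_T & m_D \\ m_d & m_t \end{pmatrix} = R_L^\dagger \, m \, R_R =
\begin{pmatrix} \cos\theta_L & -\sin\theta_L \\ \sin\theta_L & \cos\theta_L \end{pmatrix}
\begin{pmatrix} m_+ & 0 \\ 0 & m_- \end{pmatrix}
\begin{pmatrix} \sin\theta_R & \cos\theta_R \\ -\cos\theta_R & \sin\theta_R \end{pmatrix} ,
\tag{42.186}
$$

giving

$$ m_T = m_+ \sin\theta_R \cos\theta_L + m_- \cos\theta_R \sin\theta_L , \quad m_D = m_+ \cos\theta_R \cos\theta_L - m_- \sin\theta_R \sin\theta_L , \tag{42.187a} $$
$$ m_d = m_+ \sin\theta_R \sin\theta_L - m_- \cos\theta_R \cos\theta_L , \quad m_t = m_+ \cos\theta_R \sin\theta_L + m_- \sin\theta_R \cos\theta_L . \tag{42.187b} $$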


The masses $m_T$ and $m_t$ couple the T-up and T-down components, while the Dirac masses $m_D$ and $m_d$ couple
$d$-up and $d$-down components. The masses satisfy

$$ m_+ m_- = m_T m_t - m_D m_d , \quad m_+^2 + m_-^2 = m_T^2 + m_t^2 + m_D^2 + m_d^2 . \tag{42.188} $$


Before electroweak symmetry breaking, $d$-charge is conserved, the Dirac masses $m_D$ and $m_d$ are zero, while
the T-flip masses $m_T$ and $m_t$ are the mass eigenvalues,

$$ m_T = m_+ , \quad m_t = m_- . \tag{42.189} $$


The mass eigenstates are given by equations (42.176). 
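After electroweak symmetry breaking, the mixing angles $\theta_R$ and $\theta_L$ are small, approximations (42.182b),
and to leading order the T masses and Dirac masses are

$$ m_T \approx m_+ \theta_R + m_- \theta_L , \quad m_D \approx m_+ - m_- \theta_R \theta_L , \tag{42.190a} $$
$$ m_d \approx m_+ \theta_R \theta_L - m_- , \quad m_t \approx m_+ \theta_L + m_- \theta_R . \tag{42.190b} $$

Given that $m_+ \gg m_-$, the mass terms (42.190) satisfy the hierarchy

$$ m_+ \approx m_D \gg m_T \sim m_t \gg m_d . \tag{42.191} $$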


It is possible that the see-saw condition $m_d = 0$ holds, in which case $m_- / m_+ = \tan\theta_R \tan\theta_L \approx \theta_R \theta_L$. The
vanishing of $m_d$ would mean that only the heavy-mass components of the electron are coupled by a Dirac
mass term; the light-mass components are uncoupled.

A priori, one might anticipate that the Dirac masses $m_d$ and $m_D$ would be close to the electroweak symmetry
breaking scale of ~ 100 GeV, while the T masses $m_t$ and $m_T$ would be close to the Spin(5) × Spin(6) symmetry
breaking scale of ~ $10^{12}$ GeV. That expectation is not realised here. Rather, the inequalities (42.191)
require that the Dirac masses straddle the T masses. The best that can be said about this failure to meet
expectations is that fermionic masses are one of the most mysterious ingredients of the standard model
(Quigg, 2007); for example, the lightness of the electron compared to the electroweak scale is unexplained. It
is hard to declare the target missed when the target is a blur.
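
The relations among the masses can be verified numerically from the closed forms (42.187). The following sketch, with hypothetical mass values in arbitrary units, checks the invariants (42.188), the limit (42.189) before electroweak symmetry breaking, and the see-saw condition $m_d = 0$.

```python
import math

def weak_basis_masses(m_plus, m_minus, theta_R, theta_L):
    """T masses and Dirac masses from equations (42.187)."""
    sR, cR = math.sin(theta_R), math.cos(theta_R)
    sL, cL = math.sin(theta_L), math.cos(theta_L)
    m_T = m_plus * sR * cL + m_minus * cR * sL
    m_D = m_plus * cR * cL - m_minus * sR * sL
    m_d = m_plus * sR * sL - m_minus * cR * cL
    m_t = m_plus * cR * sL + m_minus * sR * cL
    return m_T, m_D, m_d, m_t

m_plus, m_minus = 1000.0, 1.0            # hypothetical masses, arbitrary units

# Invariants (42.188) at arbitrary angles.
m_T, m_D, m_d, m_t = weak_basis_masses(m_plus, m_minus, 0.4, 0.9)
det_check = m_T * m_t - m_D * m_d
norm_check = m_T ** 2 + m_t ** 2 + m_D ** 2 + m_d ** 2

# Before electroweak symmetry breaking, angles of (42.182a).
before = weak_basis_masses(m_plus, m_minus, math.pi / 2, 0.0)

# See-saw condition: tan(theta_R) tan(theta_L) = m_minus / m_plus gives m_d = 0.
theta_R = 0.02
theta_L = math.atan((m_minus / m_plus) / math.tan(theta_R))
seesaw = weak_basis_masses(m_plus, m_minus, theta_R, theta_L)

print(det_check, norm_check, before, seesaw)
```

In the see-saw example the angles are small, and the output reproduces the hierarchy (42.191): $m_D$ is close to $m_+$, the T masses are intermediate, and $m_d$ vanishes.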


42.4.15 Neutrino masses 


As described in §42.3.1, neutrinos cannot acquire their mass in the same way as the other fundamental 
fermions, because only left-handed neutrinos (and right-handed antineutrinos) are observed in Nature. The 
leading standard solution to the puzzle of neutrino masses is the see-saw mechanism, §42.3.1, and that 
remains the most promising solution in the present construction. The see-saw mechanism posits that, alone 
among fermions, a right-handed neutrino, having no conserved SM charge, has a Majorana mass that couples 
it to its left-handed antineutrino partner. The see-saw mechanism holds that (after electroweak symmetry
breaking) the neutrino has a Dirac mass like other fermions, but the Majorana mass $m_M$ is much greater
than the Dirac mass $m_d$. The see-saw mechanism then predicts a heavy right-handed neutrino of mass $m_+$
and a light left-handed neutrino of mass $m_-$ satisfying the see-saw relation, equation (42.81),

$$ m_+ \approx m_M , \quad m_- \approx \frac{m_d^2}{m_M} . \tag{42.192} $$

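
The see-saw relation can be illustrated with the textbook 2 × 2 mass matrix that has a vanishing entry for the left-handed neutrino, off-diagonal Dirac mass $m_d$, and Majorana mass $m_M$ for the right-handed neutrino. That matrix form is an assumption here, since equation (42.81) is not reproduced in this section, and the numerical masses below are hypothetical.

```python
import math

def seesaw_masses(m_d, m_M):
    """Absolute eigenvalues of [[0, m_d], [m_d, m_M]] (assumed see-saw matrix).

    The eigenvalues are (m_M +/- sqrt(m_M^2 + 4 m_d^2)) / 2; the light one is
    written in a form that avoids cancellation when m_M is much larger than m_d.
    """
    root = math.sqrt(m_M ** 2 + 4.0 * m_d ** 2)
    m_heavy = 0.5 * (m_M + root)
    m_light = 2.0 * m_d ** 2 / (m_M + root)
    return m_heavy, m_light

m_d, m_M = 100.0, 1.0e12                  # hypothetical masses in GeV
m_heavy, m_light = seesaw_masses(m_d, m_M)
print(m_heavy, m_light, m_d ** 2 / m_M)
```

The product of the two masses equals $m_d^2$ exactly, so raising the Majorana mass lowers the light mass in proportion, which is the origin of the name.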
As with other fermions, section 42.4.14, adding the T-bit doubles the number of neutrino species, as 
illustrated in the chart (42.110). There are neutrinos whose weak chirality coincides with their Dirac chirality, 
and neutrinos whose weak chirality is opposite to their Dirac chirality. As with other fermions, the extra 
neutrino species do not comprise another generation; another generation would have weak and Dirac chirality 
equal, not opposite. 
After electroweak symmetry breaking, experiment establishes that the weak chirality of observed neutrinos 
coincides with their Dirac chirality. The diagram of neutrino mass couplings analogous to the electron 
diagram (42.175) after electroweak symmetry breaking is 


mp 
VL > VR Vir > Vri, 

[rm [mae . (42.193) 
* * * 
VIR ma’? Uri VIL mp’? rR 


The left diagram is for weakly-interacting neutrinos whose weak chirality coincides with their Dirac chirality
(or, for antineutrinos, whose weak chirality opposes their Dirac chirality); the right diagram is for
non-weakly-interacting neutrinos whose weak chirality opposes their Dirac chirality (or, for antineutrinos, whose weak
chirality aligns with their Dirac chirality). The bottom rows of the two diagrams (42.193) are antiparticles
of the top rows. The vertical arrows in the two diagrams (42.193), labelled $m_M$ and $m_m$, are Majorana
mass terms that connect the right-handed neutrino $\nu_r$ to its antineutrino partner. The left-handed neutrino
$\nu_l$ cannot have a Majorana mass because it has a conserved standard-model charge. The fact that only
left-handed neutrinos are observed precludes a T-mass term connecting $\nu_{lL}$ to $\nu_{lR}$. The horizontal arrows,
labelled $m_d$ and $m_D$, are Dirac mass terms.

It should be emphasized that the present construction does not predict that the see-saw mechanism applies 
to neutrinos; rather, as in the standard model, the see-saw mechanism must be invoked to reconcile theory 
with experiment. The neutrino mass coupling diagram (42.193) could be over-simplified, or wrong. But if 
the diagram is correct, then neutrinos separate into two distinct species, the left and right diagrams, each of 
which has a mass matrix whose eigenvalues separately satisfy equation (42.81). 

The left diagram (42.193) includes the observed left-handed, weakly interacting neutrino $\nu_{lL}$, whose weak
chirality coincides with its Dirac chirality. The standard see-saw mechanism posits that the Majorana mass
is much larger than the Dirac mass, $m_M \gg m_d$, making the observed left-handed neutrino light, and the
right-handed neutrino unobservably heavy, equations (42.192).

The right diagram (42.193) predicts a second set of neutrinos none of which are observed. The set includes
a weakly interacting neutrino $\nu_{lR}$ whose weak chirality opposes its Dirac chirality. To be consistent with the
experimental constraint (42.78), the mass of this neutrino must be of order the electroweak scale or larger, which
requires that the Dirac mass $m_D$ be sufficiently large. The condition that the Dirac mass $m_D$ be large is
reminiscent of the conditions (42.191) on non-neutrino fermions.

Where does the Majorana mass term come from, in the present construction? In Dirac theory, the time axis 
‘Yo is diagonal acting on massive spinors in their rest frame, while the time axis’ Newman-Penrose partner 
y3 transforms massive spinors to their anti-spinor partners. Similarly, in the present construction the time 
axis Yo = iy preserves the standard-model charges of massive spinors in their rest frames, while its partner 
yt transforms massive spinors to their anti-spinor partners of opposite standard-model charge. So one way 
to construct a mass term that links neutrinos to their anti-neutrino partners is to replace the Ug(1) mass 
term (Eyr by (E)yf, or the unit-multivector mass term yz by y$. However, these mass terms are not 
Lorentz invariant. Lorentz-invariant mass terms that couple neutrinos and anti-neutrinos may be obtained 
by multiplying the standard Ug(1) or unit-multivector mass terms by the Spin(11, 1) pseudoscalar J defined 
by equation (42.108), 


J(E)yp or JIyp . (42.194) 


The exposition in this section 42.4.15 so far holds after electroweak symmetry breaking. What about 
before electroweak symmetry breaking? Dirac mass terms, which are generated by the electroweak Higgs 
field, cease to operate before electroweak symmetry breaking. Majorana mass terms (42.194) generated 
by the Ug(1) Higgs field and unit multivector continue to operate before electroweak symmetry breaking. 
However, the experimental evidence constrains the number of neutrino types at energies below electroweak 
symmetry breaking, equation (42.78), so the arrangement of neutrino mass couplings could differ from the 
diagram (42.193) before electroweak symmetry breaking. Neutrino mass couplings could perhaps resemble 
those of other fermions before electroweak symmetry breaking. 